用VB编写异步多线程下载程序
大庆油田有限公司勘探开发研究院网络室 满孝
为了高效率地下载某站点的网页,我们可利用VB的Internet Transfer 控件编写自己的下载程序, Internet Transfer 控件支持超文本传输协议 (HTTP) 和文件传输协议 (FTP),使用 Internet Transfer 控件可以通过 OpenURL 或 Execute 方法连接到任何使用这两个协议的站点并检索文件。本程序使用多个Internet Transfer 控件,使其同时下载某站点。并可判断文件是否已下载过或下载过的文件是否比服务器上当前的文件陈旧,以决定是否重新下载。所有下载的文件中的链接都做了调整,以便于本地查阅。OpenURL 方法以同步方式传输数据。同步指的是传输操作未完成之前,不能执行其它过程。这样数据传输就必须在执行其它代码之前完成。而 Execute 方法以异步方式传输数据。在调用 Execute 方法时,传输操作与其它过程无关。这样,在调用 Execute 方法后,在后台接收数据的同时可执行其它代码。用 OpenURL 方法能够直接得到可保存到磁盘的数据流,或者直接在 TextBox 控件中阅览(如果数据是文本格式的)。而用 Execute 方法获取数据,则必须用 StateChanged 事件监视该控件的连接状态。当达到适当的状态时,调用 GetChunk 方法从控件的缓冲区获取数据。 首先,建立启始的http检索连接,Public g As VariantPublic k As VariantPublic spath As StringDim links() As Stringg = 0spath = 本地保存下载文件的路径links(0)=启始URLinet1.execute links(0), "GET" '使用GET方法。 事件监控子程序(每个Internet Transfer 控件设置相对应的事件监控子程序):用StateChanged 事件监视该控件的连接状态, 当该请求已经完成,并且所有数据均已接收到时,调用 GetChunk 方法从控件的缓冲区获取数据。Private Sub Inet1_StateChanged(ByVal State As Integer)'State = 12 时,使用 GetChunk 方法检索服务器的响应。Select Case State'...没有列举其它情况。 Case icResponseCompleted '12'获取links(g)中的协议、主机和路径名。addsuf = Left(links(g), InStrRev(links(g), "/"))'获取links(g)中的文件名。fname = Right(links(g), Len(links(g)) - InStrRev(links(g), "/"))'判断是否是超文本文件,是超文本文件则分析其中的链接,若不是则存为二进制文件。If InStr(1, fname, "htm", vbTextCompare) = True Then'初始化用于保存文件的FileSystemObject对象。Set fs = CreateObject("Scripting.FileSystemObject")Dim vtData As Variant '数据变量。Dim strData As String: strData = ""Dim bDone As Boolean: bDone = False '取得第一块。vtData = inet1.GetChunk(1024, icString)DoEventsDo While Not bDonestrData = strData & vtDataDoEvents'取得下一块。vtData = inet1.GetChunk(1024, icString)If Len(vtData) = 0 ThenbDone = TrueEnd IfLoop '获取文档中的链接并置于数组中。Dim i As VariantDim po1 As VariantDim po2 As VariantDim oril As StringDim newl As StringDim lmtime, ctimepo1 = InStr(1, strData, "href=", vbTextCompare) + 5po2 = 1Dim newstr As String: newstr = ""Dim whostr As String: whostr = ""i = 0Do While po1 > 0newstr = Mid(strData, po2, po1)whostr = whostr + newstrpo2 = InStr(po1, strData, ">", vbTextCompare)'将原链接改为新链接oril = Mid(strData, po1 + 1, po2 - po1 - 1)'如果有引号,去掉引号ln = Replace(oril, """", "", vbTextCompare)newl = Right(ln, Len(ln) - InStrRev(ln, "/"))whostr = whostr & newlIf ln <> "" Then'判定文件是否下载过。If fileexists(spath & newl) = False Thenlinks(i) = addsuf & lni = i + 1Elselmtime = inet1.getheader("Last-modified")Set f = fs.getfile(spath & newl)ctime = f.datecreated'判断文件是否更新If DateDiff("s", lmtime, ctime) < 0 Theni = i + 1End IfEnd IfEnd Ifpo1 = InStr(po2 + 1, strData, "href=", vbTextCompare) + 5Loopnewstr = Mid(strData, po2)whostr = whostr + newstr Set a = fs.createtextfile(spath & fname, True)a.Write whostra.Closek = iElseDim vtData As VariantDim b() As ByteDim bDone As Boolean: bDone = FalsevtData = Inet2.GetChunk(1024, icByteArray)Do While Not bDoneb() = b() & vtDatavtData = Inet2.GetChunk(1024, icByteArray)If Len(vtData) = 0 ThenbDone = TrueEnd IfLoopOpen spath & fname For Binary Access Write As #1Put #1, , b()Close #1End IfCall devjob '调用线程调度子程序End Select End Sub Private Sub Inet2_StateChanged(ByVal State As Integer)...end sub ... 线程调度子程序,g和是k公用变量,k为最后一个链接的数组索引加一,g初值为零,每次加一,直到处理完最后一个链接。Private Sub devjob() If Not g + 1 < k Then GoTo reportlineIf Inet1.StillExecuting = False Theng = g + 1Inet1.Execute links(g), "GET"End IfIf Not g + 1 < k Then GoTo reportlineIf Inet2.StillExecuting = False
[1] [2] 下一页