
    Parsing HTML without Using the Browser Control



    How to Use MSHTML as an HTML Parser in Visual Basic Without Using the Browser Control




    Environment: VB6 SP5, XPPro, IE6

    The main goal of this article is to show how to use the HTML parser built into Microsoft Internet Explorer (MSHTML) from within your own program.

    This is usually easy if you use the browser control, and there are plenty of examples on the Internet. When it comes to using the parser without a user interface, however, I could not find anything written in Visual Basic. All the examples I have seen are in Visual C++ and use interfaces that are not available in Visual Basic.

    After days of searching, including an attempt to use the .NET platform so that I could run an HTML parser inside a Windows NT service, I finally found a way. I don't claim this is the nicest way to do it, but it works like a charm, and it gives you access to the DOM of the HTML document, which is very useful if you need to parse an HTML document.

    Your project must have a reference to the Microsoft HTML Object Library, and Internet Explorer 5 or later must be installed. Simply copy this code into any function.

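    ' document.links holds <a> and <area> elements (not <link>),
    ' so a late-bound Object reference is used here
    Dim objLink As Object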
    Dim objMSHTML As New MSHTML.HTMLDocument
    Dim objDocument As MSHTML.HTMLDocument
    
    
    ' This function is only available with Internet Explorer 5 or later
    
    Set objDocument = objMSHTML.createDocumentFromUrl(txtURL.Text, _
                                                      vbNullString)
        
    ' Tricky: createDocumentFromUrl loads the document
    ' asynchronously, so wait here until loading has completed.
    ' Note that this string might be different if Internet
    ' Explorer on the machine where the code is executed uses
    ' a language other than English.
    
    While objDocument.readyState <> "complete"
        DoEvents
    Wend
    
    ' Source Code
    
    Debug.Print objDocument.documentElement.outerHTML
    
    ' Title
    
    Debug.Print "Title : " & objDocument.Title
    
    ' Link Collection
    
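    For Each objLink In objDocument.links
        ' Use the href property explicitly instead of relying on
        ' the element's default property
        lstLinks.AddItem objLink.href
        Debug.Print "Link:  " & objLink.href
    Next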
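
    The wait loop above never ends if the document fails to reach the "complete" state, which matters most in a UI-less setting such as a service. The following is only a sketch of one way to guard against that, not part of the original demo project: LoadHtmlDocument, DemoListImages, and the timeout parameter are hypothetical names introduced for illustration, and the 30-second default is an arbitrary assumption. It uses the same createDocumentFromUrl call and readyState check as the listing above, plus the VB Timer function.

```vb
' Hypothetical helper (not from the original demo project).
' Requires a reference to the Microsoft HTML Object Library.
' Returns the loaded document, or Nothing if the timeout expires.
Public Function LoadHtmlDocument(ByVal strUrl As String, _
                                 Optional ByVal sngTimeoutSec As Single = 30) _
                                 As MSHTML.HTMLDocument
    Dim objMSHTML As MSHTML.HTMLDocument
    Dim objDocument As MSHTML.HTMLDocument
    Dim sngStart As Single
    Dim sngElapsed As Single

    Set objMSHTML = New MSHTML.HTMLDocument
    Set objDocument = objMSHTML.createDocumentFromUrl(strUrl, vbNullString)

    sngStart = Timer                ' seconds elapsed since midnight
    Do While objDocument.readyState <> "complete"
        DoEvents
        sngElapsed = Timer - sngStart
        ' Timer restarts at 0 at midnight, so correct a negative value
        If sngElapsed < 0 Then sngElapsed = sngElapsed + 86400
        If sngElapsed > sngTimeoutSec Then
            ' Give up; the function result stays Nothing
            Exit Function
        End If
    Loop

    Set LoadHtmlDocument = objDocument
End Function

' Example use: list the image sources of the page in txtURL
Public Sub DemoListImages()
    Dim objDoc As MSHTML.HTMLDocument
    Dim objImage As Object

    Set objDoc = LoadHtmlDocument(txtURL.Text, 15)
    If objDoc Is Nothing Then
        Debug.Print "Document did not finish loading in time"
    Else
        For Each objImage In objDoc.images
            Debug.Print "Image: " & objImage.src
        Next
    End If
End Sub
```

    Returning Nothing on timeout leaves the decision to the caller, who can retry, log the failure, or skip the URL. The same pattern works for any other DOM collection, such as the links collection used earlier.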
    

    Downloads

    Download demo project - 3 Kb
