Originally posted by: Neil Stansbury
All well and good, but it's a shame MSHTML doesn't produce standards-compliant code. Run document.documentElement.outerHTML past the W3C Validator: it produces so many errors it's not funny.
Originally posted by: jlw
..Been looking for a good example for a while... THANKS!
Originally posted by: Steve
I parse an HTML page of online users and no longer need to look at that page; your app tells me who's online. Great, thanks!
Originally posted by: kevin
I am trying to gain understanding and experience with bots and such. I think this parser is exactly one of the things I need, but I am having some difficulty making it work in .NET 2003.
Here is my issue:
I have included the necessary reference to the mshtml library, but when I try to run the program I get this error on this line:
objDocument = objMSHTML.createDocumentFromUrl("http://www.yahoo.com", vbNullString)
The error is:
An unhandled exception of type 'System.NullReferenceException' occurred in mscorlib.dll
Additional information: Object reference not set to an instance of an object.
So, am I doing something totally wrong?
I am running XP Pro and .NET 2003. Any suggestions would be greatly appreciated. Thanks.
Originally posted by: Mowgli
I've been trying to use the MSHTML object to parse HTML based on a string (from a database) rather than a fetched URL. None of the properties of the HTMLDocument class seem to let me set a custom source. Does anyone know if it's possible?
I'm also looking for a solution to this problem. Could you let me know once you find it? Thanks, codeguru.com (at) martijn.coppoolse.com
Originally posted by: Web
This is the best and most succinct example of a web parser.
With this I will rule the world.
Originally posted by: V.Thandava Krishna
Hello,
I get the error message below when I run the above code. I am attaching the error and the code here. Please help me solve this problem.
I get this error only for the google.com site. To the best of my knowledge, Google sets the focus to the text field in the body onload event.
How can I disable all events in the document? I am using this component in ASP. My requirement is to get the content of the web page, extract information from it, and store it in our database.
I can also get the content using the WinINET control, but I want the HTML code after the client-side script has executed.
If there is any alternative solution, please help me.
Waiting for your valuable reply.
V.Thandava Krishna.
======================================================
Option Explicit
Public Function GetHTMLCode(URL As String, strSearch As String) As String
    On Error Resume Next
    Dim doc1 As New HTMLDocument
    Dim doc2 As HTMLDocument
    ' Start loading the page; the call returns before loading has finished.
    Set doc2 = doc1.createDocumentFromUrl("http://www.google.com/", "null")
    ' Wait until the document is fully loaded.
    Do Until doc2.readyState = "complete"
        DoEvents
    Loop
    GetHTMLCode = doc2.documentElement.outerHTML
End Function
============================================================
============================================================
A runtime error has occurred.
Do you want to debug?
Line: 8
Error: Can't move the focus to the control because it is invisible,
not enabled, or of a type that does not accept the focus.
============================================================
Originally posted by: Manuel
This code was a great help. Now I have another problem: I'm trying to automate form submission and get the result in the HTMLDocument but it just doesn't work.
The code:
Dim page As MSHTML.HTMLDocument
Dim baseObject As New MSHTML.HTMLDocument
Set page = baseObject.createDocumentFromUrl("xpto.htm", vbNullString)
While page.readyState <> "complete"
    DoEvents
Wend
MsgBox "Submitting to myForm.action=" & page.All.myForm.Action
page.All.myForm.submit
While page.readyState <> "complete"
    DoEvents
Wend
After this, my "page" object hasn't changed. I even tried using page.parentWindow.document (thus getting the updated document for the container window - I thought) but with no success.
Thanks in advance,
Manuel
Originally posted by: Rangan Basu
Great code, man! Just what I was looking for.
ThanX Nicolas
Rangan
Originally posted by: Arjun Subbarao
Hi,
That code helped me a LOT. Thanks, man; it was exactly what I was looking for.
If anyone reading this is getting an error while compiling saying that the object reference is not found, please read what Manuel has written. Do that and your troubles are over.