Environment: Visual C++ 6.0
Here’s a function that gives you access to the source html of a URL.
As written, the function stores the results in a .txt file, but you could easily
modify the function to fit your needs. From there you can parse the data, create a
page on the fly, and use the Navigate2 method to display the results in a
browser. This function could be useful for developing HTML views that block ads
or give the user the option of text-only page views. With a little imagination you could probably
come up with many other uses for this code.
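As a rough illustration of the text-only idea, here is a minimal sketch in standard C++ that drops everything between '<' and '>' from a string of HTML. StripTags is a hypothetical helper, not part of the original function, and the scan is deliberately naive: it does not understand comments, scripts, entities, or a '>' inside an attribute value.

```cpp
#include <string>

// Hypothetical helper (an assumption, not part of GetSourceHtml):
// returns the input with every <...> run removed.
std::string StripTags(const std::string& html)
{
    std::string text;
    bool insideTag = false;

    for (std::string::size_type i = 0; i < html.size(); ++i)
    {
        const char c = html[i];

        if (c == '<')
            insideTag = true;          // start skipping at the opening bracket
        else if (c == '>' && insideTag)
            insideTag = false;         // resume copying after the closing bracket
        else if (!insideTag)
            text += c;                 // ordinary character outside any tag
    }

    return text;
}
```

For example, StripTags("<p>Hello <b>world</b></p>") returns "Hello world". In an MFC program you would presumably run something like this over each line you read before writing it out or building a page from it.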
The GetSourceHtml function makes use of the CInternetSession class, so be sure to place #include "afxinet.h"
below #include "stdafx.h" in the source file that contains the GetSourceHtml function.
To use GetSourceHtml, pass it a URL as a CString, for example: GetSourceHtml( _T("http://www.codeguru.com") );
You can then use Notepad to view the results, which are written to C:\rawHtml.txt.
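BOOL GetSourceHtml(CString theUrl)
{
    // this first block does the actual work
    CInternetSession session;
    CInternetFile* file = NULL;
    try {
        // try to connect to the URL
        file = (CInternetFile*) session.OpenURL(theUrl);
    } catch (CInternetException* m_pException) {
        // set file to NULL if there's an error, and release the exception object
        file = NULL;
        m_pException->Delete();
    }
    // most of the following deals with storing the html to a file
    CStdioFile dataStore;
    BOOL bIsOk = dataStore.Open(_T("C:\\rawHtml.txt"),
        CFile::modeCreate | CFile::modeWrite | CFile::typeText);
    CString somecode;
    // continue fetching code until there is no more
    while (bIsOk && file != NULL && file->ReadString(somecode) != NULL)
        dataStore.WriteString(somecode + _T("\n")); // ReadString drops the line break
    if (bIsOk && file == NULL)
        dataStore.WriteString(_T("Could not establish a connection with the server..."));
    if (file != NULL) { file->Close(); delete file; }
    return bIsOk; // FALSE means the output file could not be opened
}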
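One detail of the output path is worth a note. In a C++ string literal a single backslash starts an escape sequence, so "\r" is a carriage return and not a backslash followed by the letter r. A Windows path therefore needs the backslash doubled. The small sketch below, in standard C++, shows the difference; the two function names are made up for the illustration.

```cpp
#include <string>

// Illustration only: the same path with and without the doubled backslash.
inline std::string UnescapedPath() { return "C:\rawHtml.txt"; }   // "\r" becomes one carriage-return character
inline std::string EscapedPath()   { return "C:\\rawHtml.txt"; }  // "\\" becomes one backslash character
```

The first string is 13 characters long and has a carriage return as its third character, so it does not name the intended file. The second is 14 characters long and spells the path as expected.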