Function to Return HTML Source of a URL

Environment: Visual C++ 6.0

Here's a function that gives you access to the source HTML of a URL. As written, the function stores the results in a .txt file, but you could easily modify it to fit your needs. From there you can parse the data, create a page on the fly, and use the Navigate2 method to display the results in a browser. This function could be useful for developing HTML views that block ads or give the user the option of text-only page views. With a little imagination you could probably come up with many other uses for this code.
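As a rough illustration of the text-only idea, here is a minimal sketch that removes markup from a block of HTML. It is not part of the original function: the name StripTags is hypothetical, and it uses std::string rather than CString so that it compiles without MFC. It assumes simple markup only; script and style bodies, comments, and entities such as &amp; are not handled.

```cpp
#include <string>

// Hypothetical helper (not from the original article): returns the text of an
// HTML fragment with everything between '<' and '>' removed.
// Simplification: script/style bodies, comments, and entities are left alone.
std::string StripTags(const std::string& html)
{
    std::string text;
    bool insideTag = false;

    for (std::string::size_type i = 0; i < html.size(); ++i)
    {
        const char c = html[i];
        if (c == '<')
            insideTag = true;        // start skipping markup
        else if (c == '>' && insideTag)
            insideTag = false;       // markup ended; resume copying
        else if (!insideTag)
            text += c;               // ordinary character outside any tag
    }
    return text;
}
```

You could run the contents of the saved file through a helper like this before displaying them. A real text-only view would need a proper HTML parser, so treat this only as a starting point.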

The GetSourceHtml function makes use of the CInternetSession class, so be sure to place #include "afxinet.h" below #include "stdafx.h" in the source file that contains the GetSourceHtml function.

To use GetSourceHtml, pass it a URL as a CString, for example: GetSourceHtml( _T("http://www.codeguru.com") );. You can then use Notepad to view the results, which are saved as rawHtml.txt in the C:\ directory.

BOOL GetSourceHtml(CString theUrl) 
{
 // this first block does the actual work
 CInternetSession session;
 CInternetFile* file = NULL;
 try
 {
  // try to connect to the URL
  file = (CInternetFile*) session.OpenURL(theUrl); 
 }
 catch (CInternetException* pEx)
 {
  // set file to NULL if there's an error
  file = NULL; 
  pEx->Delete();
 }

 // most of the following deals with storing the html to a file
 CStdioFile dataStore;

 if (file)
 {
  CString somecode;

  BOOL bIsOk = dataStore.Open(_T("C:\\rawHtml.txt"),
                              CFile::modeCreate 
                              | CFile::modeWrite 
                              | CFile::shareDenyWrite 
                              | CFile::typeText);

  if (!bIsOk)
  {
   // release the connection before bailing out so it is not leaked
   file->Close();
   delete file;
   return FALSE;
  }

  // continue fetching code until there is no more
  while (file->ReadString(somecode))
  {
   dataStore.WriteString(somecode);
   // ReadString(CString&) drops the trailing newline, so put it back
   dataStore.WriteString(_T("\n"));
  }
  
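  file->Close();
  delete file;
  dataStore.Close();
 }
 else
 {
  // the connection failed; record that in the output file if it can be opened
  if (dataStore.Open(_T("C:\\rawHtml.txt"),
                     CFile::modeCreate | CFile::modeWrite | CFile::typeText))
  {
   dataStore.WriteString(_T("Could not establish a connection with the server..."));
  }
  return FALSE;
 }

 // the page was fetched and written successfully
 return TRUE;
}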