Creating an Internet Explorer Helper Class

Environment: Internet

Introduction

The purpose of this article is to show how to use IWebBrowser2, IHTMLDocument2, and IHTMLElement objects.

Creating a New Web Browser Object

Let's start with a very simple example: How to create a new Internet Explorer window. The code below shows how to do this:

HRESULT hr;
IWebBrowser2* pWebBrowser = NULL;
hr = CoCreateInstance (CLSID_InternetExplorer, NULL, CLSCTX_SERVER,
                       IID_IWebBrowser2, (LPVOID*)&pWebBrowser);

if (SUCCEEDED (hr) && (pWebBrowser != NULL))
{
  pWebBrowser->put_Visible (VARIANT_TRUE);
  // OK, we created a new IE Window and made it visible
  // You can use the pWebBrowser object to do whatever you want
  // to do!
}
else
{
  // Failed to create a new IE Window. Check out the pWebBrowser
  // object and if it is not NULL (should never happen), then
  // release it!
  if (pWebBrowser) pWebBrowser->Release ();
}

Connecting to a Running Instance of IE

Creating a new IE window is fairly easy. But what if you want to use an existing Internet Explorer window instead of creating a new one? In this case, the task is more complicated. The function below finds an Internet Explorer window by its title. sTitleToSearch is the title of the Web page to search for; wildcard characters are allowed. To find any IE window, just pass "*" as the title.

bool CMyInternetExplorer::FindUsingTitle (const CString &
                                          sTitleToSearch)
{
  if (m_pWebBrowser != NULL)
  {
      m_pWebBrowser->Release ();
      m_pWebBrowser = NULL;
  }

  HRESULT hr;
  SHDocVw::IShellWindowsPtr spSHWinds;
  hr = spSHWinds.CreateInstance
                 (__uuidof(SHDocVw::ShellWindows));

  if (FAILED (hr))
      return false;

  ASSERT (spSHWinds != NULL);

  long nCount = spSHWinds->GetCount ();

  IDispatchPtr spDisp;

  for (long i = 0; i < nCount; i++)
  {
    _variant_t va (i, VT_I4);
    spDisp = spSHWinds->Item (va);

    IWebBrowser2 * pWebBrowser = NULL;
    hr = spDisp.QueryInterface (IID_IWebBrowser2, &
                                pWebBrowser);

    if (pWebBrowser != NULL)
    {
      IDispatch* pHtmlDocDispatch = NULL;
      IHTMLDocument2 * pHtmlDoc = NULL;

      // Retrieve the document object.
      hr = pWebBrowser->get_Document
                       (&pHtmlDocDispatch);

      if (SUCCEEDED (hr) && (pHtmlDocDispatch != NULL))
      {
        // Query for IHTMLDocument2.
        hr = pHtmlDocDispatch->QueryInterface
                               (IID_IHTMLDocument2,
                                (void**)&pHtmlDoc);
        if (SUCCEEDED (hr) && (pHtmlDoc != NULL))
        {
            CString sTitle;

            HWND hWnd = NULL;
            pWebBrowser->get_HWND ((long*)(&hWnd));
            if (::IsWindow (hWnd))
            {
              int nLen = ::GetWindowTextLength (hWnd);
              ::GetWindowText (hWnd,
                               sTitle.GetBufferSetLength
                               (nLen), nLen + 1);
              sTitle.ReleaseBuffer ();
             }

          // If the window title could not be retrieved
          // (should never happen, though),
          // fall back to the title of the document
          if (sTitle.IsEmpty ())
            {
              BSTR bstrTitle;
              hr = pHtmlDoc->get_title (&bstrTitle);
              if (!FAILED (hr))
                {
                  sTitle = bstrTitle;
                  SysFreeString (bstrTitle); 
                 }
             }

              if (StringHelper::WildcardCompareNoCase
                  (sTitleToSearch, sTitle))
              {
                m_pWebBrowser = pWebBrowser;
                pHtmlDoc->Release ();
                pHtmlDocDispatch->Release ();
                // Exit the method safely!
                return true;
              }
            pHtmlDoc->Release();
          }
        pHtmlDocDispatch->Release ();
      }
    pWebBrowser->Release ();
  }
}

return false;
}

This approach is described in more detail in MSDN.

This approach uses the SHDocVw.ShellWindows collection to enumerate all the instances of Shell Windows. The ShellWindows object represents a collection of the open windows that belong to the shell. In fact, this collection contains references to Internet Explorer as well as other windows belonging to the shell, such as the Windows Explorer. To differentiate between Internet Explorer and other shell windows, we just try to get the HTML document of the shell window. If we get the document successfully, this instance of ShellWindow is in fact an Internet Explorer window.
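The FindUsingTitle listing calls StringHelper::WildcardCompareNoCase, whose implementation is not shown in this article. The following is only a minimal sketch of what such a matcher could look like in portable C++. The function name WildcardMatchNoCase, the pattern-first argument order, and the choice of '*' and '?' as the wildcard characters are assumptions made for illustration; the helper in the attached source may behave differently.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical stand-in for StringHelper::WildcardCompareNoCase.
// Assumed rules: '*' matches any run of characters (including none),
// '?' matches exactly one character, everything else is compared
// without regard to case.
bool WildcardMatchNoCase(const std::string& pattern, const std::string& text)
{
    auto lower = [](char c) {
        return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    };

    std::size_t p = 0, t = 0;
    std::size_t starPos = std::string::npos;  // index of the last '*' seen
    std::size_t starText = 0;                 // text index when it was seen

    while (t < text.size())
    {
        if (p < pattern.size() && pattern[p] == '*')
        {
            starPos = p++;      // remember the star; first try matching nothing
            starText = t;
        }
        else if (p < pattern.size() &&
                 (pattern[p] == '?' || lower(pattern[p]) == lower(text[t])))
        {
            ++p;                // single-character match
            ++t;
        }
        else if (starPos != std::string::npos)
        {
            p = starPos + 1;    // backtrack: the star absorbs one more character
            t = ++starText;
        }
        else
        {
            return false;
        }
    }

    // Any stars left in the pattern can match the empty remainder.
    while (p < pattern.size() && pattern[p] == '*')
        ++p;

    return p == pattern.size();
}
```

With these rules, the pattern "*" accepts every title, which is the behavior FindUsingTitle relies on when it is asked to find any IE window, and a pattern such as "*google*" accepts any title that contains "Google" in any letter case.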

Navigate to a Web Page

Now, after we get the Web browser object and store it in the variable m_pWebBrowser, it is very easy to navigate to a Web page.

void CMyInternetExplorer::Navigate(LPCTSTR lpszURL,
                                   DWORD dwFlags /* = 0 */,
    LPCTSTR lpszTargetFrameName /* = NULL */ ,
    LPCTSTR lpszHeaders /* = NULL */, LPVOID lpvPostData /*
                           = NULL */,
    DWORD dwPostDataLen /* = 0 */)
{
    CString strURL (lpszURL);
    BSTR bstrURL = strURL.AllocSysString ();
    
    COleSafeArray vPostData;
    if (lpvPostData != NULL)
    {
        if (dwPostDataLen == 0)
            dwPostDataLen = lstrlen ((LPCTSTR) lpvPostData);

        vPostData.CreateOneDim (VT_UI1, dwPostDataLen, lpvPostData);
    }
    
    m_pWebBrowser->Navigate (bstrURL, COleVariant
                             ((long) dwFlags, VT_I4),
                              COleVariant (lpszTargetFrameName,
                              VT_BSTR), vPostData,
                              COleVariant
                              (lpszHeaders, VT_BSTR));

    SysFreeString (bstrURL);
}

Wait Until the Web Page Is Loaded

After starting to load a Web page with the function above, we can wait until the page is completely loaded by polling the ReadyState property of the IWebBrowser2 object.

bool CMyInternetExplorer::WaitTillLoaded (int nTimeout)
{
    READYSTATE result;
    DWORD nFirstTick = GetTickCount ();

    do
    {
        m_pWebBrowser->get_ReadyState (&result);

        if (result != READYSTATE_COMPLETE)
            Sleep (250);

        if (nTimeout > 0)
        {
            if ((GetTickCount () - nFirstTick) > nTimeout)
                break;
        }
    } while (result != READYSTATE_COMPLETE);

    if (result == READYSTATE_COMPLETE)
        return true;
    else
        return false;
}

This function waits until the Web page is completely loaded or the timeout expires. The nTimeout parameter is given in milliseconds (it is compared against GetTickCount values); to wait indefinitely, set it to 0.
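The polling pattern used by WaitTillLoaded can be tried out without a browser. The sketch below restates it in portable C++ under a few assumptions: WaitUntil and its isReady callback are hypothetical names standing in for the get_ReadyState check, and std::chrono::steady_clock replaces GetTickCount. It is meant to illustrate the timeout logic, not to replace the MFC code.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Polls isReady until it returns true or the timeout elapses.
// A timeout of 0 ms means "wait indefinitely", as in WaitTillLoaded.
// isReady is a hypothetical stand-in for the READYSTATE_COMPLETE check.
bool WaitUntil(const std::function<bool()>& isReady,
               std::chrono::milliseconds timeout,
               std::chrono::milliseconds pollInterval =
                   std::chrono::milliseconds(250))
{
    assert(pollInterval.count() >= 0);

    using Clock = std::chrono::steady_clock;
    const Clock::time_point start = Clock::now();

    for (;;)
    {
        if (isReady())
            return true;                        // condition reached

        if (timeout.count() > 0 && Clock::now() - start >= timeout)
            return false;                       // gave up: timeout expired

        std::this_thread::sleep_for(pollInterval);
    }
}
```

A call such as WaitUntil(pageLoaded, std::chrono::milliseconds(10000)) would give up after about ten seconds, where pageLoaded is whatever callable reports completion. Like the original, this loop blocks the calling thread while it waits.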

Find an Anchor on a Web Page

The following function searches for the specified anchor in a Web page. The anchor can be specified by any combination of its name, outer text, tooltip, and URL. The anchor element has the following syntax:

<a href = "anchor_URL" name ="anchor_name"
           title = "anchor_tooltip">Outer Text</a>

If the anchor is found and the bClick parameter is true, the anchor is also clicked. Likewise, if bFocus is true, the anchor receives the focus.

bool CMyInternetExplorer::FindAnchor (bool bClick, bool bFocus,
                                      bool bName, bool bOuterText,
                                      bool bTooltip, bool bURL,
                                      LPCTSTR sName,
                                      LPCTSTR sOuterText,
                                      LPCTSTR sTooltip,
                                      LPCTSTR sURL)
{
    ASSERT (m_pWebBrowser != NULL);
    if (m_pWebBrowser == NULL)
        return false;

    HRESULT hr;
    IDispatch* pHtmlDocDispatch = NULL;
    IHTMLDocument2 * pHtmlDoc = NULL;
    bool bSearch = true;

    // Retrieve the document object.
    hr = m_pWebBrowser->get_Document (&pHtmlDocDispatch);
    if (SUCCEEDED (hr) && (pHtmlDocDispatch != NULL))
    {
        hr = pHtmlDocDispatch->QueryInterface (IID_IHTMLDocument2,
                                               (void**)&pHtmlDoc);
        if (SUCCEEDED (hr) && (pHtmlDoc != NULL))
        {
            IHTMLElementCollection* pColl = NULL;
            hr = pHtmlDoc->get_all (&pColl);

            if (SUCCEEDED (hr) && (pColl != NULL))
            {
                // Obtained the collection of all elements...
                long nLength = 0;
                pColl->get_length (&nLength);

                for (int i = 0; i < nLength && bSearch; i++)
                {
                    COleVariant vIdx ((long)i, VT_I4);

                    IDispatch* pElemDispatch = NULL;
                    IHTMLElement * pElem = NULL;

                    hr = pColl->item (vIdx, vIdx, &pElemDispatch);

                    if (SUCCEEDED (hr) && (pElemDispatch != NULL))
                    {
                        hr = pElemDispatch->QueryInterface
                             (IID_IHTMLElement, (void**)&pElem);

                        if (SUCCEEDED (hr) && (pElem != NULL))
                        {
                            BSTR bstrTagName;
                            CString sTempTagName;
                            if (!FAILED (pElem->get_tagName
                                (&bstrTagName)))
                            {
                                sTempTagName = bstrTagName;
                                SysFreeString (bstrTagName);
                            }
                            
                            if (sTempTagName == _T ("a") ||
                                sTempTagName == _T ("A"))
                            {
                                IHTMLAnchorElement * pAnchor = NULL;
                                hr = pElemDispatch->
                                     QueryInterface(
                                       IID_IHTMLAnchorElement,
                                       (void**)&pAnchor);

                                if (SUCCEEDED (hr) &&
                                   (pAnchor != NULL))
                                {
                                    BSTR bstrName, bstrOuterText,
                                                   bstrURL,
                                                   bstrTooltip;
                                    CString sTempName, sTempOuter,
                                            sTempURL, sTempTooltip;

                                    if (!FAILED
                                        (pElem->get_outerText
                                         (&bstrOuterText)))
                                    {
                                        sTempOuter = bstrOuterText;
                                        SysFreeString
                                          (bstrOuterText);
                                    }
                                    if (!FAILED (pElem->get_title
                                                 (&bstrTooltip)))
                                    {
                                        sTempTooltip = bstrTooltip;
                                        SysFreeString (bstrTooltip);
                                    }
                                    if (!FAILED (pAnchor->get_name
                                                 (&bstrName)))
                                    {
                                        sTempName = bstrName;
                                        SysFreeString (bstrName);
                                    }
                                    if (!FAILED (pAnchor->get_href
                                                 (&bstrURL)))
                                    {
                                        sTempURL = bstrURL;
                                        SysFreeString (bstrURL);
                                    }

                                    // Do the comparison here!
                                    bool bMatches = true;
                                    if (bMatches && bName)
                                    {
                                        if (!StringHelper::
                                             WildcardCompareNoCase
                                             (sName, sTempName))
                                             bMatches = false;
                                    }
                                    if (bMatches && bOuterText)
                                    {
                                        if (!StringHelper::
                                            WildcardCompareNoCase
                                            (sOuterText,
                                             sTempOuter))
                                            bMatches = false;
                                    }
                                    if (bMatches && bURL)
                                    {
                                        if (!StringHelper::
                                            WildcardCompareNoCase
                                            (sURL, sTempURL))
                                           bMatches = false;
                                    }
                                    if (bMatches && bTooltip)
                                    {
                                        if (!StringHelper::
                                            WildcardCompareNoCase
                                            (sTooltip,
                                             sTempTooltip))
                                           bMatches = false;
                                    }

                                    if (bMatches)
                                    {
                                        // No need to search more!
                                        bSearch = false;

                                        if (bFocus)
                                            pAnchor->focus ();
                                        if (bClick)
                                            pElem->click ();
                                    }
                                    pAnchor->Release ();
                                }
                            }
                            pElem->Release ();
                        }
                        pElemDispatch->Release ();
                    }
                }
                pColl->Release ();
            }
            pHtmlDoc->Release();
        }
        pHtmlDocDispatch->Release ();
    }

    if (bSearch == false)
        return true;

    return false;
}

The idea here is very simple. I first enumerate all IHTMLElement objects using the get_all function of IHTMLDocument2. Then, I check each element's tag name to see whether it is an anchor: if the name is "a" or "A", it is an anchor object, and I request its IHTMLAnchorElement interface by using the QueryInterface function. The reason that I check the tag name first, instead of just calling QueryInterface on every element, is performance related. I guess it is much faster to compare a tag name than to try to get IHTMLAnchorElement through QueryInterface.
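The matching rule inside FindAnchor is that an anchor is accepted only when every enabled criterion matches. The sketch below isolates that rule in portable C++. AnchorInfo, Criterion, AnchorCriteria, EqualsNoCase, and Matches are hypothetical names introduced for illustration, and plain case-insensitive equality is used in place of the wildcard comparison to keep the example short.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical plain-data copy of the four attributes FindAnchor reads.
struct AnchorInfo
{
    std::string name, outerText, tooltip, url;
};

// One search criterion. When 'enabled' is false the pattern is ignored,
// which plays the role of the bName, bOuterText, bTooltip and bURL flags.
struct Criterion
{
    bool enabled;
    std::string pattern;
    Criterion() : enabled(false) {}
    explicit Criterion(const std::string& p) : enabled(true), pattern(p) {}
};

struct AnchorCriteria
{
    Criterion name, outerText, tooltip, url;
};

// Case-insensitive equality (the article uses wildcard matching instead).
bool EqualsNoCase(const std::string& a, const std::string& b)
{
    return a.size() == b.size() &&
           std::equal(a.begin(), a.end(), b.begin(),
                      [](unsigned char x, unsigned char y)
                      { return std::tolower(x) == std::tolower(y); });
}

// An anchor matches when every enabled criterion matches (logical AND).
bool Matches(const AnchorCriteria& c, const AnchorInfo& a)
{
    auto ok = [](const Criterion& want, const std::string& have)
    { return !want.enabled || EqualsNoCase(want.pattern, have); };

    return ok(c.name, a.name) && ok(c.outerText, a.outerText) &&
           ok(c.tooltip, a.tooltip) && ok(c.url, a.url);
}
```

Grouping each flag with its pattern is one possible way to shorten the ten-parameter signature; it is a design suggestion, not how the attached source is written. Note that criteria with no enabled entries accept every anchor, just as FindAnchor returns the first anchor on the page when all four flags are false.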

Fill a Form on a Web Page

The idea here is similar to the one above. Instead of finding the anchor element, I try to find all the input elements and then do the same operations. The syntax for an input element is as follows:

<input type="input_type" value="input_value" name="input_name">

bool CMyInternetExplorer::FindInput  (bool bClick, bool bSelect,
                                      bool bChangeValue,
                                      bool bSetCheck, bool bType,
                                      bool bName, bool bValue,
                                      LPCTSTR sTypeToLook,
                                      LPCTSTR sNameToLook,
                                      LPCTSTR sValueToLook,
                                      bool bNewCheckValue,
                                      LPCTSTR sNewValue)
{
    ASSERT (m_pWebBrowser != NULL);
    if (m_pWebBrowser == NULL)
        return false;

    HRESULT hr;
    IDispatch* pHtmlDocDispatch = NULL;
    IHTMLDocument2 * pHtmlDoc = NULL;
    bool bSearch = true;

    // Retrieve the document object.
    hr = m_pWebBrowser->get_Document (&pHtmlDocDispatch);
    if (SUCCEEDED (hr) && (pHtmlDocDispatch != NULL))
    {
        hr = pHtmlDocDispatch->QueryInterface (IID_IHTMLDocument2,
                                               (void**)&pHtmlDoc);
        if (SUCCEEDED (hr) && (pHtmlDoc != NULL))
        {
            IHTMLElementCollection* pColl = NULL;
            hr = pHtmlDoc->get_all (&pColl);

            if (SUCCEEDED (hr) && (pColl != NULL))
            {
                // Obtained the collection of all elements...
                long nLength = 0;
                pColl->get_length (&nLength);
                
                for (int i = 0; i < nLength && bSearch; i++)
                {
                    COleVariant vIdx ((long)i, VT_I4);

                    IDispatch* pElemDispatch = NULL;
                    IHTMLElement * pElem = NULL;

                    hr = pColl->item (vIdx, vIdx, &pElemDispatch);

                    if (SUCCEEDED (hr) && (pElemDispatch != NULL))
                    {
                        hr = pElemDispatch->QueryInterface
                             (IID_IHTMLElement, (void**)&pElem);

                        if (SUCCEEDED (hr) && (pElem != NULL))
                        {
                            BSTR bstrTagName;
                            CString sTempTagName;
                            if (!FAILED (pElem->get_tagName
                                         (&bstrTagName)))
                            {
                                sTempTagName = bstrTagName;
                                sTempTagName.MakeLower ();
                                //AfxMessageBox (sTempTagName);
                                SysFreeString (bstrTagName);
                            }
                            if (sTempTagName == _T ("input"))
                            {
                                IHTMLInputElement * pInputElem
                                                  = NULL;
                                hr = pElemDispatch->QueryInterface
                                     (IID_IHTMLInputElement,
                                      (void**)&pInputElem);

                                if (SUCCEEDED (hr) &&
                                    (pInputElem != NULL))
                                {
                                    BSTR bstrType, bstrName, bstrValue;
                                    CString sTempType, sTempName,
                                            sTempValue;
                                    
                                    if (!FAILED
                                        (pInputElem->get_type
                                         (&bstrType)))
                                    {
                                        sTempType = bstrType;
                                        SysFreeString (bstrType);
                                    }
                                    if (!FAILED
                                        (pInputElem->get_name
                                         (&bstrName)))
                                    {
                                        sTempName = bstrName;
                                        SysFreeString (bstrName);
                                    }
                                    if (!FAILED
                                        (pInputElem->get_value
                                         (&bstrValue)))
                                    {
                                        sTempValue = bstrValue;
                                        SysFreeString (bstrValue);
                                    }
                                    // Do the comparison here!
                                    bool bMatches = true;
                                    if (bMatches && bType)
                                    {
                                        if (!StringHelper::
                                            WildcardCompareNoCase
                                            (sTypeToLook,
                                             sTempType))
                                            bMatches = false;
                                    }
                                    if (bMatches && bName)
                                    {
                                        if (!StringHelper::
                                            WildcardCompareNoCase
                                            (sNameToLook,
                                             sTempName))
                                            bMatches = false;
                                    }
                                    if (bMatches && bValue)
                                    {
                                        if (!StringHelper::
                                            WildcardCompareNoCase
                                            (sValueToLook,
                                             sTempValue))
                                            bMatches = false;
                                    }

                                    if (bMatches)
                                    {
                                        // No need to search more!
                                        bSearch = false;

                                        if (bSetCheck)
                                        {
                                            if (bNewCheckValue)
                                                pInputElem->
                                                put_checked
                                                (VARIANT_TRUE);
                                            else
                                                pInputElem->
                                                put_checked
                                                (VARIANT_FALSE);
                                        }
                                        if (bChangeValue)
                                        {
                                            CString sTemp
                                                    (sNewValue);
                                            BSTR bstrNewValue =
                                            sTemp.AllocSysString ();
                                            pInputElem->
                                            put_value
                                            (bstrNewValue);
                                            SysFreeString
                                            (bstrNewValue);
                                        }
                                        if (bSelect)
                                            pInputElem->select ();

                                        if (bClick)
                                            pElem->click ();
                                    }
                                    pInputElem->Release ();
                                }
                            }
                            pElem->Release ();
                        }
                        pElemDispatch->Release ();
                    }
                }
                pColl->Release ();
            }
            pHtmlDoc->Release();
        }
        pHtmlDocDispatch->Release ();
    }

    if (bSearch == false)
        return true;

    return false;
}

Some Points that You Should Keep in Mind

  • The attached source code includes a few more functions to fill in the forms on Web pages.
  • The source code above does not search inside Frame objects. However, this should not be a hard task to implement. The frame objects should be enumerated and then the IHTMLDocument2 of each frame should be checked recursively. Maybe I will implement this in the next update.
  • To use an IHTMLInputElement object, you need at least Internet Explorer 5.0.
  • I am not an expert Internet Explorer programmer. I recently needed to automate Internet Explorer and wrote these functions for that purpose. I wrote most of them on my own, and they may have some bugs in them.
  • Please do not ask me why I used MFC, not ATL. The reason is that I don't know ATL. However, if you guys help me to convert this code into ATL, I would be very glad. I may even start learning ATL.
  • If you have any comments, suggestions, or corrections, please mail me at emindemirhan@yahoo.com.
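
The recursive frame search mentioned in the list above can be outlined without any COM code. The sketch below works on a hypothetical in-memory model: DocumentModel and ContainsTag are invented names, and the model only stores tag names. With real browser objects, the nested documents would have to be obtained from each frame of the IHTMLDocument2, which is not shown here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of a document: the tag names of its own elements
// plus the documents hosted in its frames.
struct DocumentModel
{
    std::vector<std::string> tagNames;
    std::vector<DocumentModel> frames;
};

// Returns true if the document, or any document nested in its frames,
// contains an element with the given tag name. The search is depth-first
// and stops at the first hit, like FindAnchor and FindInput.
bool ContainsTag(const DocumentModel& doc, const std::string& tag)
{
    for (const std::string& name : doc.tagNames)
        if (name == tag)
            return true;

    for (const DocumentModel& frame : doc.frames)
        if (ContainsTag(frame, tag))
            return true;

    return false;
}
```

The same shape would apply to FindAnchor and FindInput: run the existing per-document loop first, then repeat the call for the document of every frame until one of them reports a match.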

Downloads

Download source files - 8 Kb


Comments

  • I've been looking for this for a while

    Posted by Legacy on 01/09/2004 12:00am

    Originally posted by: Nik Williams

    Oh wow, thanks so much! I've been looking on how to do this for quite some time but I could not get the pieces all together. Thanks for posting this VERY helpful info!

  • Looks Great did you try it for HTTP forms also

    Posted by Legacy on 10/06/2003 12:00am

    Originally posted by: Ashok Mishra

    It is great for HTML parsing I was wondering , whether, you needed to trap the HTTp401 responses and do the formfill for login/password or do you have any idea how to do that.

    Thanks,
    Ashok

  • Great,realy thanks for you,and one question

    Posted by Legacy on 08/18/2003 12:00am

    Originally posted by: henryzc

    i has finded this article for a month,haha,thanks for your work,but i also have a question.i use the web control in my project,and i don't want to load any html files at first,how could i insert html code into the web control,and save the html file?waitting for your reply

  • Example needed

    Posted by Legacy on 07/24/2003 12:00am

    Originally posted by: Fred Ruessel

    Hi,
    can you create an example project?
    I can not compile it because IHTMLInputElement is not defined, and I don't know where to import the definition from.

    Regards,
    Fred

  • Exactly what I was looking for!

    Posted by Legacy on 07/16/2003 12:00am

    Originally posted by: John Beckett

    Where was you when I needed this superb article 2 weeks ago? :)


  • Looks Good

    Posted by Legacy on 07/14/2003 12:00am

    Originally posted by: Seven Up

    This just happens to be exactly what I have been looking for! Great Job !!
