Display a Web Page in a Plain C Win32 Application

Environment: Win32, VC6 (recommended, but not required), IE 4.0 or better (or some browser that supports OLE in-place automation).

Introduction

There are numerous examples that demonstrate how to embed Internet Explorer in your own window. But these examples typically use Microsoft Foundation Classes (MFC), .NET, C#, or at least the Windows Template Library (WTL) because those frameworks have pre-fabricated “wrappers” to easily give you an “HTML control” to embed in your window. If you’re trying to use plain C, without MFC, WTL, .NET, C#, or even any C++ code at all, then there is a dearth of examples and information on how to deal with OLE/COM objects such as IE’s IWebBrowser2. Here is an article and working example in C to specifically show you what you need to do in order to embed IE in your own window, and more generally, show you how to interact with OLE/COM objects and create your own objects in plain C.

In fact, I’ve even wrapped up this plain C code into a Dynamic Link Library (DLL) so that you can simply call one function to display a Web page or some HTML string in a window you create. You won’t even need to get your hands dirty with OLE/COM (unless you plan to modify the source of the DLL).

With standard Win32 controls such as Static, Edit, Listbox, Combobox, and so forth, you obtain a handle to the control (such as an HWND) and pass messages (via SendMessage) to it to manipulate it. Also, the control passes messages back to you (for example, by putting them in your own message queue, and you fetch them with GetMessage) when it wants to inform you of something or give you some data.

Not so with an OLE/COM object. You don’t pass messages back and forth. Rather, the COM object gives you some pointers to certain functions that you can call to manipulate the object. For example, the IWebBrowser2 object will give you a pointer to a function you can call to cause the browser to load and display a Web page in one of your windows. And, if the COM object needs to notify you of something or pass data to you, you will be required to write certain functions in your program, and provide (to the COM object) pointers to those functions so the object can call those functions. In other words, you sort of need your own embedded COM object(s) inside your program. Most of the real hassle in C will involve these embedded COM objects that you may need to provide so some other object can call your functions to fully interact with your program.

In conclusion, you call functions in the object, and it calls functions in your program. It’s sort of like calling functions in a DLL, but with the DLL able to call functions inside your C program too—sort of like with a “callback.” But, unlike with a DLL, you don’t use LoadLibrary() and GetProcAddress() to obtain the pointers to the COM object’s functions. Instead, you’ll use a different operating system function to get a pointer to an object, and then use that object to obtain pointers to its functions.

An OLE/COM Object and Its VTable

So, at its simplest, a COM object itself is really just a C structure that contains pointers to functions that someone can call. These pointers to functions must be the first thing inside of the structure. There can be other data elements in the structure later on, but the pointers must be first. This is a very important thing to note (because we’ll be creating our own OLE/COM objects inside our C example, and you’ll need to understand how to declare and set up such a COM “structure”).

Actually, the first thing inside of the object will be a pointer to another structure that actually contains the pointers to the functions. In essence, this second structure is just an array of pointers to various functions. We refer to this array as a “VTable.” Also, the first three pointers in the VTable must be to three specific functions. We’ll call them QueryInterface(), AddRef(), and Release(). (When you create your own objects, you can name the functions anything you want, but they must take certain args, and do certain things, and return a certain value. We’ll get into that later.) Here are the function definitions for the three functions:


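HRESULT STDMETHODCALLTYPE QueryInterface(IUnknown FAR* This,
                          REFIID riid, LPVOID FAR* ppvObj);
ULONG   STDMETHODCALLTYPE AddRef(IUnknown FAR* This);
ULONG   STDMETHODCALLTYPE Release(IUnknown FAR* This);

(Note that only QueryInterface returns an HRESULT. AddRef and Release return a ULONG, which is the object's new reference count.)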

Right now, let’s not worry about the details of what those args are, and what an HRESULT is and means.

The first thing inside of a COM object must be a pointer to a VTable that contains pointers to at least those three functions, and the pointers must be named QueryInterface, AddRef, and Release. (These three are referred to as the IUnknown interface.) And, QueryInterface must be defined as a pointer to the QueryInterface function we defined above. AddRef must be defined as a pointer to the AddRef function. And, Release must be defined as a pointer to the Release function.

So, here is the simplest example of the structure for an OLE/COM object, which I’ll just call “MYOBJECT.” We’ll first define its VTable in a structure called MYOBJECT_VTBL, and then define the structure of the MYOBJECT object.


/* This is the VTable for MYOBJECT. It must start out with at
 * least the following three pointers.
 */
struct MYOBJECT_VTBL {
    HRESULT (STDMETHODCALLTYPE *QueryInterface)(IUnknown FAR* This,
                                REFIID riid, LPVOID FAR* ppvObj);
    ULONG   (STDMETHODCALLTYPE *AddRef)(IUnknown FAR* This);
    ULONG   (STDMETHODCALLTYPE *Release)(IUnknown FAR* This);
    /* There would be other pointers here if this object had more
     * functions. */
};

/* This is the MYOBJECT structure */
struct MYOBJECT {
    /* The first thing in the object must be a pointer
     * to its VTable! */
    struct MYOBJECT_VTBL *lpVtbl;

    /* The Object may have other embedded objects here, or some
     * private data.
     * But all that must be after the above VTable pointer.
     */
};

As you can see, a COM object always starts with a pointer to its VTable, and the first three pointers in the VTable will always be named QueryInterface, AddRef, and Release. What other functions are in its VTable, and what the names of their pointers are, depends upon what type of object it is. For example, the browser object will undoubtedly have different functions than some object that plays music. But, all OLE/COM objects begin with a pointer to their VTable, and the first three VTable pointers are to the object’s QueryInterface, AddRef, and Release functions. That is the law. Obey it.

Of course, when you create your own COM Object, you’ll put the three “IUnknown” functions in your program. For example, maybe you’ll have three functions named MyQueryInterface(), MyAddRef(), and MyRelease() somewhere as so:


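HRESULT STDMETHODCALLTYPE MyQueryInterface(IUnknown FAR* This,
                          REFIID riid, LPVOID FAR* ppvObj)
{
    /* Placeholder only: a real implementation checks riid first. */
    *ppvObj = This;
    return(S_OK);
}

ULONG STDMETHODCALLTYPE MyAddRef(IUnknown FAR* This)
{
    return(1);    /* placeholder reference count */
}

ULONG STDMETHODCALLTYPE MyRelease(IUnknown FAR* This)
{
    return(1);    /* placeholder reference count */
}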

And of course, you need to initialize your COM object to store pointers to those functions in its VTable. Here’s an example of us creating a MYOBJECT struct named Example with its VTable named ExampleTable, and initializing it:


int main()
{
    struct MYOBJECT      Example;
    struct MYOBJECT_VTBL ExampleTable;

    /* Fill in the VTable, then point the object at it. */
    ExampleTable.QueryInterface = MyQueryInterface;
    ExampleTable.AddRef         = MyAddRef;
    ExampleTable.Release        = MyRelease;
    Example.lpVtbl = &ExampleTable;

    return 0;
}

We have now created an OLE/COM object (that is, Example is that object), fully initialized with a VTable containing pointers to its functions. Now, all we need do is pass a pointer to this struct to some operating system function, and then some other object (such as the browser object) will be able to call MyQueryInterface(), MyAddRef(), and MyRelease() within our C executable. That’s not so bad, right?
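To make the "some other object calls our functions" idea concrete, here is a minimal sketch that builds without windows.h. The names my_hresult, MY_S_OK, hypothetical_consumer(), and demo_callbacks() are my own stand-ins for illustration (they are not part of any Windows header), and the consumer function merely plays the role that the browser object would play. All it is given is a pointer to our struct, and that is enough for it to reach our three functions.

```c
#include <assert.h>

/* Stand-ins for HRESULT and S_OK so the sketch builds anywhere (assumption). */
typedef long my_hresult;
#define MY_S_OK 0L

struct MYOBJECT;   /* forward declaration */

struct MYOBJECT_VTBL {
    my_hresult    (*QueryInterface)(struct MYOBJECT *This, const void *riid, void **ppvObj);
    unsigned long (*AddRef)(struct MYOBJECT *This);
    unsigned long (*Release)(struct MYOBJECT *This);
};

struct MYOBJECT {
    const struct MYOBJECT_VTBL *lpVtbl;   /* must be the first field */
};

static int g_calls;   /* counts callbacks; for demonstration only */

static my_hresult MyQueryInterface(struct MYOBJECT *This, const void *riid, void **ppvObj)
{
    (void)riid;            /* placeholder: a real object inspects riid */
    *ppvObj = This;
    ++g_calls;
    return MY_S_OK;
}

static unsigned long MyAddRef(struct MYOBJECT *This)  { (void)This; ++g_calls; return 1; }
static unsigned long MyRelease(struct MYOBJECT *This) { (void)This; ++g_calls; return 1; }

static const struct MYOBJECT_VTBL ExampleTable = {
    MyQueryInterface, MyAddRef, MyRelease
};

/* Plays the role of "some other object". It knows nothing about our
 * function names; it only follows the VTable pointer. */
static void hypothetical_consumer(struct MYOBJECT *obj)
{
    void *same = 0;
    obj->lpVtbl->AddRef(obj);
    obj->lpVtbl->QueryInterface(obj, 0, &same);
    obj->lpVtbl->Release(obj);
}

/* Returns how many of our functions the consumer called. */
int demo_callbacks(void)
{
    struct MYOBJECT Example = { &ExampleTable };
    g_calls = 0;
    hypothetical_consumer(&Example);
    return g_calls;
}
```

Calling demo_callbacks() from your own main() exercises all three callbacks. The global counter is there only so the result can be observed; real callbacks would keep their state inside the object.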

QueryInterface(), AddRef(), and Release()

Well, there is more to know. Let’s take a look at the definitions of those three functions. You’ll notice that the first arg to each is a pointer to an IUnknown struct. Actually, we need to change these definitions as so for our MYOBJECT object:


HRESULT STDMETHODCALLTYPE QueryInterface(struct MYOBJECT FAR* This,
                          REFIID riid, LPVOID FAR* ppvObj);
ULONG   STDMETHODCALLTYPE AddRef(struct MYOBJECT FAR* This);
ULONG   STDMETHODCALLTYPE Release(struct MYOBJECT FAR* This);

And our MYOBJECT_VTBL needs to be as follows:


struct MYOBJECT;   /* forward declaration, so the pointers below can refer to it */

struct MYOBJECT_VTBL {
    HRESULT (STDMETHODCALLTYPE *QueryInterface)(struct MYOBJECT FAR* This,
                                REFIID riid, LPVOID FAR* ppvObj);
    ULONG   (STDMETHODCALLTYPE *AddRef)(struct MYOBJECT FAR* This);
    ULONG   (STDMETHODCALLTYPE *Release)(struct MYOBJECT FAR* This);
};

You may ask, “Are you telling me that the first arg passed to each of my three functions is going to be a pointer to some MYOBJECT struct?” Yes, indeed. For example, when we give the browser object a pointer to our Example MYOBJECT, and the browser object uses that struct to call MyQueryInterface(), the first arg passed will be that pointer to Example. In this way, we know which exact struct was used to call MyQueryInterface(). Furthermore, we could add data fields to the end of the struct to store per-instance data for the struct. So, we never need reference any global data within our functions, and can make them fully re-entrant, able to be used with plenty of MYOBJECT structs, should we need more than one.

“Wait a minute! This is starting to look suspiciously like C++! It looks like the invisible ‘this’ pointer of a C++ class!” you scream. Damn right. That’s exactly what COM is built upon. So, you’re effectively creating C++ classes in your C code (but without some of the other baggage/bloat of C++).

In conclusion, when one of your functions is called by somebody, the very first arg is always a pointer to whatever object (that is, structure) was used to obtain the VTable. This mimics the behavior of a C++ class, and a COM Object is analogous to that.

After you obtain a pointer to some COM object (or structure), such as the browser object, you’re going to do the same thing when you call the object’s functions. You’ll find a pointer to some desired function somewhere within that object’s VTable. And when you call the function, the first arg you pass will always be the pointer to that COM object.
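Here is a small sketch of why that first "This" arg matters. Two objects share one VTable (and therefore the very same functions), yet each keeps its own count because the functions work only on the struct they are handed. The COUNTED names and demo_per_instance() are hypothetical, made up for this illustration, and plain C types stand in for HRESULT and ULONG so that it builds without windows.h.

```c
#include <assert.h>

struct COUNTED;   /* forward declaration */

struct COUNTED_VTBL {
    long          (*QueryInterface)(struct COUNTED *This, const void *riid, void **ppvObj);
    unsigned long (*AddRef)(struct COUNTED *This);
    unsigned long (*Release)(struct COUNTED *This);
};

struct COUNTED {
    const struct COUNTED_VTBL *lpVtbl;   /* always first */
    unsigned long refs;                  /* per-instance data, after the VTable pointer */
};

static long Counted_QueryInterface(struct COUNTED *This, const void *riid, void **ppvObj)
{
    (void)riid;          /* placeholder: a real object inspects riid */
    *ppvObj = This;
    return 0;
}

/* No globals: everything these functions touch is reached through This. */
static unsigned long Counted_AddRef(struct COUNTED *This)  { return ++This->refs; }
static unsigned long Counted_Release(struct COUNTED *This) { return --This->refs; }

static const struct COUNTED_VTBL CountedTable = {
    Counted_QueryInterface, Counted_AddRef, Counted_Release
};

/* Returns a.refs * 10 + b.refs so that both counts can be checked at once. */
int demo_per_instance(void)
{
    struct COUNTED a = { &CountedTable, 1 };
    struct COUNTED b = { &CountedTable, 1 };

    a.lpVtbl->AddRef(&a);     /* a.refs becomes 2 */
    a.lpVtbl->AddRef(&a);     /* a.refs becomes 3 */
    a.lpVtbl->Release(&a);    /* a.refs becomes 2 */
    /* b is never touched, so b.refs stays 1 */

    return (int)(a.refs * 10 + b.refs);
}
```

Note that the caller always writes the object twice: once to find the function (a.lpVtbl->AddRef) and once as the first arg (&a). That repetition is exactly what C++ hides behind its invisible "this" pointer.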

Generic Datatypes (or, BSTR, VARIANT)

You may be thinking that things are a little complex, but not too bad. Well, there’s another wrinkle. Most OLE/COM objects are designed to be called by a program written in most any language. To that end, the object tries to abstract datatypes. What do I mean by this? Well, take a string in ANSI C. A C string is a series of 8-bit bytes ending with a 0 byte. But that isn’t how strings are stored in Pascal. A Pascal string starts with a byte that tells how many more bytes follow. In other words, a Pascal string starts with a length byte, and then the rest of the bytes. There is no terminating 0 byte. And what about UNICODE versus ANSI? With UNICODE, every character in the string is actually 2 bytes (or, a short).

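So, to support any language (as well as extensions in each language such as UNICODE), most COM objects instead employ generic datatypes that accommodate most every language. For example, if a COM object is passed a string, the string will often take the form of a BSTR. What is a BSTR? Well, it is sort of a UNICODE Pascal string. Every character is 2 bytes, and the characters are preceded by a 4-byte length prefix that tells how many bytes of character data follow. (Unlike a Pascal string, a BSTR is also followed by a terminating 0 character, which is not counted in that length. And a BSTR pointer points at the first character rather than at the length prefix.) This accommodates the “string” datatype of most every language/extension. But, it also means that sometimes you’ll need to reformat your C strings to a BSTR when you want to pass a string to some COM object’s functions. Fortunately, there is an operating system function called SysAllocString to help do that. (SysAllocString expects a UNICODE string, so an ANSI C string must first be converted, for example with MultiByteToWideChar. When you’re done with the BSTR, you free it with SysFreeString.)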
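To picture the memory layout of a BSTR, here is a rough sketch that builds a buffer shaped like one: a 4-byte count of character bytes, then the 2-byte characters, then a 2-byte 0 terminator. This is only an illustration of the layout. The bstr_like_* functions are hypothetical names of my own, they handle only 7-bit ASCII input, and in a real program you would let SysAllocString and SysFreeString do this work rather than allocate the memory yourself.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Build a BSTR-shaped buffer from a 7-bit ASCII C string.
 * Like a real BSTR pointer, the returned pointer addresses the first
 * character; the length prefix sits in the 4 bytes just before it. */
static uint16_t *bstr_like_from_ascii(const char *s)
{
    size_t n = strlen(s), i;
    uint32_t bytes = (uint32_t)(n * 2);            /* character bytes, terminator excluded */
    unsigned char *block = malloc(4 + (n + 1) * 2);
    uint16_t *chars;

    if (!block) return NULL;
    memcpy(block, &bytes, 4);                      /* the length prefix */
    chars = (uint16_t *)(block + 4);
    for (i = 0; i < n; ++i)
        chars[i] = (unsigned char)s[i];            /* widen each char to 16 bits */
    chars[n] = 0;                                  /* the terminator */
    return chars;
}

/* Read the length prefix back (compare with SysStringByteLen on a real BSTR). */
static uint32_t bstr_like_byte_len(const uint16_t *b)
{
    uint32_t bytes;
    memcpy(&bytes, (const unsigned char *)b - 4, 4);
    return bytes;
}

static void bstr_like_free(uint16_t *b)
{
    if (b) free((unsigned char *)b - 4);           /* free from the start of the block */
}

/* Returns 1 if the layout is as described above. */
int demo_bstr_like(void)
{
    uint16_t *b = bstr_like_from_ascii("http://");   /* 7 characters */
    int ok;

    if (!b) return 0;
    ok = bstr_like_byte_len(b) == 14 && b[0] == 'h' && b[7] == 0;
    bstr_like_free(b);
    return ok;
}
```

Because the length is stored up front, a BSTR may legally contain embedded 0 characters, which is one reason COM functions want a BSTR rather than a plain wide C string.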

And there are other “generic datatypes” too, such as a generic structure that holds a numeric (for example, DWORD) datatype in a certain way that accommodates just about any language.

In fact, some COM objects’ functions can operate upon a variety of datatypes, so they employ another structure called a VARIANT. For example, let’s say you have a Printer object that has a Print() function. And, let’s say that this Print() function can be passed either a string, or a DWORD, and maybe a variety of other datatypes, and it will print whatever it is passed, regardless. For example, if passed a string, it will print the characters of that string. If passed a DWORD, it will first do something akin to calling sprintf(myBuffer, "%lu", myDword) and printing out the resulting string. Now, this Print() function needs some way to know whether it is being passed a string or a DWORD. So, we store the string (as a BSTR), or the numeric value, in a VARIANT struct. Then, we set the first field of this VARIANT struct (its vt field) to VT_BSTR if we stored a BSTR, or to VT_UI4 if we stored a DWORD. (VT_UI4 means a 4-byte unsigned integer. VT_DECIMAL is something else entirely: a 96-bit decimal number.) That way, the Print() function can be written to support being passed many different types of data, and it can determine what type of data is being passed to it by inspecting the VARIANT’s vt field.
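The Print() idea can be sketched with a cut-down stand-in for a VARIANT: a type tag followed by a union. MY_VARIANT, the MY_VT_* constants, and this Print() are hypothetical names for illustration only. The two constants use the same numeric values as the real VT_BSTR and VT_UI4, but the real VARIANT has more fields and stores a BSTR (in bstrVal) rather than the plain C string I use here to keep things short.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Same numeric values as the real VT_BSTR and VT_UI4 constants. */
enum { MY_VT_BSTR = 8, MY_VT_UI4 = 19 };

/* Simplified stand-in for VARIANT: a tag saying which union member is valid. */
struct MY_VARIANT {
    unsigned short vt;
    union {
        const char   *strVal;   /* a real VARIANT would hold a BSTR here */
        unsigned long ulVal;
    } u;
};

/* "Prints" into a buffer so the result can be inspected.
 * Returns the number of characters written, or -1 for an unsupported type. */
static int Print(const struct MY_VARIANT *v, char *out, size_t outSize)
{
    switch (v->vt) {
    case MY_VT_BSTR: return snprintf(out, outSize, "%s", v->u.strVal);
    case MY_VT_UI4:  return snprintf(out, outSize, "%lu", v->u.ulVal);
    default:         return -1;
    }
}

/* Returns 1 if both kinds of argument were printed correctly. */
int demo_variant(void)
{
    struct MY_VARIANT s, n;
    char a[32], b[32];

    s.vt = MY_VT_BSTR;  s.u.strVal = "hello";
    n.vt = MY_VT_UI4;   n.u.ulVal  = 42;

    Print(&s, a, sizeof a);
    Print(&n, b, sizeof b);
    return strcmp(a, "hello") == 0 && strcmp(b, "42") == 0;
}
```

The caller's only obligation is to keep the tag and the union member in agreement. Setting vt to one type while filling in a different member is the classic VARIANT bug.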

In conclusion, when dealing with objects such as the browser object, some of its functions may require you to convert/stuff your data into one of these generic datatypes (structs), and then also perhaps wrap that in a VARIANT struct.

Your IStorage/IOleInPlaceFrame/IOleClientSite/IOleInPlaceSite Objects

Now that you’ve got some background on COM objects, let’s examine what you need to host the browser object. You may wish to peruse the source code file CWebPage.c as you read the following discussion.

First of all, the browser object expects you to provide four objects. You need an IStorage, IOleInPlaceFrame, IOleClientSite, and an IOleInPlaceSite object. That’s four structs. And each has its own VTable. All of these objects (and their VTables) are defined in the include files that ship with your C compiler. So, they each have their own specific set of functions in the VTable.

Let’s just examine the IStorage object. It has a VTable that is defined as an IStorageVtbl struct. Essentially, it’s an array of 18 pointers to functions that you must supply in your program. (In other words, you have to write 18 specific functions just for your IStorage object alone. That’s why people use things such as MFC, .NET, C#, and WTL to ease this job.) Of course, the first three functions will be the QueryInterface(), AddRef(), and Release() functions for your IStorage object. In CWebPage.c, I’ve named those three functions Storage_QueryInterface(), Storage_AddRef(), and Storage_Release(). In fact, I’ve named the other 15 functions starting with Storage_. They have names such as Storage_OpenStream(), Storage_CopyTo(), and so forth. Your IStorage functions are called by the browser object to manage storing/loading data to disk. What the specific purpose of each of those functions is, and what arguments are passed to it, you can check for yourself by looking through the documentation on MSDN about the IStorage object.

So, to create the VTable for my IStorage object, the easiest thing to do is just declare it as a global, and initialize it with the pointers to my 18 functions. Here is how I did that in my C source:


IStorageVtbl MyIStorageTable = {Storage_QueryInterface,
Storage_AddRef,
Storage_Release,
Storage_CreateStream,
Storage_OpenStream,
Storage_CreateStorage,
Storage_OpenStorage,
Storage_CopyTo,
Storage_MoveElementTo,
Storage_Commit,
Storage_Revert,
Storage_EnumElements,
Storage_DestroyElement,
Storage_RenameElement,
Storage_SetElementTimes,
Storage_SetClass,
Storage_SetStateBits,
Storage_Stat};

So, I now have a global variable named MyIStorageTable, which is a properly initialized VTable for my IStorage object.

Next, I need to create my IStorage object. Again, the easiest thing to do is just declare it as a global and initialize it. Because there is only one field in an IStorage, and that’s the pointer to its VTable, here it is:


IStorage MyIStorage = { &MyIStorageTable };

So, I now have a global variable named MyIStorage, which is a properly initialized IStorage object. It is ready to be passed to some operating system function that will give it to the browser object so it can call any of the above 18 functions. Of course, you’ll find those 18 functions in CWebPage.c, too. (But mostly, they do nothing because these functions aren’t actually utilized by the browser object. Nevertheless, we really do have to provide at least some stubs just in case some wiseguy takes our IStorage object and tries to call our functions.)

In CWebPage.c, you’ll see that I also declare my other objects’ VTables as globals. But I also happen to add an extra field to some of those other objects for my own private data. Notice that I add the data at the end of the object, after the pointers to any VTable. That is very important. The VTable pointer must come first. And (unlike with the IStorage object, which has no extra data), the extra data that is stored is window-specific. So, I need a different IOleInPlaceFrame, IOleClientSite, and IOleInPlaceSite struct per window. For this reason, I’ll allocate them when I create a window. (Alternately, I could defer setting each one up until the first time one of its functions is called. But AddRef’s real job is just to increment the object’s reference count, and deferring the setup would make it more difficult to initialize the object, so I don’t do that.)
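The "extra data after the VTable pointer" trick can be sketched as follows. A standard object struct is wrapped as the first member of a larger struct, and the functions cast the This pointer back to the larger struct to reach the private field. All of the names here (SITE, SITE_EX, create_site_for_window(), and an int standing in for an HWND) are assumptions made up for this illustration. CWebPage.c does the real thing with the real OLE structs.

```c
#include <assert.h>
#include <stdlib.h>

struct SITE;   /* forward declaration */

struct SITE_VTBL {
    long          (*QueryInterface)(struct SITE *This, const void *riid, void **ppvObj);
    unsigned long (*AddRef)(struct SITE *This);
    unsigned long (*Release)(struct SITE *This);
    long          (*GetWindow)(struct SITE *This, int *window);
};

/* The "standard" object, as a header would define it: VTable pointer only. */
struct SITE { const struct SITE_VTBL *lpVtbl; };

/* Our extended object. The standard struct must be the first member so
 * that a pointer to one is also a valid pointer to the other. */
struct SITE_EX {
    struct SITE site;
    int         window;   /* private per-window data (stand-in for an HWND) */
};

static long Site_QueryInterface(struct SITE *This, const void *riid, void **ppvObj)
{
    (void)riid;           /* placeholder: a real object inspects riid */
    *ppvObj = This;
    return 0;
}
static unsigned long Site_AddRef(struct SITE *This)  { (void)This; return 1; }
static unsigned long Site_Release(struct SITE *This) { (void)This; return 1; }

static long Site_GetWindow(struct SITE *This, int *window)
{
    /* The caller knows only about struct SITE; we know it is really a SITE_EX. */
    *window = ((struct SITE_EX *)This)->window;
    return 0;
}

static const struct SITE_VTBL SiteTable = {
    Site_QueryInterface, Site_AddRef, Site_Release, Site_GetWindow
};

/* Allocate one object per window, as the text describes. */
static struct SITE *create_site_for_window(int window)
{
    struct SITE_EX *ex = malloc(sizeof *ex);
    if (!ex) return NULL;
    ex->site.lpVtbl = &SiteTable;
    ex->window = window;
    return &ex->site;
}

/* Returns 1 if each object reports the window it was created for. */
int demo_extended_object(void)
{
    struct SITE *a = create_site_for_window(101);
    struct SITE *b = create_site_for_window(202);
    int wa = 0, wb = 0, ok = 0;

    if (a && b) {
        a->lpVtbl->GetWindow(a, &wa);
        b->lpVtbl->GetWindow(b, &wb);
        ok = (wa == 101 && wb == 202);
    }
    free(a);   /* same address as the enclosing SITE_EX */
    free(b);
    return ok;
}
```

If the private field were placed before the embedded struct, the VTable pointer would no longer be first, and any caller following it would jump through garbage.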

You can consult your MSDN documentation to learn what the functions in your IOleInPlaceFrame, IOleClientSite, and IOleInPlaceSite VTables are supposed to do, and what is passed to them. In CWebPage.c, I employ only as much functionality as is needed to display a Web page in a window of my own creation.

The Browser Object

After you’ve set up those four preceding objects, you’re ready to obtain a browser object. You can do that with a call to the operating system function OleCreate(). (But first, you should call OleInitialize() once to make sure that the OLE system is initialized for your process.)

The function EmbedBrowserObject() is where we obtain a browser object and embed it into a particular window. We need do this only once, so we call EmbedBrowserObject right when we create the window.

OleCreate() is passed numerous args, one of them being a pointer to our IOleClientSite object (which must be fully initialized before we pass it to OleCreate), and another being a pointer to our initialized IStorage object. The first two args we pass to OleCreate tell it that we want a browser object created and returned to us. We also pass the address of a pointer variable, which is where OleCreate will store the pointer to the new object.

If all goes well, OleCreate will return a pointer to a newly created browser object that we can embed in our window. The object is not yet embedded. It is merely created. But we’ll get to that embedding momentarily.

So, how do we embed the browser object? We need to call one of the browser object’s functions. No problem. We just got a pointer to its object, and we know that the first field in the object (that is, lpVtbl) is a pointer to its VTable. So we just grab the pointer to the desired function, and use it to call that function. In fact, we call several browser object functions that way. We call SetHostNames() to pass the browser the name of our application (so it can display that in its own message boxes). Then, we call DoVerb() to send it a command that tells it to embed itself in our window (OLEIVERB_SHOW). Of course, we also pass our window handle. Now, while we’re inside of this call to DoVerb(), the browser object is going to call some of our IOleClientSite functions. It will have called several of them before DoVerb() returns.

The browser object (or struct) has another object called an IWebBrowser2 associated with it. And, the IWebBrowser2 has its own VTable of functions. We want to get a pointer to this other object so we can get its VTable and call some of its functions. Needless to say, the very first field in the IWebBrowser2 object will be a pointer to its VTable. So how do we get a pointer to this browser object’s IWebBrowser2 object? Is it embedded inside of the browser object (struct)? Maybe, but maybe not. Is it contained in some internal list inside the browser object? Maybe, but maybe not. So how do we get access to it? We ask the browser object to give us the pointer to it. And how do we do that? We call the browser object’s QueryInterface function, and ask it to return a pointer to its associated IWebBrowser2 object. And that is the whole purpose of an object’s QueryInterface function—to return other objects associated with that object. Remember that all OLE/COM objects have a QueryInterface function. It’s the first function in the object’s VTable. In fact, the IWebBrowser2 object will have its own QueryInterface function to return pointers to other objects associated with it. The second arg we pass to QueryInterface tells what type of object we wish returned. Here we want a type of IID_IWebBrowser2.
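What a QueryInterface function does on the inside can be sketched like this. One "browser" struct owns two objects, and either one will hand out a pointer to the other when asked for it by identifier. Everything here is a stand-in: my_iid replaces the real 128-bit IID, the MY_* codes replace the real HRESULT values, and BROWSER/IFACE are hypothetical names. How the real browser object arranges its internals is not something I know, and (as the text says) it is not something a caller needs to know.

```c
#include <assert.h>

enum { MY_S_OK = 0, MY_E_NOINTERFACE = -1 };   /* stand-ins, not the real HRESULT values */

typedef struct { unsigned long id; } my_iid;   /* stand-in for a 128-bit IID */
static const my_iid MY_IID_Base     = { 1 };
static const my_iid MY_IID_Browser2 = { 2 };
static const my_iid MY_IID_Other    = { 3 };   /* something the object does not support */

struct BROWSER;
struct IFACE;

struct IFACE_VTBL {
    int           (*QueryInterface)(struct IFACE *This, const my_iid *riid, void **ppvObj);
    unsigned long (*AddRef)(struct IFACE *This);
    unsigned long (*Release)(struct IFACE *This);
};

struct IFACE {
    const struct IFACE_VTBL *lpVtbl;   /* first, as always */
    struct BROWSER *owner;             /* private data: who we belong to */
};

struct BROWSER {
    struct IFACE  base;       /* the object we were first handed */
    struct IFACE  browser2;   /* an associated object, reachable via QueryInterface */
    unsigned long refs;
};

static unsigned long Iface_AddRef(struct IFACE *This)  { return ++This->owner->refs; }
static unsigned long Iface_Release(struct IFACE *This) { return --This->owner->refs; }

static int Iface_QueryInterface(struct IFACE *This, const my_iid *riid, void **ppvObj)
{
    struct BROWSER *b = This->owner;

    if (riid->id == MY_IID_Base.id)          *ppvObj = &b->base;
    else if (riid->id == MY_IID_Browser2.id) *ppvObj = &b->browser2;
    else { *ppvObj = 0; return MY_E_NOINTERFACE; }

    Iface_AddRef(This);   /* each pointer handed out counts as a reference */
    return MY_S_OK;
}

static const struct IFACE_VTBL IfaceTable = {
    Iface_QueryInterface, Iface_AddRef, Iface_Release
};

/* Returns 1 if asking, using, and releasing behaves as described. */
int demo_query_interface(void)
{
    struct BROWSER b;
    void *p = 0;

    b.base.lpVtbl     = &IfaceTable;  b.base.owner     = &b;
    b.browser2.lpVtbl = &IfaceTable;  b.browser2.owner = &b;
    b.refs = 1;

    /* Ask the base object for its associated "browser2" object. */
    if (b.base.lpVtbl->QueryInterface(&b.base, &MY_IID_Browser2, &p) != MY_S_OK) return 0;
    if (p != (void *)&b.browser2 || b.refs != 2) return 0;

    /* Done with it, so give the reference back. */
    ((struct IFACE *)p)->lpVtbl->Release((struct IFACE *)p);

    /* Asking for something unsupported fails cleanly and returns no pointer. */
    if (b.base.lpVtbl->QueryInterface(&b.base, &MY_IID_Other, &p) != MY_E_NOINTERFACE) return 0;
    return p == 0 && b.refs == 1;
}
```

The important habit on the calling side is to check the return value before touching the returned pointer. A failed QueryInterface gives you nothing to call.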

And so we ask the browser object to return a pointer to its IWebBrowser2 object, and then we use the IWebBrowser2 object’s VTable to call a few of its functions to position the embedded browser object.

And that’s the essence of embedding a browser object in our window. We haven’t yet displayed a Web page. We have another function we can call to do that (after we’re finished with EmbedBrowserObject). We can call DisplayHTMLPage() to display a URL or HTML file on disk. What we do in DisplayHTMLPage is very similar to what we do in EmbedBrowserObject. We use the browser object’s QueryInterface() to grab pointers to other objects associated with it, and use the VTables of those other objects to call their functions to display a URL or HTML file on disk. Again, you can consult the MSDN documentation to learn more about the objects we’re asking for and their functions we’re calling.

There is one more thing to note about what we do in EmbedBrowserObject. You’ll notice at the end that we call the IWebBrowser2 object’s Release function. When you’re done using an object, you should always call its Release function. This frees up any resources that the object may have allocated as a result of you asking for a pointer to it. For example, if the object itself was dynamically allocated when you asked for a pointer to it, it will free itself when you call its Release function. Failure to follow this rule could result in memory leaks. Of course, after you Release an object, that pointer to it is no longer valid. You’ll have to ask for the pointer again if you need it. And then you’ll need to Release it again. See how that works? Now you know why we don’t Release the browser object itself until we’re finally done with its pointer. (In other words, we don’t call its Release function in EmbedBrowserObject. Instead, we defer that until later—when we’re finally done using the browser object.)
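Here is a sketch of the Release rule for an object that was dynamically allocated. The object frees itself when its last reference is released, which is why a pointer must not be used after you Release it. THING, hypothetical_get_thing(), and the destroyed flag are inventions for this illustration (the flag exists only so the demo can observe the moment of destruction). Real objects follow the same pattern but with their own bookkeeping.

```c
#include <assert.h>
#include <stdlib.h>

struct THING;   /* forward declaration */

struct THING_VTBL {
    long          (*QueryInterface)(struct THING *This, const void *riid, void **ppvObj);
    unsigned long (*AddRef)(struct THING *This);
    unsigned long (*Release)(struct THING *This);
};

struct THING {
    const struct THING_VTBL *lpVtbl;
    unsigned long refs;
    int *destroyedFlag;   /* demo only: lets the caller see when the object dies */
};

static long Thing_QueryInterface(struct THING *This, const void *riid, void **ppvObj)
{
    (void)riid;           /* placeholder: a real object inspects riid */
    *ppvObj = This;
    ++This->refs;
    return 0;
}

static unsigned long Thing_AddRef(struct THING *This) { return ++This->refs; }

static unsigned long Thing_Release(struct THING *This)
{
    unsigned long left = --This->refs;
    if (left == 0) {
        *This->destroyedFlag = 1;
        free(This);       /* the object frees itself on the last Release */
    }
    return left;
}

static const struct THING_VTBL ThingTable = {
    Thing_QueryInterface, Thing_AddRef, Thing_Release
};

/* Stand-in for whatever call hands you an object; it starts with one reference. */
static struct THING *hypothetical_get_thing(int *destroyedFlag)
{
    struct THING *t = malloc(sizeof *t);
    if (!t) return NULL;
    t->lpVtbl = &ThingTable;
    t->refs = 1;
    t->destroyedFlag = destroyedFlag;
    return t;
}

/* Returns 1 if the object survived until, and only until, its last Release. */
int demo_release(void)
{
    int destroyed = 0;
    int aliveAfterFirstRelease;
    struct THING *t = hypothetical_get_thing(&destroyed);

    if (!t) return 0;
    t->lpVtbl->AddRef(t);                          /* refs is now 2 */
    t->lpVtbl->Release(t);                         /* refs is now 1 */
    aliveAfterFirstRelease = !destroyed;
    t->lpVtbl->Release(t);                         /* refs is now 0; object is freed */
    t = NULL;                                      /* the pointer is no longer valid */
    return aliveAfterFirstRelease && destroyed;
}
```

Setting the pointer to NULL after the final Release is just a habit of mine, not something COM requires, but it turns an accidental later use into an obvious crash instead of a subtle one.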

In fact, you can create several browser objects if desired, for example, if you wanted several windows—each hosting its own browser object so that each window could display its own Web page. In fact, CWebPage.c creates two windows that each host a browser object. (So we call EmbedBrowserObject once for each window.) In one window, we call DisplayHTMLPage to display Microsoft’s Web page. In another window, we call DisplayHTMLStr() to display some HTML string in memory.

Indeed, after we’ve embedded a browser object, we can call DisplayHTMLPage or DisplayHTMLStr repeatedly to change what is being displayed.

When you’re finally done with the browser object, you need to Release it to free any resources it used. We do that in UnEmbedBrowserObject(). Of course, this needs to be done only once, so we do it right when the window is being destroyed. And we need to call OleUninitialize() before our program exits.

The Example Code

CWebPage.c is a complete C example with everything in one source file. Study this to familiarize yourself with the technique of using the browser object in your own window.

The directory DLL contains a DLL that has the functions EmbedBrowserObject, UnEmbedBrowserObject, DisplayHTMLPage, and DisplayHTMLStr in it. The DLL also contains all of the IStorage, IOleInPlaceFrame, IOleClientSite, and IOleInPlaceSite VTables and their functions. The DLL also calls OleInitialize and OleUninitialize on your behalf. So, to use this DLL, you don’t need to put any OLE/COM coding in your C program at all. It’s all in the DLL instead. And there is a small example called Example.c that uses the DLL. It’s just CWebPage.c with all the OLE/COM stuff ripped out of it and replaced with calls to use the DLL.

Improvements that could be made to the example are to add an object to sink events so that you can prevent the context menu from popping up. Also, the browser object’s display area should be resized when a window receives a WM_SIZE message.

Downloads

Download source – 48 Kb
