How does _ATL_DEBUG_INTERFACES work?

Have you ever wondered how this #define actually works? I did too, so
I set out to research its behavior, and this document is the result.
For the uninitiated: _ATL_DEBUG_INTERFACES is a #define introduced in
VC 6.0 (ATL 3.0). Add it to your stdafx.h before including atlbase.h
and you get traces of the QueryInterface calls and of the reference
counts on your object's interfaces.

If I were to describe how this works in one sentence, I would say
something like this: "It works by placing an interceptor class in front
of the actual interface, so that all calls to the three IUnknown
methods land in the interceptor, which traces them and then forwards
them to the actual object."

Now that I have spilled the beans, let me tell you how it actually does
this. If you have read the "COM Programmer's Cookbook" article on MSDN,
you already know half the story.

Let me begin from the beginning.

First of all, let me tell you about a struct buried in atlbase.h called
_QIThunk. You need to know about it because it is the one that actually
does the work of tracing the reference counts for you.

_QIThunk:

_QIThunk is a C++ struct that defines a few member variables, its own
versions of QueryInterface(), AddRef() and Release(), and placeholder
methods f3 through f1024. This is the core class that actually does the
tracing of reference counts. It looks like this.


#ifdef _ATL_DEBUG_INTERFACES
struct _QIThunk
{
    STDMETHOD(QueryInterface)(REFIID iid, void** pp)
    {
        ATLASSERT(m_dwRef >= 0);
        return pUnk->QueryInterface(iid, pp);
    }
    STDMETHOD_(ULONG, AddRef)()
    {
        if (bBreak)
            DebugBreak();
        pUnk->AddRef();
        return InternalAddRef();
    }
    ULONG InternalAddRef()
    {
        if (bBreak)
            DebugBreak();
        ATLASSERT(m_dwRef >= 0);
        long l = InterlockedIncrement(&m_dwRef);
        ATLTRACE(_T("%d > "), m_dwRef);
        AtlDumpIID(iid, lpszClassName, S_OK);
        if (l > m_dwMaxRef)
            m_dwMaxRef = l;
        return l;
    }
    STDMETHOD_(ULONG, Release)();

    STDMETHOD(f3)();
    STDMETHOD(f4)();
    STDMETHOD(f5)();
    .
    .
    STDMETHOD(f1024)();
    _QIThunk(IUnknown* pOrig, LPCTSTR p, const IID& i, UINT n, bool b)
    {
        lpszClassName = p;
        iid = i;
        nIndex = n;
        m_dwRef = 0;
        m_dwMaxRef = 0;
        pUnk = pOrig;
        bBreak = b;
        bNonAddRefThunk = false;
    }
    IUnknown* pUnk;
    long m_dwRef;
    long m_dwMaxRef;
    LPCTSTR lpszClassName;
    IID iid;
    UINT nIndex;
    bool bBreak;
    bool bNonAddRefThunk;
    void Dump()
    {
        TCHAR buf[256];
        if (m_dwRef != 0)
        {
            wsprintf(buf, _T("INTERFACE LEAK: RefCount = %d, MaxRefCount = %d, {Allocation = %d} "), m_dwRef, m_dwMaxRef, nIndex);
            OutputDebugString(buf);
            AtlDumpIID(iid, lpszClassName, S_OK);
        }
        else
        {
            wsprintf(buf, _T("NonAddRef Thunk LEAK: {Allocation = %d}\n"), nIndex);
            OutputDebugString(buf);
        }
    }
};
#endif

#ifdef _ATL_DEBUG_INTERFACES
inline ULONG _QIThunk::Release()
{
    if (bBreak)
        DebugBreak();
    ATLASSERT(m_dwRef > 0);
    ULONG l = InterlockedDecrement(&m_dwRef);
    ATLTRACE(_T("%d < "), m_dwRef);
    AtlDumpIID(iid, lpszClassName, S_OK);
    pUnk->Release();
    if (l == 0 && !bNonAddRefThunk)
        _pModule->DeleteThunk(this);
    return l;
}
inline static void atlBadThunkCall()
{
    ATLASSERT(FALSE && "Call through deleted thunk");
}
#define IMPL_THUNK(n) \
__declspec(naked) inline HRESULT _QIThunk::f##n() \
{ \
    __asm mov eax, [esp+4] \
    __asm cmp dword ptr [eax+8], 0 \
    __asm jg goodref \
    __asm call atlBadThunkCall \
    __asm goodref: \
    __asm mov eax, [esp+4] \
    __asm mov eax, dword ptr [eax+4] \
    __asm mov [esp+4], eax \
    __asm mov eax, dword ptr [eax] \
    __asm mov eax, dword ptr [eax+4*n] \
    __asm jmp eax \
}

IMPL_THUNK(3)
IMPL_THUNK(4)
IMPL_THUNK(5)
.
.
IMPL_THUNK(1024)

#endif


You can see from the code that QueryInterface() simply delegates to the
contained interface pointer. AddRef() and Release() do more than delegate:
they maintain their own reference counter, m_dwRef, and use it to put out
those neat little traces on the interface's AddRef and Release calls.
_QIThunk's constructor takes five arguments.

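Argument          What it is
----------------  ------------------------------------------------------
IUnknown* pOrig   The actual interface pointer whose reference count we
                  are interested in.
LPCTSTR p         A readable name for the traces. It is stored in
                  lpszClassName, so in practice it is the class name
                  (e.g. CFoo) rather than the interface name.
const IID& i      The IID of the interface.
UINT n            The allocation index of this thunk, stored in nIndex
                  and printed as {Allocation = n} in the leak dump.
bool b            Whether AddRef() and Release() on this thunk should
                  call DebugBreak().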

The constructor simply copies these values into its own member
variables and initializes the remaining ones (m_dwRef and m_dwMaxRef to
0, bNonAddRefThunk to false).

QueryInterface() function basically delegates the call to the contained
IUnknown*.

The AddRef() function first calls the contained IUnknown*'s AddRef()
and then calls its own InternalAddRef() function, which increments the
counter m_dwRef, outputs the new value to the debug window and updates
m_dwMaxRef if a new maximum has been reached.

The Release() function decrements the internal reference count and
outputs the result to the debug window. After that, it calls the
contained IUnknown*'s Release(). If the thunk's own count has dropped
to zero (and the thunk is not a NonAddRef thunk), it asks CComModule to
delete it from the list it maintains.
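
The counting behavior described above can be sketched in portable C++. This is only an illustration, not the ATL code: IRefCounted, Widget and CountingThunk are hypothetical names, plain increments stand in for InterlockedIncrement/InterlockedDecrement (so the sketch is not thread-safe), printf stands in for ATLTRACE, and the self-deletion through CComModule is left out.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for the two counting methods of IUnknown
// (this is not the real COM interface).
struct IRefCounted {
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
    virtual ~IRefCounted() {}
};

// A trivial object that keeps its own lifetime count, as a COM object would.
struct Widget : IRefCounted {
    long refs = 0;
    unsigned long AddRef() override { return static_cast<unsigned long>(++refs); }
    unsigned long Release() override { return static_cast<unsigned long>(--refs); }
};

// Interceptor in the spirit of _QIThunk: it forwards to the real object and
// keeps a separate count plus the highest value that count has reached.
struct CountingThunk : IRefCounted {
    IRefCounted* inner;  // plays the role of pUnk
    const char* name;    // plays the role of lpszClassName
    long ref = 0;        // plays the role of m_dwRef
    long maxRef = 0;     // plays the role of m_dwMaxRef

    CountingThunk(IRefCounted* p, const char* n) : inner(p), name(n) {}

    unsigned long AddRef() override {
        inner->AddRef();                     // the real object counts first
        long l = ++ref;                      // then the thunk counts
        std::printf("%ld> %s\n", l, name);   // trace, like ATLTRACE + AtlDumpIID
        if (l > maxRef) maxRef = l;
        return static_cast<unsigned long>(l);
    }

    unsigned long Release() override {
        assert(ref > 0);
        long l = --ref;                      // the thunk counts first
        std::printf("%ld< %s\n", l, name);
        inner->Release();                    // then the real object
        return static_cast<unsigned long>(l);
    }
};

// Usage: two AddRef calls and one Release leave one outstanding reference.
inline long demoOutstanding() {
    Widget w;
    CountingThunk t(&w, "Widget");
    t.AddRef();
    t.AddRef();
    t.Release();
    return t.ref;
}
```

Because the thunk's count is separate from the object's own count, a nonzero value left in ref at shutdown points at the specific interface pointer that was not released, which is what the leak dump reports.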

Now comes the fun part.

You can see that the basic IUnknown services are covered by the above
three functions. But what about the rest of the functions actually
provided by the interface? For example, IClassFactory exposes the
CreateInstance() and LockServer() methods. How do these get called?
First of all, the thunk assumes that no interface has a vtable slot
beyond index 1024. Slots 0 to 2 are QueryInterface, AddRef and Release,
so it declares STDMETHOD(f3)() through STDMETHOD(f1024)() to fill the
remaining slots. It also assumes that the methods are virtual functions
that use the __stdcall calling convention and return HRESULT, which is
what COM interface methods normally do. The implementation of these
placeholder functions is in the IMPL_THUNK macro.

The macro looks like this:


#define IMPL_THUNK(n) \
__declspec(naked) inline HRESULT _QIThunk::f##n() \
{ \
    __asm mov eax, [esp+4] \
    __asm cmp dword ptr [eax+8], 0 \
    __asm jg goodref \
    __asm call atlBadThunkCall \
    __asm goodref: \
    __asm mov eax, [esp+4] \
    __asm mov eax, dword ptr [eax+4] \
    __asm mov [esp+4], eax \
    __asm mov eax, dword ptr [eax] \
    __asm mov eax, dword ptr [eax+4*n] \
    __asm jmp eax \
}

__declspec(naked) tells the compiler not to generate any prolog or
epilog code for the function, so the function body is responsible for
its own stack handling. That matters here because the thunk must leave
the caller's stack exactly as it found it before jumping to the real
method. See the compiler documentation for more details on naked
functions and calling conventions.

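Within the function, it first checks that the thunk is still alive.
[eax+8] is the thunk's m_dwRef member. If that count is not greater
than zero, the thunk has already been released, and atlBadThunkCall()
asserts with "Call through deleted thunk".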

__asm mov eax, [esp+4]
__asm cmp dword ptr [eax+8], 0
__asm jg goodref
__asm call atlBadThunkCall

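It copies the contents of [stack pointer + 4] into the eax register.
With __stdcall, [esp] holds the return address on entry and [esp+4]
holds the first argument, which for a COM method is the "this" pointer.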


__asm mov eax, [esp+4]

Then it gets the this->pUnk value (the contents of memory at this+4)
and writes that value back over the "this" argument on the stack. In
effect, the argument that was pointing to the _QIThunk now points to
pUnk.


__asm mov eax, dword ptr [eax+4]
__asm mov [esp+4], eax

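Then the address of the target function is looked up. The first
instruction loads the vtable pointer stored at the start of the pUnk
object, and the second loads entry n of that vtable (vtable + 4*n,
since each function pointer is 4 bytes on 32-bit x86). Now eax holds
the address of the right function, and the thunk does a jmp to it so
that the actual function call proceeds as if the caller had invoked it
directly.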


__asm mov eax, dword ptr [eax]
__asm mov eax, dword ptr [eax+4*n]
__asm jmp eax

So, this is how it jumps from _QIThunk::fXX() to the right function
provided by your object. It also means your application might not work
with this #define if your interfaces have methods, such as [local]
methods, that do not follow the expected calling convention and
signature.
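
To make the three steps of the macro easier to follow, here is a rough C++ rendering of the same lookup. It is a sketch under assumptions, not what ATL compiles: Object, Thunk, Method and forward are hypothetical names, the vtable is built by hand as an array of function pointers, named fields replace the hard-coded 32-bit offsets, and an ordinary call replaces the jmp.

```cpp
#include <cassert>

// Hand-built "COM-like" object: its first field points at a table of
// function pointers, which is how the thunk expects a vtable to be laid out.
struct Object;
typedef long (*Method)(Object* self);

struct Object {
    const Method* vtbl;  // offset 0: pointer to the method table
    long value;
};

// Field order mirrors the offsets the assembly relies on:
// table pointer first, then the wrapped object, then the count.
struct Thunk {
    const Method* vtbl;  // [this+0], unused in this sketch
    Object* target;      // [this+4] on 32-bit x86: pUnk
    long ref;            // [this+8] on 32-bit x86: m_dwRef
};

// What IMPL_THUNK(n) does, written as ordinary C++:
// check the count, swap "this" for the wrapped object, invoke slot n.
inline long forward(Thunk* thunk, int n) {
    assert(thunk->ref > 0 && "Call through deleted thunk");
    Object* self = thunk->target;      // mov eax, [eax+4] / mov [esp+4], eax
    const Method* table = self->vtbl;  // mov eax, [eax]
    Method fn = table[n];              // mov eax, [eax+4*n]
    return fn(self);                   // jmp eax (a plain call in this sketch)
}

// Two sample methods used to populate the table.
inline long getValue(Object* self) { return self->value; }
inline long doubleValue(Object* self) { return self->value * 2; }

// Slots 0..2 would hold QueryInterface/AddRef/Release in COM;
// here they are just placeholders so that the real methods start at slot 3.
static const Method kTable[] = { getValue, getValue, getValue, getValue, doubleValue };
```

With an Object whose value is 21 wrapped in a Thunk with ref set to 1, forward(&thunk, 3) reaches getValue and forward(&thunk, 4) reaches doubleValue. The real thunk never needs to know the argument list because it jumps instead of calling, so the caller's arguments are already in place on the stack.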

Now that you know what _QIThunk is, let's go ahead and see how it is
used.

Usage:

_QIThunk is essentially managed by CComModule, the omnipresent handler
of all things good and bad in ATL <g>.

A CComModule object is created that stays around for the lifetime of
the server. This class has a lot of functions, including Init(), the
place where it all begins, and some functions to manage thunks
(AddThunk, DeleteThunk etc.). Within the Init() function a simple array
of _QIThunk* is created. It uses CSimpleArray, an internal ATL class
that just stores simple data types.


HRESULT AddThunk(IUnknown** pp, LPCTSTR lpsz, REFIID iid)
HRESULT AddNonAddRefThunk(IUnknown* p, LPCTSTR lpsz, IUnknown** pThunkRet)
void DeleteNonAddRefThunk(IUnknown* pUnk)
void DeleteThunk(_QIThunk* p)
bool DumpLeakedThunks()

are the functions exposed by CComModule to manage QIThunks.

AddThunk: This function is called from
CComObjectRootBase::InternalQueryInterface() and from CComAggObject and
CComPolyObject. It creates a new _QIThunk structure, initializes it
with the right IUnknown* pointer and adds it to the array of _QIThunk
pointers. It also modifies the IUnknown** to point to the newly created
_QIThunk. The thunk is ultimately deleted either by an explicit call to
DeleteThunk() or by the thunk itself: when its reference count drops to
zero, _QIThunk::Release() calls _pModule->DeleteThunk(this).

AddNonAddRefThunk: This function is used internally by the
BEGIN_COM_MAP macro to return a pointer that has not been AddRef'ed. It
does not modify the actual IUnknown* but hands the thunk back through
the pThunkRet argument. This kind of thunk is deleted with a call to
DeleteNonAddRefThunk().

DeleteNonAddRefThunk: This function searches the array for the pUnk
passed in and deletes the matching entry from the array.

DeleteThunk: This function deletes the _QIThunk pointer passed in.

DumpLeakedThunks: This function goes through the list of _QIThunks
(maintained in CComModule) and dumps the result to the debug window.
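
The bookkeeping these functions perform can be pictured with a small registry. This is a hedged sketch in portable C++, not CComModule: ThunkRecord and ThunkRegistry are hypothetical names, std::vector stands in for CSimpleArray, numbering allocations from 1 is a choice of the sketch, and leaked() returns the allocation numbers instead of printing them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record standing in for _QIThunk; only bookkeeping fields are kept.
struct ThunkRecord {
    const void* wrapped;  // the pointer being tracked (pUnk in the article)
    const char* name;     // readable name for the dump
    unsigned index;       // allocation number, like nIndex
    long ref;             // outstanding references seen through this thunk
};

// Hypothetical registry standing in for the thunk functions on CComModule.
class ThunkRegistry {
public:
    ThunkRegistry() = default;
    ThunkRegistry(const ThunkRegistry&) = delete;
    ThunkRegistry& operator=(const ThunkRegistry&) = delete;
    ~ThunkRegistry() { for (ThunkRecord* r : live_) delete r; }

    // Like AddThunk: create a record, number it and remember it.
    ThunkRecord* add(const void* wrapped, const char* name) {
        ThunkRecord* r = new ThunkRecord{wrapped, name, ++next_, 0};
        live_.push_back(r);
        return r;
    }

    // Like DeleteThunk: forget and free one record.
    void remove(ThunkRecord* r) {
        for (std::size_t i = 0; i < live_.size(); ++i) {
            if (live_[i] == r) {
                live_.erase(live_.begin() + static_cast<std::ptrdiff_t>(i));
                delete r;
                return;
            }
        }
    }

    // Like DumpLeakedThunks: anything still registered counts as a leak.
    std::vector<unsigned> leaked() const {
        std::vector<unsigned> out;
        for (const ThunkRecord* r : live_) out.push_back(r->index);
        return out;
    }

private:
    std::vector<ThunkRecord*> live_;
    unsigned next_ = 0;
};
```

If two records are added and only the first is removed, leaked() returns a single entry with allocation number 2. That number is what makes the {Allocation = n} value in the real leak dump useful: it identifies which query produced the pointer that was never released.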

These functions are never used directly by you. They are internal to
the ATL code itself. There are some fun things you can do with this
code though.

1. You can set the m_nIndexBreakAt member of your CComModule object to
break execution at a particular QI. For example, if you want to break
on the 3rd query to your object, override CComModule::Init() and set
m_nIndexBreakAt = 3. That will break at the 3rd query to your object.

2. You can get the currently available (at least once asked for)
IUnknown pointers and their attributes (like the readable name, iid of
the interface etc) from the simple array.

3. You can use the CSimpleArray and CSimpleMap classes in your own
projects. They are really lightweight. They don't have a lot of the
stuff STL provides, but they will suffice if you don't need the
complexity.

4. You can use this patching mechanism in your own classes if you want
to do some validation before a call is made.
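
As a small illustration of the last point, here is one portable way to run a check before every call. Note that it swaps in a different technique from the thunk: instead of patching vtable slots with assembly, it uses a wrapper with an overloaded operator->. CheckedPtr and Counter are hypothetical names invented for this sketch.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical wrapper: every call made through operator-> is preceded
// by a validity check, much as the thunk checks m_dwRef before forwarding.
template <class T>
class CheckedPtr {
public:
    explicit CheckedPtr(T* p) : p_(p), valid_(p != nullptr) {}

    // Mark the wrapped pointer as no longer usable.
    void invalidate() { valid_ = false; }

    T* operator->() const {
        if (!valid_)
            throw std::logic_error("call through invalidated pointer");
        return p_;
    }

private:
    T* p_;
    bool valid_;
};

// A sample class to call through the wrapper.
struct Counter {
    int n = 0;
    int bump() { return ++n; }
};
```

A call such as p->bump() on a valid CheckedPtr<Counter> goes straight through to the Counter, while the same call after p.invalidate() throws before the method is reached. Unlike the thunk, this wrapper is visible in the caller's types, so it suits your own code rather than interfaces handed out to COM clients.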
