How does _ATL_DEBUG_INTERFACES work?

Have you ever wondered how this #define actually works? I did too, so I set out to research its behavior, and this document is the result. For the uninitiated: _ATL_DEBUG_INTERFACES is a #define introduced in VC 6.0 (ATL 3.0). Add it to your stdafx.h before including atlbase.h and you get traces of the QueryInterface calls and the reference counts on your object's interfaces.

If I had to describe how this works in one paragraph, I would say something like this: "It works by introducing an interceptor class around the actual class, so that all calls to the three IUnknown methods land in the interceptor, which then forwards them to the actual class."

Now that I have spilled the beans, let me tell you how it actually does this. If you have read the "COM programmer's Cookbook" article on MSDN, you already know half the story.

Let me begin from the beginning.

First of all, let me tell you about a struct buried in atlbase.h called _QIThunk. You need to know about it because it is the one that actually traces the reference counting for you.

_QIThunk:

_QIThunk is a C++ struct that defines a few member variables, its own implementations of QueryInterface(), AddRef() and Release(), and placeholder methods f3 through f1024, one for each remaining vtable slot. This is the core class that actually traces the reference counting. It looks like this:

#ifdef _ATL_DEBUG_INTERFACES
struct _QIThunk
{
	STDMETHOD(QueryInterface)(REFIID iid, void** pp)
	{
		ATLASSERT(m_dwRef >= 0);
		return pUnk->QueryInterface(iid, pp);
	}
	STDMETHOD_(ULONG, AddRef)()
	{
		if (bBreak)
			DebugBreak();
		pUnk->AddRef();
		return InternalAddRef();
	}
	ULONG InternalAddRef()
	{
		if (bBreak)
			DebugBreak();
		ATLASSERT(m_dwRef >= 0);
		long l = InterlockedIncrement(&m_dwRef);
		ATLTRACE(_T("%d >  "), m_dwRef);
		AtlDumpIID(iid, lpszClassName, S_OK);
		if (l > m_dwMaxRef)
			m_dwMaxRef = l;
		return l;
	}
	STDMETHOD_(ULONG, Release)();

	STDMETHOD(f3)();
	STDMETHOD(f4)();
	STDMETHOD(f5)();
	.
	.
	STDMETHOD(f1024)();
	_QIThunk(IUnknown* pOrig, LPCTSTR p, const IID& i, UINT n, bool b)
	{
		lpszClassName = p;
		iid = i;
		nIndex = n;
		m_dwRef = 0;
		m_dwMaxRef = 0;
		pUnk = pOrig;
		bBreak = b;
		bNonAddRefThunk = false;
	}
	IUnknown* pUnk;
	long m_dwRef;
	long m_dwMaxRef;
	LPCTSTR lpszClassName;
	IID iid;
	UINT nIndex;
	bool bBreak;
	bool bNonAddRefThunk;
	void Dump()
	{
		TCHAR buf[256];
		if (m_dwRef != 0)
		{
			wsprintf(buf, _T("INTERFACE LEAK: RefCount = %d, MaxRefCount = %d, {Allocation = %d} "), m_dwRef, m_dwMaxRef, nIndex);
			OutputDebugString(buf);
			AtlDumpIID(iid, lpszClassName, S_OK);
		}
		else
		{
			wsprintf(buf, _T("NonAddRef Thunk LEAK: {Allocation = %d}\n"), nIndex);
			OutputDebugString(buf);
		}
	}
};
#endif

#ifdef _ATL_DEBUG_INTERFACES
inline ULONG _QIThunk::Release()
{
	if (bBreak)
		DebugBreak();
	ATLASSERT(m_dwRef > 0);
	ULONG l = InterlockedDecrement(&m_dwRef);
	ATLTRACE(_T("%d < "), m_dwRef);
	AtlDumpIID(iid, lpszClassName, S_OK);
	pUnk->Release();
	if (l == 0 && !bNonAddRefThunk)
		_pModule->DeleteThunk(this);
	return l;
}
inline static void atlBadThunkCall()
{
	ATLASSERT(FALSE && "Call through deleted thunk");
}
#define IMPL_THUNK(n)\
__declspec(naked) inline HRESULT _QIThunk::f##n()\
{\
	__asm mov eax, [esp+4]\
	__asm cmp dword ptr [eax+8], 0\
	__asm jg goodref\
	__asm call atlBadThunkCall\
	__asm goodref:\
	__asm mov eax, [esp+4]\
	__asm mov eax, dword ptr [eax+4]\
	__asm mov [esp+4], eax\
	__asm mov eax, dword ptr [eax]\
	__asm mov eax, dword ptr [eax+4*n]\
	__asm jmp eax\
}

IMPL_THUNK(3)
IMPL_THUNK(4)
IMPL_THUNK(5)
.
.
IMPL_THUNK(1024)

#endif

	

You can see from the code that QueryInterface() simply delegates to the contained interface pointer. AddRef() and Release() do more than delegate: they maintain their own reference counter, m_dwRef, and use it to put out those neat little traces on the interface's AddRef and Release calls. _QIThunk's constructor takes five arguments.

 

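Argument           What is it
IUnknown* pOrig    The actual interface whose reference count we are interested in.
LPCTSTR p          The name shown in the traces. It is stored in lpszClassName, so it is typically the class name (e.g. CFoo).
const IID& i       The IID of the interface.
UINT n             The index of this thunk (nIndex), reported as {Allocation = n} by Dump().
bool b             Whether AddRef() and Release() should call DebugBreak().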

The constructor simply copies these values into the struct's own member variables and initializes its internal counters.

QueryInterface() delegates the call to the contained IUnknown*.

AddRef() first calls the contained IUnknown*'s AddRef() and then its own InternalAddRef(), which increments the counter m_dwRef and outputs the value to the debug window.

Release() decrements the internal reference count and outputs the result to the debug window. After that, it calls the contained IUnknown*'s Release(). If the thunk's reference count has dropped to zero, the thunk removes itself from the list maintained by CComModule.
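To make the interceptor idea concrete, here is a minimal, portable sketch of a counting wrapper. It is not the ATL code: IUnknownLike, RealObject, CountingThunk and demo_counting_thunk are hypothetical names invented so the example builds without the Windows SDK, plain increments stand in for InterlockedIncrement, and the self-deletion through CComModule is left out.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for IUnknown (assumption: not the real COM interface).
struct IUnknownLike {
    virtual long QueryInterface(int iid, void** pp) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
    virtual ~IUnknownLike() {}
};

// The "actual class": a trivial object that keeps its own reference count.
struct RealObject : IUnknownLike {
    unsigned long refs = 0;
    long QueryInterface(int, void** pp) override { *pp = this; AddRef(); return 0; }
    unsigned long AddRef() override { return ++refs; }
    unsigned long Release() override { return --refs; }
};

// Interceptor modeled on _QIThunk: it keeps a private count and traces it.
struct CountingThunk : IUnknownLike {
    IUnknownLike* pUnk;   // the wrapped object
    const char* name;     // label printed in the trace
    long m_dwRef = 0;     // references taken through this thunk
    long m_dwMaxRef = 0;  // highest value m_dwRef ever reached

    CountingThunk(IUnknownLike* p, const char* n) : pUnk(p), name(n) {}

    long QueryInterface(int iid, void** pp) override {
        return pUnk->QueryInterface(iid, pp);  // pure delegation
    }
    unsigned long AddRef() override {
        pUnk->AddRef();                        // forward first
        long l = ++m_dwRef;                    // then count locally
        std::printf("%ld> %s\n", l, name);
        if (l > m_dwMaxRef) m_dwMaxRef = l;
        return static_cast<unsigned long>(l);
    }
    unsigned long Release() override {
        assert(m_dwRef > 0);                   // mirrors ATLASSERT(m_dwRef > 0)
        long l = --m_dwRef;
        std::printf("%ld< %s\n", l, name);
        pUnk->Release();                       // forward after tracing
        return static_cast<unsigned long>(l);
    }
};

// Uses the thunk the way a client would and returns the peak count it saw.
long demo_counting_thunk() {
    RealObject obj;
    CountingThunk thunk(&obj, "CFoo");
    IUnknownLike* p = &thunk;
    p->AddRef();
    p->AddRef();
    p->Release();
    p->Release();
    assert(obj.refs == 0 && thunk.m_dwRef == 0);  // both counts are balanced
    return thunk.m_dwMaxRef;
}
```

Calling demo_counting_thunk() from main() prints one trace line per AddRef and Release. The client only ever holds the thunk pointer, which is why the thunk can observe every reference taken on that interface.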

Now comes the fun part.

You can see that the basic IUnknown services are covered by the above three functions. But what about the rest of the functions actually provided by the interface? For example, IClassFactory exposes the CreateInstance() and LockServer() methods. How do these get called? First of all, the thunk assumes that no interface needs a vtable slot beyond number 1024. So it declares STDMETHOD(f3)() through STDMETHOD(f1024)(), with slots 0 to 2 being QueryInterface, AddRef and Release. It also assumes that the methods are virtual functions that use the __stdcall calling convention and return HRESULT, which is what COM interface methods normally do anyway. The implementation of these placeholders is in the IMPL_THUNK macro.

The macro looks like this:


#define IMPL_THUNK(n)\
__declspec(naked) inline HRESULT _QIThunk::f##n()\
{\
	__asm mov eax, [esp+4]\
	__asm cmp dword ptr [eax+8], 0\
	__asm jg goodref\
	__asm call atlBadThunkCall\
	__asm goodref:\
	__asm mov eax, [esp+4]\
	__asm mov eax, dword ptr [eax+4]\
	__asm mov [esp+4], eax\
	__asm mov eax, dword ptr [eax]\
	__asm mov eax, dword ptr [eax+4*n]\
	__asm jmp eax\
}

__declspec(naked) tells the compiler to generate no prolog or epilog code for the function, so stack management is handled by the function body itself. In other words, the compiler leaves this function alone as far as setting up and cleaning up the stack is concerned. See the compiler documentation for more details on naked functions and calling conventions.

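Within the function, it first checks that the thunk is still alive. The "this" pointer is loaded from the stack and the dword at this+8, which is m_dwRef, is compared with zero. If the count is not greater than zero, the call is going through a thunk that has already been released, and atlBadThunkCall() asserts.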



	__asm mov eax, [esp+4]\
	__asm cmp dword ptr [eax+8], 0\
	__asm jg goodref\
	__asm call atlBadThunkCall\
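The offsets used in this check follow from how a 32-bit compiler lays out the struct: the vtable pointer comes first, then the members in declaration order. The sketch below spells that out. ThunkLayout32 and thunk_is_live are hypothetical names, and uint32_t fields stand in for the 4-byte pointers so the offsets hold on any platform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of the first three fields of _QIThunk in a 32-bit build.
// Assumption: pointers are 4 bytes wide and the vtable pointer sits at offset 0.
struct ThunkLayout32 {
    std::uint32_t vptr;     // [eax+0]  pointer to _QIThunk's own vtable
    std::uint32_t pUnk;     // [eax+4]  the wrapped IUnknown*
    std::int32_t  m_dwRef;  // [eax+8]  the thunk's private reference count
};

static_assert(offsetof(ThunkLayout32, pUnk) == 4, "pUnk is read from [eax+4]");
static_assert(offsetof(ThunkLayout32, m_dwRef) == 8, "m_dwRef is compared at [eax+8]");

// The guard "cmp dword ptr [eax+8], 0 / jg goodref" written out in C++.
bool thunk_is_live(const ThunkLayout32& t) {
    return t.m_dwRef > 0;
}
```

Because the assembly hard-codes these offsets, reordering the members of _QIThunk or building for a different pointer size would break the thunk.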

It copies the contents of [stack pointer + 4] into the eax register. On entry to a __stdcall method the return address sits at [esp], so [esp+4] holds the "this" pointer.


	__asm mov eax, [esp+4]\

Then it gets the this->pUnk value (the contents of memory at this+4) and writes it back to the "this" slot on the stack. In effect, the argument that pointed to the _QIThunk now points to pUnk.


	__asm mov eax, dword ptr [eax+4]\
	__asm mov [esp+4], eax\

Then the address of the target function is looked up. The first dword of the object is its vtable pointer, and the entry for method n lives at vtable + 4*n. So eax now holds the right function pointer, and a jmp to it lets the actual function call proceed normally and return directly to the original caller.


	__asm mov eax, dword ptr [eax]\
	__asm mov eax, dword ptr [eax+4*n]\
	__asm jmp eax\

So this is how it jumps from _QIThunk::fXX() to the right function provided by your object. It also means your application might not work under this #define if an interface has methods, [local] methods for example, that do not follow the expected calling convention and signature.
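The same three steps (check the count, swap "this", jump through slot n) can be modeled in portable C++ with an explicit table of function pointers. This is only a sketch under simplifying assumptions: Object, Thunk, Counter, forward and call_through_thunk are hypothetical names, and every method is given one fixed signature. The real macro needs no signature at all, because it leaves the arguments on the stack untouched and jumps.

```cpp
#include <cassert>

// Hypothetical C-style model of a COM object: the first word of the object
// points to a table of function pointers, and each method receives the
// object as its first argument (where __stdcall places "this").
struct Object;
typedef long (*Method)(Object* self);

struct Object {
    const Method* vtbl;
};

// Thunk with the same leading layout the asm relies on: vtable, pUnk, count.
struct Thunk {
    Object base;   // must be first so Object* and Thunk* are interchangeable
    Object* pUnk;  // the wrapped object
    long m_dwRef;  // the thunk's private reference count
};

// Rough C++ equivalent of the body generated by IMPL_THUNK(N).
template <int N>
long forward(Object* self) {
    Thunk* thunk = reinterpret_cast<Thunk*>(self);
    assert(thunk->m_dwRef > 0);  // cmp dword ptr [eax+8], 0 / jg goodref
    Object* real = thunk->pUnk;  // mov eax, [eax+4] / mov [esp+4], eax
    return real->vtbl[N](real);  // mov eax, [eax] / mov eax, [eax+4*n] / jmp eax
}

// A real object with two methods in slots 3 and 4.
struct Counter {
    Object base;
    long value;
};

long unused_slot(Object*) { return -1; }
long add_three(Object* self) { return reinterpret_cast<Counter*>(self)->value + 3; }
long times_ten(Object* self) { return reinterpret_cast<Counter*>(self)->value * 10; }

const Method kCounterVtbl[] = { unused_slot, unused_slot, unused_slot, add_three, times_ten };
const Method kThunkVtbl[]   = { unused_slot, unused_slot, unused_slot, forward<3>, forward<4> };

// Calls the given slot through a thunk that wraps a Counter holding `value`.
long call_through_thunk(long value, int slot) {
    Counter real = { { kCounterVtbl }, value };
    Thunk thunk = { { kThunkVtbl }, &real.base, 1 };
    Object* p = &thunk.base;     // the caller only ever sees the thunk
    return p->vtbl[slot](p);
}
```

The caller dispatches through the thunk's table, yet the method that finally runs receives the real object as its "this", exactly as if the thunk were not there.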

Now that you know what _QIThunk is, let's go ahead and see how it is used.

Usage:

_QIThunk is essentially managed by CComModule, the omnipresent handler of all things good and bad in ATL <g>.

A CComModule object is created that stays around for the lifetime of the server. It has a lot of functions, including Init(), the place where it all begins, and some functions to manage thunks (AddThunk, DeleteThunk, etc.). Within Init(), a simple array of _QIThunk* is created. It uses CSimpleArray, an internal ATL class that stores simple data types.


	HRESULT AddThunk(IUnknown** pp, LPCTSTR lpsz, REFIID iid)
	HRESULT AddNonAddRefThunk(IUnknown* p, LPCTSTR lpsz, IUnknown** pThunkRet)
	void DeleteNonAddRefThunk(IUnknown* pUnk)
	void DeleteThunk(_QIThunk* p)
	bool DumpLeakedThunks()

are the functions exposed by CComModule to manage QIThunks.

AddThunk: This function is called from CComObjectRootBase::InternalQueryInterface() as well as from CComAggObject and CComPolyObject, so almost every QueryInterface ends up here. It creates a new _QIThunk structure, initializes it with the right IUnknown* pointer and adds it to the array of _QIThunk pointers. It also modifies the IUnknown** to point to the newly created _QIThunk. The thunk is ultimately deleted either by an explicit call to DeleteThunk() or by the thunk itself: when its reference count drops to zero, _QIThunk::Release() calls _pModule->DeleteThunk(this).

AddNonAddRefThunk: This function is used internally by the BEGIN_COM_MAP macro to return a pointer that has not been AddRef'ed. It does not modify the actual IUnknown*; instead it gives the thunk back through the pThunkRet argument. This kind of thunk is deleted with a call to DeleteNonAddRefThunk().

DeleteNonAddRefThunk: This function searches the array for the pUnk passed in and deletes the matching entry.

DeleteThunk: This function deletes the _QIThunk pointer passed in.

DumpLeakedThunks: This function goes through the list of _QIThunks (maintained in CComModule) and dumps the result to the debug window.
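The bookkeeping behind these functions can be pictured as a small registry. The sketch below is a simplified model, not the CComModule code: ThunkRecord, ThunkRegistry and demo_registry_leak are hypothetical names, std::vector stands in for CSimpleArray, locking is omitted, and the meaning of the bool returned by DumpLeakedThunks() here (true when something leaked) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical, simplified thunk record; the real _QIThunk also carries the
// IID and the forwarding vtable slots.
struct ThunkRecord {
    const void* pUnk;   // the wrapped interface pointer
    const char* name;   // label printed in reports
    unsigned nIndex;    // allocation index, starting at 1
    long m_dwRef;       // outstanding references
};

// Hypothetical stand-in for the thunk list that CComModule maintains.
class ThunkRegistry {
public:
    ThunkRegistry() = default;
    ThunkRegistry(const ThunkRegistry&) = delete;
    ThunkRegistry& operator=(const ThunkRegistry&) = delete;
    ~ThunkRegistry() { for (ThunkRecord* t : thunks_) delete t; }

    // Creates a record for pUnk, stores it and hands it back to the caller.
    ThunkRecord* AddThunk(const void* pUnk, const char* name) {
        ThunkRecord* t = new ThunkRecord{pUnk, name, ++nextIndex_, 1};
        thunks_.push_back(t);
        return t;
    }
    // Removes and frees one record; unknown pointers are ignored.
    void DeleteThunk(ThunkRecord* t) {
        for (std::size_t i = 0; i < thunks_.size(); ++i) {
            if (thunks_[i] == t) {
                delete t;
                thunks_.erase(thunks_.begin() + static_cast<std::ptrdiff_t>(i));
                return;
            }
        }
    }
    // Reports every record still registered; returns true if any remain.
    bool DumpLeakedThunks() const {
        for (const ThunkRecord* t : thunks_)
            std::printf("INTERFACE LEAK: RefCount = %ld, {Allocation = %u} %s\n",
                        t->m_dwRef, t->nIndex, t->name);
        return !thunks_.empty();
    }

private:
    std::vector<ThunkRecord*> thunks_;
    unsigned nextIndex_ = 0;
};

// Registers two thunks, releases only one and reports whether a leak remains.
bool demo_registry_leak() {
    ThunkRegistry reg;
    int a = 0, b = 0;
    ThunkRecord* ta = reg.AddThunk(&a, "CFoo");
    reg.AddThunk(&b, "CBar");  // never deleted: this one "leaks"
    reg.DeleteThunk(ta);
    return reg.DumpLeakedThunks();
}
```

Anything still in the list when the module shuts down is, by construction, an interface that was handed out and never fully released, which is what the leak report relies on.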

You never call these functions directly; they are internal to the ATL code itself. There are some fun things you can do with this code, though.

1. You can set the m_nIndexBreakAt member of your CComModule object to break execution at a particular QI. For example, to break on the 3rd query to your object, override CComModule::Init() and set m_nIndexBreakAt = 3.

2. You can get the IUnknown pointers that are currently alive (that is, asked for at least once) and their attributes, such as the readable name and the IID of the interface, from the simple array.

3. You can use the CSimpleArray and CSimpleMap classes in your own projects. They are really lightweight. They lack a lot of what STL provides, but they will suffice if you don't need the complexity.

4. You can use this patching mechanism in your own classes if you want to do some validation before a call is made.
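The break-at-index idea from item 1 can be sketched as a counter compared against a target value. This is a model under assumptions, not the ATL source: BreakPolicy, NextThunkShouldBreak and first_breaking_allocation are hypothetical names, m_nIndexQI is assumed to be the running thunk counter, and a value of 0 is assumed to mean "never break". Since each thunk stores its index in nIndex and Dump() prints it as {Allocation = n}, the number from a leak report is the natural candidate for m_nIndexBreakAt.

```cpp
#include <cassert>

// Hypothetical model of the allocation counter kept alongside the thunk list.
struct BreakPolicy {
    unsigned m_nIndexQI = 0;       // thunks handed out so far (assumed name)
    unsigned m_nIndexBreakAt = 0;  // allocation to break on; 0 means never

    // Called once per new thunk; the result plays the role of the bBreak flag.
    bool NextThunkShouldBreak() {
        ++m_nIndexQI;
        return m_nIndexBreakAt != 0 && m_nIndexQI == m_nIndexBreakAt;
    }
};

// Simulates `total` thunk allocations and returns the index that would
// trigger the break, or 0 if none does.
unsigned first_breaking_allocation(unsigned breakAt, unsigned total) {
    BreakPolicy policy;
    policy.m_nIndexBreakAt = breakAt;
    for (unsigned i = 1; i <= total; ++i) {
        if (policy.NextThunkShouldBreak())
            return i;
    }
    return 0;
}
```

In the real struct the flag is passed to the _QIThunk constructor as bBreak, and AddRef() and Release() call DebugBreak() whenever it is set.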


