Method Call Interception (MCI) in C++

Introduction

Method Call Interception (MCI) is the technique of intercepting method calls and performing specified extra operations before, instead of, or after the called method. MCI, while not the same as Aspect Oriented Programming (AOP), is the most common technique used to implement AOP. As such, MCI is often used for

  1. Tracing
  2. Code Profiling
  3. Transaction management
  4. Thread safety (locking)

By adopting MCI, one can refactor code by separating core business logic from infrastructure goop. There are several interesting articles / blogs / papers on AOP and MCI, a few of which are

  1. Semantics of method call interception (http://homepages.cwi.nl/~ralf/smci/)
  2. XEROX Parc SDA (http://www2.parc.com/csl/groups/sda/publications.shtml)
  3. AOP != Interception (http://www.neward.net/ted/weblog/index.jsp?date=20030107)
  4. AOSD Homepage (www.aosd.net)
  5. AOP with .Net (http://www.c-sharpcorner.com/Code/2002/Nov/aop.asp)

All these articles / blogs / papers, while discussing AOP and MCI implementations, do so in the context of interpreted languages supported by beefy runtimes. This is because implementing MCI for an interpreted language is trivial (interpreted code is usually loaded into data pages, and can be rewritten at run time to include forwarding calls).

However, when it comes to compiled code from a programming language like C++, things are not so easy, because compiled code is loaded into code pages, which cannot normally be written to. So, if a call is to be intercepted, the source code must be changed to include a call before and after the method. This technique is discussed in "Aspect Oriented Programming and C++" at

(http://www.ddj.com/documents/s=9220/ddj0408h/0408h.html).

While this method is definitely straightforward, it requires that you modify existing code in order to fit in MCI, which I feel violates one of the basic principles of AOP: core logic code should not be mixed with infrastructure goop.
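To make the objection concrete, here is a minimal sketch of the source-modification approach. It is not taken from the referenced article; CallGuard, CallLog and AddOrder are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical log that stands in for a tracer, profiler or lock manager.
std::vector<std::string>& CallLog() {
    static std::vector<std::string> log;
    return log;
}

// RAII guard: the constructor runs the "before" code and the destructor
// runs the "after" code when the enclosing function returns.
class CallGuard {
public:
    explicit CallGuard(const char* name) : name_(name) {
        assert(name_ != nullptr);
        CallLog().push_back(std::string("enter ") + name_);
    }
    ~CallGuard() { CallLog().push_back(std::string("leave ") + name_); }
    CallGuard(const CallGuard&) = delete;
    CallGuard& operator=(const CallGuard&) = delete;

private:
    const char* name_;
};

// Core logic that had to be edited by hand to add the guard line.
int AddOrder(int quantity, int unitPrice) {
    CallGuard guard("AddOrder");  // infrastructure code inside business logic
    return quantity * unitPrice;
}
```

Every function that is to be intercepted needs its own guard line, which is exactly the mixing of core logic and infrastructure that the rest of this article tries to avoid.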

An alternate Solution

An alternate approach is to make use of compiler-specific extensions / switches; in the case of VC++, the /Gh and /GH switches. More information on these switches can be found at

(http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vccore/html/_core_.2f.gh.asp) and

(http://msdn2.microsoft.com/en-US/library/xc11y76y.aspx)

The approach, basically, is to have the compiler bake a call to _penter into the prologue of every function and a call to _pexit into the epilogue. Within these two functions, we use the symbol debugging APIs in dbghelp.dll to get details about the function, and then forward them to any interested listeners. (dbghelp.dll is provided by Microsoft as part of the Debugging Tools for Windows. This is available at

http://www.microsoft.com/whdc/devtools/debugging/default.mspx

for free)
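As a rough sketch of what such a hook can look like (this is not the code from the download): with /Gh, 32-bit VC++ calls a user-supplied _penter at the start of every function, and the hook has to be a naked function that preserves all registers. OnFunctionEnter, g_lastAddress and g_enterCount are hypothetical names. The assembly part only applies to 32-bit MSVC and is compiled out elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical bookkeeping; a real interceptor would notify listeners here.
static std::uintptr_t g_lastAddress = 0;
static std::size_t g_enterCount = 0;

// Plain C++ handler reached from the hook. returnAddress points just past
// the compiler-inserted "call _penter" inside the instrumented function.
extern "C" void OnFunctionEnter(void* returnAddress) {
    assert(returnAddress != nullptr);
    g_lastAddress = reinterpret_cast<std::uintptr_t>(returnAddress);
    ++g_enterCount;
}

#if defined(_MSC_VER) && defined(_M_IX86)
// Naked: the compiler emits no prologue or epilogue, so the hook itself
// must save and restore every register it touches.
extern "C" void __declspec(naked) __cdecl _penter(void) {
    __asm {
        pushad                  // save the 8 general registers (32 bytes)
        mov  eax, [esp + 32]    // return address pushed by "call _penter"
        push eax
        call OnFunctionEnter    // cdecl: the caller removes the argument
        add  esp, 4
        popad                   // restore registers, including eax
        ret
    }
}
#endif
```

A _pexit hook for /GH would follow the same pattern. The source file that contains the hook and its handler should itself be built without /Gh and /GH; otherwise the handler would call back into the hook recursively.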

Problem Statement

Quickly summarizing our objectives - Given a particular application source (core logic code), intercept all method calls in the core logic code (with an interceptor), and forward details of the call to some arbitrary code (any interested listeners).

The problem can be split into four parts

  1. Locating and loading interested listeners
  2. Intercepting method calls
  3. Retrieving information about the called method
  4. Passing this information on to any interested listeners.

Overview of the proposed Solution

There are three main entities

  1. Core Code - The main application source
  2. Interceptor - The mechanism to intercept method calls
  3. Listener - The observer, who is to be notified of method calls

  1. Core Code - The core code need not be modified. It will, however, need to be rebuilt with the /Gh and /GH switches and linked dynamically to the _penter and _pexit functions exported from the Interceptor. (The /GH switch is not available in MS VC++ 6.0, so my solution places a thunk that makes the function return to the _pexit function, after which it returns to its original return address.)
  2. Interceptor - This can be thought of as instrumentation code, which is baked into the Core Code. There are two exported functions, _penter and _pexit, which are called from the prologue / epilogue of every method in the Core Code. The Interceptor is also responsible for locating and loading interested listeners, and for notifying them within the _penter and _pexit functions.
  3. Listener - This is the observer (if you were considering our solution from the viewpoint of the Observer pattern). There can be any number of listeners. They are located and loaded in a chain by the Interceptor. All notifications are offered to all the loaded listeners, which can then perform appropriate actions.
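The relationship between the Interceptor and its listeners can be sketched as a plain Observer chain. The article names the IListener interface but this sketch guesses at its shape: the OnEnter / OnExit methods, CountingListener and ListenerChain are all assumptions made for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Assumed shape of the listener interface (hypothetical method names).
struct IListener {
    virtual ~IListener() = default;
    virtual void OnEnter(const std::string& method) = 0;
    virtual void OnExit(const std::string& method) = 0;
};

// Example listener that only counts the notifications it receives.
struct CountingListener : IListener {
    int enters = 0;
    int exits = 0;
    void OnEnter(const std::string&) override { ++enters; }
    void OnExit(const std::string&) override { ++exits; }
};

// The chain offers every notification to every registered listener.
class ListenerChain {
public:
    void Add(std::shared_ptr<IListener> listener) {
        assert(listener != nullptr);
        listeners_.push_back(std::move(listener));
    }
    void NotifyEnter(const std::string& method) const {
        for (const auto& listener : listeners_) listener->OnEnter(method);
    }
    void NotifyExit(const std::string& method) const {
        for (const auto& listener : listeners_) listener->OnExit(method);
    }

private:
    std::vector<std::shared_ptr<IListener>> listeners_;
};
```

In this arrangement the hooks would call NotifyEnter and NotifyExit, and the Core Code never refers to any listener type.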

Implementation

There are 3 projects divided into two solutions

  1. App
    1. CoreApp
    2. Interceptor
  2. Listener
    1. Listener0

  1. CoreApp is a basic MFC based dialog application. It is compiled with the /Gh flag.
  2. Interceptor is a Win32 DLL. It exports a method _penter. When loaded, this DLL searches for interested listeners (named listener%d.dll in the current folder). If it finds any listeners, it loads them, locates a known factory method (MCICreateListener) in each, and uses it to create an instance of IListener for each thread spawned within Test_Stub.
  3. Listener0 is a Win32 Dll. It exports two methods MCICreateListener and MCIFreeListener. It also has an implementation of IListener (CLogListener), which when notified of a method, logs it into a .CSV file.
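The listener discovery described above might look roughly like the following. This is a sketch under assumptions, not the downloadable code: it assumes the DLLs are numbered contiguously from 0, that MCICreateListener takes no arguments and returns an IListener pointer, and that it is exported under its undecorated name. ListenerFileName, FindListeners, LoadFactory and CreateListenerFn are hypothetical names.

```cpp
#include <cassert>
#include <cstdio>
#include <functional>
#include <string>
#include <vector>
#ifdef _WIN32
#include <windows.h>
#endif

// Builds "listener<N>.dll", following the listener%d.dll naming pattern.
std::string ListenerFileName(int index) {
    assert(index >= 0);
    char buffer[32];
    std::snprintf(buffer, sizeof(buffer), "listener%d.dll", index);
    return buffer;
}

// Probes listener0.dll, listener1.dll, ... and stops at the first miss.
// tryLoad decides whether a given file could be loaded.
std::vector<std::string> FindListeners(
        const std::function<bool(const std::string&)>& tryLoad,
        int maxCount = 16) {
    std::vector<std::string> found;
    for (int i = 0; i < maxCount; ++i) {
        const std::string name = ListenerFileName(i);
        if (!tryLoad(name)) break;
        found.push_back(name);
    }
    return found;
}

#ifdef _WIN32
struct IListener;                          // opaque in this sketch
typedef IListener* (*CreateListenerFn)();  // assumed factory signature

// Loads one DLL and looks up the factory export; nullptr on failure.
CreateListenerFn LoadFactory(const std::string& fileName) {
    HMODULE module = ::LoadLibraryA(fileName.c_str());
    if (module == nullptr) return nullptr;
    CreateListenerFn factory = reinterpret_cast<CreateListenerFn>(
        ::GetProcAddress(module, "MCICreateListener"));
    if (factory == nullptr) ::FreeLibrary(module);
    return factory;
}
#endif
```

On Windows, passing a lambda that calls LoadFactory and checks the result for nullptr as tryLoad would tie the two halves together.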

The interceptor is the most complicated module. Its basic functionalities are

  1. Locate all interested listeners, when it is loaded.
  2. Whenever a thread is created in CoreApp, create a separate handler for it and bind it into the thread's TLS.
  3. Each thread handler, on creation, creates a list of IListener objects, based on the listeners loaded during start up.
  4. Whenever a method is called in CoreApp, _penter is called. _penter in turn forwards it to the handler associated with that particular thread. The handler then iterates over all subscribed listeners for that thread, and informs them of the method.
  5. Similarly after a method executes, _pexit is called, which proceeds along the same notifications as described above.
  6. _penter makes use of the Symxxx methods present in dbghelp.dll, to load module and method names.
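The per-thread handler in steps 2 to 5 can be sketched with C++11 thread_local, used here as a stand-in for a Win32 TLS slot. ThreadHandler and CurrentHandler are hypothetical names, and listener notification is reduced to a call stack and a counter to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-thread handler: tracks the calls seen on one thread.
class ThreadHandler {
public:
    void OnEnter(const std::string& method) {
        stack_.push_back(method);
        ++calls_;
    }
    void OnExit() {
        assert(!stack_.empty());  // every exit must match an earlier enter
        stack_.pop_back();
    }
    std::size_t Depth() const { return stack_.size(); }
    std::size_t Calls() const { return calls_; }

private:
    std::vector<std::string> stack_;
    std::size_t calls_ = 0;
};

// Each thread lazily gets its own handler on first use, so the hooks
// need no locking to reach their thread's state.
ThreadHandler& CurrentHandler() {
    thread_local ThreadHandler handler;
    return handler;
}
```

The hooks would then call CurrentHandler().OnEnter(name) and CurrentHandler().OnExit(), and the handler would forward to its own list of listeners.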

Problems with this approach

  1. As mentioned before, the /Gh and /GH switches are compiler specific (MS VC++ 7 and above; /Gh is present in MS VC++ 6, but /GH is not).
  2. Some amount of processor specific code needs to be written in the interceptor.
  3. Function information is retrieved using debug symbols. And the debug file format is proprietary (and heavy!). This in turn adversely affects performance.
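For reference, a symbol lookup of the kind item 3 refers to can be sketched as below. The article does not say which dbghelp functions it calls, so the use of SymInitialize and SymFromAddr is an assumption; LookupSymbol and FormatAddress are hypothetical names. The dbghelp part is compiled only with MSVC, and other compilers get the hex fallback.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>
#if defined(_MSC_VER)
#include <windows.h>
#include <dbghelp.h>
#pragma comment(lib, "dbghelp.lib")
#endif

// Fallback text when no symbol is available: the raw address in hex.
std::string FormatAddress(std::uintptr_t address) {
    char buffer[2 + 2 * sizeof(address) + 1];
    std::snprintf(buffer, sizeof(buffer), "0x%llx",
                  static_cast<unsigned long long>(address));
    return buffer;
}

// Returns the name of the function containing address, or the hex form.
// DbgHelp functions are single threaded, so real code must serialize
// calls to this function.
std::string LookupSymbol(std::uintptr_t address) {
    assert(address != 0);
#if defined(_MSC_VER)
    // Load the symbol tables once for the current process.
    static const bool ready =
        ::SymInitialize(::GetCurrentProcess(), nullptr, TRUE) != FALSE;
    if (ready) {
        // SYMBOL_INFO ends in a variable-length name buffer.
        alignas(SYMBOL_INFO) char storage[sizeof(SYMBOL_INFO) + MAX_SYM_NAME] = {};
        SYMBOL_INFO* info = reinterpret_cast<SYMBOL_INFO*>(storage);
        info->SizeOfStruct = sizeof(SYMBOL_INFO);
        info->MaxNameLen = MAX_SYM_NAME;
        DWORD64 displacement = 0;
        if (::SymFromAddr(::GetCurrentProcess(), address, &displacement, info)) {
            return info->Name;
        }
    }
#endif
    return FormatAddress(address);
}
```

Each such lookup walks the debug information, which is why doing it on every function entry is expensive.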

Note:

  1. The code provided is NOT production grade code. The code (and the approach) is only meant for illustrative purposes.
  2. The debug symbols generated by the MS VC++ 7.1 compiler require version 6.x or higher of dbghelp.dll. This is available as part of the debugging tools package provided by Microsoft. dbghelp.dll is usually loaded from the system32 folder. If you already have an older version of dbghelp.dll there, do NOT overwrite it. Instead, copy the newer version into your application path. If you are using MS VC++ 6, version 5.x is sufficient.
  3. On a development machine, consider downloading symbols for your particular OS. Using the online Microsoft symbol server can be very slow at times.



About the Author

Raghupathy Srinivasan

just a run of the mill code monkey aspiring to be a grease monkey. i write articles infrequently and blog on http://quixver.blogspot.com/
