Determining Whether a DLL or EXE Is a Managed Component

Introduction

This article was an afterthought. A question came up in the CodeGuru forums about how to determine whether a given DLL/EXE was built as managed code and, in addition, which .NET Framework version it relies on. All this programmatically, of course. Given a DLL, I could guess whether it is a COM DLL (by querying for exports such as DllRegisterServer and DllGetClassObject). This question was much more intriguing because, unlike other kinds of DLLs/EXEs, managed code has to run within the .NET runtime environment, so the loader has to know well in advance that it is dealing with a managed component and do the necessary work to prepare the managed code to execute. This article is a byproduct of the quest for an answer to that question and to several others that came up in the process.

The question itself can be broken up into two parts:

  1. How do we know, given a file, that it is a .NET managed component?
  2. Given a .NET managed component, how do you know what other assemblies (indirectly the framework version) it is dependent on?

Let me deal with each of these questions in that order.

Details

This is how it all started. The first thing I attempted was to look at what the dependent DLLs were. I opened a sample managed DLL in depends.exe (Dependency Walker). It shows just one import from one DLL: _CorDllMain from MSCOREE.dll for DLLs, and _CorExeMain from MSCOREE.dll for EXEs. What about all the code for the classes the assembly exposes? If you are familiar with .NET, you will know that all of this is put into the assembly as metadata, and the runtime uses this metadata for the actual action. So, apart from that single import, there are no exports and no imports whatsoever. Surely, the loader is doing something. The DLL has all the information, but it is not exposed in the traditional way (exported functions). There is a level of indirection, and there should be some way to get to the metadata.

The next thing that came to my mind was Microsoft's PE file format. Having played with it some time before and having seen the loads of information in there, I decided to take a peek again at the PE structures to see whether there was anything that could take me in the right direction. (Pretty much all the basic PE file format structures are "document"ed in WinNT.H.) I opened this file and, sure enough, there was some information that could possibly be it. It was called the CLR 2.0 header structure. A little research on the Internet gave me the confidence that I was headed in the right direction.

PE Header

So, basically this is it. There is a PE structure like the one below; it has headers followed by sections of data. The PE headers include the NT headers (IMAGE_NT_HEADERS in WinNT.h), which encapsulate the optional header (IMAGE_OPTIONAL_HEADER in WinNT.h). The optional header has an array of IMAGE_DATA_DIRECTORY entries called DataDirectory. Each IMAGE_DATA_DIRECTORY entry points to the location of its data within the PE file (called a module when loaded). The entry you are interested in is at index IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR. This entry is what points to the .NET information. If this entry exists and its VirtualAddress field points to a valid area within one of the sections that follow, the component is a managed component.
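
The header walk described above can also be sketched without WinNT.h, reading the fields straight out of a byte buffer. This is only an illustrative sketch: hasClrDirectory and the two read helpers are hypothetical names, the numeric offsets are the standard PE32/PE32+ field offsets (index 14 is IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR), and, unlike the check described above, it does not verify that the address falls inside a section.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Bounds-checked little-endian reads; return false when out of range.
static bool readU16(const Bytes& b, std::size_t off, std::uint16_t& out) {
    if (off > b.size() || b.size() - off < 2) return false;
    out = static_cast<std::uint16_t>(b[off] | (b[off + 1] << 8));
    return true;
}

static bool readU32(const Bytes& b, std::size_t off, std::uint32_t& out) {
    if (off > b.size() || b.size() - off < 4) return false;
    out = static_cast<std::uint32_t>(b[off]) |
          (static_cast<std::uint32_t>(b[off + 1]) << 8) |
          (static_cast<std::uint32_t>(b[off + 2]) << 16) |
          (static_cast<std::uint32_t>(b[off + 3]) << 24);
    return true;
}

// Hypothetical helper: true when the image carries a non-empty
// COM descriptor (CLR header) data directory entry.
bool hasClrDirectory(const Bytes& image) {
    std::uint16_t mz = 0;
    if (!readU16(image, 0, mz) || mz != 0x5A4D) return false;     // "MZ"

    std::uint32_t peOffset = 0;                                   // e_lfanew
    if (!readU32(image, 0x3C, peOffset)) return false;

    std::uint32_t signature = 0;
    if (!readU32(image, peOffset, signature) || signature != 0x00004550)
        return false;                                             // "PE\0\0"

    // The optional header follows the 4-byte signature and the
    // 20-byte file header.
    const std::size_t opt = static_cast<std::size_t>(peOffset) + 4 + 20;
    std::uint16_t magic = 0;
    if (!readU16(image, opt, magic)) return false;

    std::size_t countOffset = 0, directoryBase = 0;
    if (magic == 0x10B) {            // PE32
        countOffset = opt + 92;
        directoryBase = opt + 96;
    } else if (magic == 0x20B) {     // PE32+
        countOffset = opt + 108;
        directoryBase = opt + 112;
    } else {
        return false;
    }

    const std::uint32_t kComDescriptor = 14;
    std::uint32_t count = 0;                      // NumberOfRvaAndSizes
    if (!readU32(image, countOffset, count) || count <= kComDescriptor)
        return false;

    // Each IMAGE_DATA_DIRECTORY entry is 8 bytes: VirtualAddress, Size.
    const std::size_t entry = directoryBase + kComDescriptor * 8;
    std::uint32_t rva = 0, size = 0;
    if (!readU32(image, entry, rva) || !readU32(image, entry + 4, size))
        return false;
    return rva != 0 && size != 0;
}
```

Reading the optional header's magic value first lets the same routine handle both 32-bit and 64-bit images, because the data directory sits at a different offset in each.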

The following image shows the basic structure of how the metadata can be reached.

All this can be translated into the following code. Given a path to the file, it returns TRUE if the file is a managed component and FALSE otherwise.

BOOL IsManaged(LPTSTR lpszImageName)
{
   BOOL bIsManaged = FALSE;    //variable that indicates whether
                               //managed or not.
   TCHAR szPath[MAX_PATH];     //for convenience

   HANDLE hFile = CreateFile(lpszImageName, GENERIC_READ,
                             FILE_SHARE_READ,NULL,OPEN_EXISTING,
                             FILE_ATTRIBUTE_NORMAL,NULL);

   //attempt the standard paths (Windows dir and system dir) if
   //CreateFile failed in the first place.
   if(INVALID_HANDLE_VALUE == hFile)
   {
      //try to locate in Windows directory
      GetWindowsDirectory(szPath,MAX_PATH);
      _tcscat(szPath,_T("\\"));
      _tcscat(szPath,lpszImageName);
      hFile = CreateFile(szPath, GENERIC_READ, FILE_SHARE_READ,
                         NULL,OPEN_EXISTING,FILE_ATTRIBUTE_NORMAL,
                         NULL);
   }

   if(INVALID_HANDLE_VALUE == hFile)
   {
      //try to locate in system directory
      GetSystemDirectory(szPath,MAX_PATH);
      _tcscat(szPath,_T("\\"));
      _tcscat(szPath,lpszImageName);
      hFile = CreateFile(szPath, GENERIC_READ, FILE_SHARE_READ,
                         NULL,OPEN_EXISTING,FILE_ATTRIBUTE_NORMAL,
                         NULL);
   }

   if(INVALID_HANDLE_VALUE != hFile)
   {
      //succeeded
      HANDLE hOpenFileMapping = CreateFileMapping(hFile,NULL,
                                                  PAGE_READONLY,0,
                                                  0,NULL);
      if(hOpenFileMapping)
      {
         BYTE* lpBaseAddress = NULL;

         //Map the file, so it can be simply be acted on as a
         //contiguous array of bytes
         lpBaseAddress = (BYTE*)MapViewOfFile(hOpenFileMapping,
                                              FILE_MAP_READ,0,0,0);

         if(lpBaseAddress)
         {
            //having mapped the executable, now start navigating
            //through the sections

            //DOS header is straightforward. It is the topmost
            //structure in the PE file
            //i.e. the one at the lowest offset into the file
            IMAGE_DOS_HEADER* pDOSHeader =
               (IMAGE_DOS_HEADER*)lpBaseAddress;

            //the only important data in the DOS header is the
            //e_lfanew
            //the e_lfanew points to the offset of the beginning
            //of NT Headers data
            IMAGE_NT_HEADERS* pNTHeaders =
               (IMAGE_NT_HEADERS*)((BYTE*)pDOSHeader +
               pDOSHeader->e_lfanew);

            //store the section header for future use. This will
            //later be needed to check whether the metadata lies
            //within the area indicated by the section headers.
            //IMAGE_FIRST_SECTION honors SizeOfOptionalHeader rather
            //than assuming sizeof(IMAGE_NT_HEADERS)
            IMAGE_SECTION_HEADER* pSectionHeader =
               IMAGE_FIRST_SECTION(pNTHeaders);

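            //Now, start parsing
            //First of all check if it is a PE file. All assemblies
            //are PE files, so both the DOS ("MZ") and NT ("PE")
            //signatures must be present.
            if(pDOSHeader->e_magic == IMAGE_DOS_SIGNATURE &&
               pNTHeaders->Signature == IMAGE_NT_SIGNATURE)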
            {
               //start parsing COM table (this is what points to
               //the metadata and other information)
               DWORD dwNETHeaderTableLocation =
                  pNTHeaders->OptionalHeader.DataDirectory
                  [IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR].
                  VirtualAddress;

               if(dwNETHeaderTableLocation)
               {
                  //.NET header data does exist for this module;
                  //find its location in one of the sections.
                  //GetActualAddressFromRVA is a helper from the
                  //attached sample; it is assumed here to return 0
                  //when the RVA lies outside every section
                  DWORD dwNETHeaderOffset =
                     GetActualAddressFromRVA(pSectionHeader,
                     pNTHeaders,dwNETHeaderTableLocation);

                  if(dwNETHeaderOffset)
                  {
                     //valid offset obtained. Suffice it to say,
                     //this is good enough to identify this as a
                     //valid managed component
                     bIsManaged = TRUE;
                  }
               }
            }
            else
            {
               cout << "Not PE file\r\n";
            }
            //cleanup
            UnmapViewOfFile(lpBaseAddress);
         }
         //cleanup
         CloseHandle(hOpenFileMapping);
      }
      //cleanup
      CloseHandle(hFile);
   }
   return bIsManaged;
}
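
The listing above calls GetActualAddressFromRVA, which is not shown in the article. The following is a minimal sketch of what such a helper might look like, not the code from the attached sample. It is needed because the file is mapped as a flat file, so a relative virtual address (RVA) has to be translated into a file offset through the section table. SectionInfo and rvaToFileOffset are hypothetical names; SectionInfo stands in for the VirtualAddress, SizeOfRawData, and PointerToRawData fields of IMAGE_SECTION_HEADER, and returning 0 for "not found" is an assumption.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the IMAGE_SECTION_HEADER fields used here.
struct SectionInfo {
    std::uint32_t virtualAddress;    // RVA of the section once loaded
    std::uint32_t sizeOfRawData;     // bytes the section occupies on disk
    std::uint32_t pointerToRawData;  // file offset of the section's data
};

// Translate an RVA into a file offset by finding the section that
// contains it. Returns 0 when no section covers the RVA (assumed
// convention; offset 0 is the DOS header and never valid section data).
std::uint32_t rvaToFileOffset(const std::vector<SectionInfo>& sections,
                              std::uint32_t rva) {
    for (const SectionInfo& s : sections) {
        // Only bytes actually present in the file can be read from a
        // flat mapping, so the raw size is used as the upper bound.
        if (rva >= s.virtualAddress &&
            rva - s.virtualAddress < s.sizeOfRawData) {
            return s.pointerToRawData + (rva - s.virtualAddress);
        }
    }
    return 0;
}
```

For example, with a section at RVA 0x2000 whose raw data starts at file offset 0x600, an RVA of 0x2008 maps to file offset 0x608.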

That answers the first question.

My search for the answer to Question 2 led me to one of Matt Pietrek's articles, listed in the References section (Avoiding DLL Hell: Introducing Application Metadata in the Microsoft .NET Framework). The gist is this: Even though you can get to the raw metadata directly, Microsoft hasn't exposed the structure of this metadata itself, as it has done for the PE file. Instead, Microsoft chose to provide APIs to access (get/set) the necessary information. This is what compiler vendors would eventually use as well to produce .NET-compliant assemblies. Isn't that neat?

The idea is this. First, create an instance of the CLSID_CorMetaDataDispenser COM component and get a pointer to its IMetaDataDispenser interface (IID_IMetaDataDispenser). Once you have this interface, you pass it the source file to read by using the OpenScope method. What you are interested in is the assembly information, so you request the IMetaDataAssemblyImport interface (IID_IMetaDataAssemblyImport). Once you have this, you enumerate the assemblies referenced by this module by using IMetaDataAssemblyImport::EnumAssemblyRefs. This call, if successful, returns an array of tokens. Each token then has to be passed to IMetaDataAssemblyImport::GetAssemblyRefProps to get the properties of that particular dependent assembly. For your needs, the version information is available in the ASSEMBLYMETADATA structure and the name of the assembly is returned in the szName parameter. For those parameters you aren't interested in, you simply pass in NULL. The code snippet below does all this, given a filename.

void PrintAssemblyInfo(LPTSTR lpszImageName)
{
   CoInitialize(NULL);
   //get the dependent assemblies
   IMetaDataDispenser* pIMetaDataDispenser = NULL;
   HRESULT hr = CoCreateInstance(CLSID_CorMetaDataDispenser, 0,
                                 CLSCTX_INPROC_SERVER,
                                 IID_IMetaDataDispenser,
                                 (LPVOID *)&pIMetaDataDispenser );
   if(SUCCEEDED(hr))
   {
      IMetaDataAssemblyImport* pIMetaDataAssemblyImport = NULL;
      hr = pIMetaDataDispenser->OpenScope(lpszImageName,ofRead,
         IID_IMetaDataAssemblyImport,(LPUNKNOWN *)
         &pIMetaDataAssemblyImport );
      if(SUCCEEDED(hr))
      {
         mdAssemblyRef   assembyRefs[5]= {0};
         ULONG numTokensOut = 0;
         ULONG numTokensIn = 5;
         HCORENUM hCoreEnum = NULL;

         //do this in a loop until we get S_FALSE
         while(1)
         {
            hr = pIMetaDataAssemblyImport->
               EnumAssemblyRefs(&hCoreEnum,assembyRefs,numTokensIn,
                                &numTokensOut);

            if(SUCCEEDED(hr))
            {
               //now enumerate
               for(ULONG j = 0 ; j < numTokensOut; j++)
               {
                  ASSEMBLYMETADATA  aData;
                  memset(&aData,0,sizeof(ASSEMBLYMETADATA));

                  WCHAR   szName[MAX_PATH];
                  char szAssemblyName[MAX_PATH];

                  hr = pIMetaDataAssemblyImport->
                     GetAssemblyRefProps(assembyRefs[j],NULL,NULL,
                                         szName,MAX_PATH,NULL,
                                         &aData,NULL,NULL,NULL);
                  if(SUCCEEDED(hr))
                  {
                     //pass the full size of the destination buffer
                     WideCharToMultiByte(CP_ACP, 0, szName, -1,
                                         szAssemblyName,
                                         sizeof(szAssemblyName),
                                         NULL, NULL);
                     //assembly versions read
                     //major.minor.build.revision
                     cout << "dependent on " << szAssemblyName
                          << " of version " << aData.usMajorVersion
                          << "." << aData.usMinorVersion << "."
                          << aData.usBuildNumber << "."
                          << aData.usRevisionNumber << endl;
                  }
               }
            }
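            //stop on failure, or once the enumerator is exhausted
            //(S_FALSE with no tokens returned); without the FAILED
            //check a failing call could loop forever
            if(FAILED(hr) || 0 == numTokensOut)
               break;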
         }
         if(hCoreEnum)
            pIMetaDataAssemblyImport->CloseEnum(hCoreEnum);
         pIMetaDataAssemblyImport->Release();
      }

      pIMetaDataDispenser->Release();
   }

   CoUninitialize();
}

That takes care of Question 2.
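
The original question also asked about the framework version, which the listing above reveals only indirectly through the version of the referenced mscorlib. A rough sketch of that last step follows. frameworkFromMscorlib is a hypothetical helper, not part of the attached sample, and the mapping covers only the well-known mscorlib versions (1.0.3300.0, 1.0.5000.0, 2.0.0.0, and 4.0.0.0); treat anything else as unknown.

```cpp
#include <string>

// Hypothetical helper: guess the .NET Framework generation from the
// version of the mscorlib assembly a component references.
// The arguments correspond to usMajorVersion, usMinorVersion, and
// usBuildNumber of the ASSEMBLYMETADATA returned for mscorlib.
std::string frameworkFromMscorlib(unsigned major, unsigned minor,
                                  unsigned build) {
    if (major == 1 && minor == 0 && build == 3300)
        return ".NET Framework 1.0";
    if (major == 1 && minor == 0 && build == 5000)
        return ".NET Framework 1.1";
    if (major == 2)
        return ".NET Framework 2.0 to 3.5 (CLR 2.0)";
    if (major == 4)
        return ".NET Framework 4.x (CLR 4)";
    return "unknown";
}
```

Because .NET Framework 3.0 and 3.5 still run on CLR 2.0 and reference mscorlib 2.0.0.0, the mscorlib version alone cannot tell those releases apart; the other referenced assemblies would have to be inspected as well.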

Conclusion

There is a lot of information that can be obtained by using the IMetaDataXXX interfaces. You could do pretty much everything ILDASM does by using these COM interfaces, if you know what to look for. Thanks to Tischnoetentoet of CodeGuru for indirectly unraveling this for me. I hope this article will do the same for others who read it.

The attached sample is a simple console application that, when run from the command line and passed a file name, prints whether the file is managed or not. If it is managed, it also prints the dependent assemblies and their version numbers to standard output. The usage is shown below:

IsManaged.exe c:\windows\system32\winnt.dll

will print out:

Not a managed component

Likewise,

IsManaged.exe C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\
                 System.XML.dll

will print out:

This managed component is dependent on mscorlib of version 2.0.0.0
This managed component is dependent on System of version 2.0.0.0
This managed component is dependent on System.Data.SqlXml of
   version 2.0.0.0
This managed component is dependent on System.Configuration of
   version 2.0.0.0

References


