Safe file name comparison

Did you ever need to know whether two given file names point to the same file in the filesystem?
Since Windows supports long file names, every file may have two valid names: a short (traditional 8.3) name and a long file name. One could convert the long file name into a short one and compare the two names, but that leads to two problems:

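o How does one determine which of the two names is the short one and which is the long one?
o The Win32 API offers GetShortPathName() for converting a long file name into a short one, but it only works for files that actually exist, and Win95/NT 4 have no API function for the opposite direction. There are (of course) some descriptions of how one can do this by hand, but that's it.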

Fortunately, there is another solution. The Win95/NT shell deals with PIDLs (pointers to item identifier lists) instead. Every file in the filesystem has one unique PIDL, so the idea is to compare two file names by comparing their PIDLs.
The downside is that this requires some knowledge of the (sucking) COM interface mechanism...

Here is how it works:
#include "stdafx.h"
#include <shlobj.h>

// This function compares the PIDLs of two file names.
// NOTE that you cannot compare the names directly (strcmp() for
// instance), because one of the names might appear in
// long file name format and the other in short file name
// format.
// The function returns TRUE if <pszPath1> and <pszPath2>
// name the same file in the filesystem, otherwise FALSE.
// <pszPath1> and <pszPath2> shall be absolute pathnames ...
BOOL CompareFilenames( LPCSTR pszPath1, LPCSTR pszPath2 ) {
	ASSERT(pszPath1 != 0);
	ASSERT(pszPath2 != 0);

	BOOL bRet = FALSE;

	// the shell interfaces need COM; bail out if it cannot be initialized
	if( FAILED(CoInitialize(0)) )
		return FALSE;

	LPSHELLFOLDER pDesktopFolder;

	if( SUCCEEDED( SHGetDesktopFolder(&pDesktopFolder)) ) {
		// COM-interface always needs unicode strings ...
		OLECHAR	olePath1[MAX_PATH], olePath2[MAX_PATH];
		MultiByteToWideChar(CP_ACP, MB_PRECOMPOSED, pszPath1, -1, olePath1, MAX_PATH);
		MultiByteToWideChar(CP_ACP, MB_PRECOMPOSED, pszPath2, -1, olePath2, MAX_PATH);

		// retrieve PIDLs (parse both, so that a single failure cannot leak the other PIDL)
		LPITEMIDLIST pidl1 = 0, pidl2 = 0;
		DWORD dwAttr = 0;
		DWORD dummy;
		HRESULT hRes1 = pDesktopFolder->ParseDisplayName(0, 0, olePath1, &dummy, &pidl1, &dwAttr);
		dwAttr = 0;
		HRESULT hRes2 = pDesktopFolder->ParseDisplayName(0, 0, olePath2, &dummy, &pidl2, &dwAttr);

		if( SUCCEEDED(hRes1) && SUCCEEDED(hRes2) ) {
			// now we can compare the PIDLs; a result code of 0 means "equal"
			HRESULT hRes = pDesktopFolder->CompareIDs(0, pidl1, pidl2);
			if( SUCCEEDED(hRes) && HRESULT_CODE(hRes) == 0 )
				bRet = TRUE;
		}

		// free the PIDLs with the shell's allocator (do not forget this !) ...
		LPMALLOC pMalloc;
		if( SUCCEEDED(SHGetMalloc(&pMalloc)) ) {
			if( pidl1 != 0 )
				pMalloc->Free((void *)pidl1);
			if( pidl2 != 0 )
				pMalloc->Free((void *)pidl2);
			pMalloc->Release();
		}

		pDesktopFolder->Release();
	}

	CoUninitialize();

	return bRet;
}


  • Yet another (easier?) way...

    Posted by Legacy on 11/17/1998 12:00am

    Originally posted by: Ken Sutherland

    Doesn't this routine do the same thing without all the nasty COM
    stuff? Is there anything wrong with this routine? Are there any
    cases in which it will fail (like networked drives, etc.)?

    // NOTE: the function name and braces are missing from the post as
    // archived; "CompareFilenames" is assumed here.
    BOOL CompareFilenames(  // TRUE if both paths point to the same file.
        LPCSTR pszPath1,    // a file path
        LPCSTR pszPath2 )   // another file path
    {
        char pszFullPath1[_MAX_PATH];
        char pszFullPath2[_MAX_PATH];
        char *pFilePart1 = NULL;
        char *pFilePart2 = NULL;

        // expand both paths to fully qualified path names
        GetFullPathName(pszPath1, _MAX_PATH, pszFullPath1, &pFilePart1);
        GetFullPathName(pszPath2, _MAX_PATH, pszFullPath2, &pFilePart2);

        BOOL bRet = FALSE;
        if(strcmp(pszFullPath1, pszFullPath2) == 0)
            bRet = TRUE;
        return bRet;
    }

  • Another way to distinguish files on Win32

    Posted by Legacy on 11/11/1998 12:00am

    Originally posted by: Josh Baudhuin

    ... that is, without actually comparing them byte-for-byte.
    You can also compare the data structures returned by GetFileInformationByHandle() calls for both files (or rather, for handles to them). The BY_HANDLE_FILE_INFORMATION structure looks like:

    typedef struct _BY_HANDLE_FILE_INFORMATION { // bhfi
        DWORD dwFileAttributes;
        FILETIME ftCreationTime;
        FILETIME ftLastAccessTime;
        FILETIME ftLastWriteTime;
        DWORD dwVolumeSerialNumber;
        DWORD nFileSizeHigh;
        DWORD nFileSizeLow;
        DWORD nNumberOfLinks;
        DWORD nFileIndexHigh;
        DWORD nFileIndexLow;
    } BY_HANDLE_FILE_INFORMATION;

    (from Micro$oft's Platform SDK documentation)

    Just compare the entire structure using memcmp(). The dwVolumeSerialNumber member is a fairly strong guarantee for transparency w/ UNC names, SUBST'd drives, etc.

    One scenario it >might not< handle would be if you had two drives mapped to the same volume by way of different file systems, e.g., NFS and Samba.
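
    The suggestion above can be sketched roughly as follows. This is only an illustration under a few assumptions: instead of a memcmp() over the whole structure it compares just the volume serial number and the file index, since the other fields (sizes, timestamps) describe the file's state rather than its identity. The names FileIdentity, SameIdentity and SameFileByHandle are made up for this sketch. The Win32 part is guarded so that the comparison logic also compiles on other platforms.

```cpp
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical type: the fields that identify a file on a volume.
struct FileIdentity {
    std::uint32_t volumeSerial;  // dwVolumeSerialNumber
    std::uint32_t indexHigh;     // nFileIndexHigh
    std::uint32_t indexLow;      // nFileIndexLow
};

// Two identities match only if the volume and both index halves match.
bool SameIdentity(const FileIdentity& a, const FileIdentity& b)
{
    return a.volumeSerial == b.volumeSerial &&
           a.indexHigh == b.indexHigh &&
           a.indexLow == b.indexLow;
}

#ifdef _WIN32
// Fills <id> from an open handle; returns false if the query fails.
static bool QueryIdentity(HANDLE h, FileIdentity& id)
{
    BY_HANDLE_FILE_INFORMATION info;
    if (!GetFileInformationByHandle(h, &info))
        return false;
    id.volumeSerial = info.dwVolumeSerialNumber;
    id.indexHigh = info.nFileIndexHigh;
    id.indexLow = info.nFileIndexLow;
    return true;
}

// Returns true if both paths name the same file. Both handles stay open
// until the comparison is done. Returns false on any error.
bool SameFileByHandle(const char* path1, const char* path2)
{
    const DWORD share = FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE;
    // FILE_FLAG_BACKUP_SEMANTICS also allows directories to be opened.
    HANDLE h1 = CreateFileA(path1, FILE_READ_ATTRIBUTES, share, NULL,
                            OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
    HANDLE h2 = CreateFileA(path2, FILE_READ_ATTRIBUTES, share, NULL,
                            OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);

    bool same = false;
    FileIdentity id1, id2;
    if (h1 != INVALID_HANDLE_VALUE && h2 != INVALID_HANDLE_VALUE &&
        QueryIdentity(h1, id1) && QueryIdentity(h2, id2))
        same = SameIdentity(id1, id2);

    if (h1 != INVALID_HANDLE_VALUE) CloseHandle(h1);
    if (h2 != INVALID_HANDLE_VALUE) CloseHandle(h2);
    return same;
}
#endif
```

    A call such as SameFileByHandle("C:\\PROGRA~1", "C:\\Program Files") would then be expected to report a match on a typical system, although whether a given file system or network redirector reports stable index values is something to verify for the case at hand.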
