Safe file name comparison

Did you ever need to know whether two given file names point to the same file in the filesystem?
Since Windows supports long file names, every file may have two valid names: a short (traditional 8.3) name and a long file name. OK, one could convert the long file name into a short one and compare the two names, but that leads to two problems:

o How does one determine which of the two names is the short one and which is the long one?
o There is no API-function that will convert a long file name into a short one. There are (of course) some descriptions of how one can do this, but that's it.
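
To illustrate the problem: a plain string comparison cannot tell that a short and a long name refer to the same file, even when case is ignored. The sketch below is only an illustration. EqualPathStrings is a hypothetical helper, and the pair of names is made up (PROGRA~1 is a typical 8.3 alias of "Program Files", but that is not guaranteed on any given system).

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: case-insensitive comparison of two path strings.
// Handles ASCII only; it knows nothing about the filesystem.
inline bool EqualPathStrings(const std::string& a, const std::string& b) {
    if (a.size() != b.size())
        return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i])))
            return false;
    }
    return true;
}
```

EqualPathStrings("C:\\Temp\\a.txt", "c:\\temp\\A.TXT") is true, but EqualPathStrings("C:\\PROGRA~1\\app.exe", "C:\\Program Files\\app.exe") is false, although both names may well denote the same file.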

Fortunately there is another solution. The Win95/NT shell deals with PIDLs (pointers to item identifier lists) instead. Every file in the filesystem has one unique PIDL, so the idea for comparing two file names is to compare their PIDLs.
The downside is that this requires some knowledge of the (awkward) COM interface mechanism...

Here is how it works:
#include "stdafx.h"
#include <shlobj.h>

// this function compares the PIDLs of two file names.
// NOTE that you cannot compare the names directly (strcmp() for
// instance), because one of the names might appear in
// long file name format and the other in short file name
// format.
// The function returns TRUE, if <pszPath1> and <pszPath2>
// name the same file in the filesystem, otherwise FALSE.
// <pszPath1> and <pszPath2> shall be absolute pathnames ...
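BOOL CompareFilenames( LPCSTR pszPath1, LPCSTR pszPath2 ) {
	VERIFY(pszPath1 != 0);
	VERIFY(pszPath2 != 0);

	BOOL bRet = FALSE;

	// every successful CoInitialize() must be balanced by CoUninitialize()
	HRESULT hInit = CoInitialize(0);

	LPSHELLFOLDER pDesktopFolder = 0;

	if( SUCCEEDED( SHGetDesktopFolder(&pDesktopFolder)) ) {
		// COM-interface always needs unicode strings ...
		OLECHAR	olePath1[MAX_PATH], olePath2[MAX_PATH];
		LPITEMIDLIST pidl1 = 0, pidl2 = 0;
		DWORD dwAttr = 0;
		DWORD dummy = 0;

		// convert both names; a return value of 0 means the conversion failed
		if( MultiByteToWideChar(CP_ACP, MB_PRECOMPOSED, pszPath1, -1, olePath1, MAX_PATH) &&
			MultiByteToWideChar(CP_ACP, MB_PRECOMPOSED, pszPath2, -1, olePath2, MAX_PATH) &&
			// retrieve PIDLs
			SUCCEEDED(pDesktopFolder->ParseDisplayName(0, 0, olePath1, &dummy, &pidl1, &dwAttr)) &&
			SUCCEEDED(pDesktopFolder->ParseDisplayName(0, 0, olePath2, &dummy, &pidl2, &dwAttr)) ) {

			// now we can compare the PIDLs; a result code of 0 means "same item"
			HRESULT hRes = pDesktopFolder->CompareIDs(0, pidl1, pidl2);
			if( SUCCEEDED(hRes) && HRESULT_CODE(hRes) == 0 )
				bRet = TRUE;
		}

		// free the PIDLs (do not forget this !) ...
		// Free() accepts a null pointer, so this also covers the case
		// where only the first ParseDisplayName() call succeeded.
		LPMALLOC pMalloc = 0;
		if( SUCCEEDED(SHGetMalloc(&pMalloc)) ) {
			pMalloc->Free(pidl1);
			pMalloc->Free(pidl2);
			pMalloc->Release();
		}

		pDesktopFolder->Release();
	}

	if( SUCCEEDED(hInit) )
		CoUninitialize();

	return bRet;
}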
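
A note on the result of CompareIDs(): the shell documentation describes the CODE field of the returned HRESULT as a signed 16-bit value. It is negative if the first item sorts before the second, positive if it sorts after, and zero if both identify the same item. The following minimal sketch decodes that field without any Windows headers. OrderFromCompareIDs is a hypothetical helper name, not a shell API, and the HRESULT is passed as a plain 32-bit integer.

```cpp
#include <cstdint>

// Hypothetical helper (not part of the shell API): extracts the ordering
// that IShellFolder::CompareIDs() packs into the low 16 bits of its HRESULT.
// Returns <0, 0 or >0, much like strcmp().
inline int OrderFromCompareIDs(std::int32_t hr) {
    // The CODE field is the low word of the HRESULT.
    const std::uint32_t low = static_cast<std::uint32_t>(hr) & 0xFFFFu;
    // Interpret the low word as a signed 16-bit number.
    return (low >= 0x8000u) ? static_cast<int>(low) - 0x10000
                            : static_cast<int>(low);
}
```

For the comparison in the article only the zero case matters, which is why a test for a CODE of 0 is enough there.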


  • Yet another (easier?) way...

    Posted by Legacy on 11/17/1998 12:00am

    Originally posted by: Ken Sutherland

    Doesn't this routine do the same thing without all the nasty COM
    stuff? Is there anything wrong with this routine? Are there any
    cases in which it will fail (like networked drives, etc.)?

    BOOL // TRUE if both paths point to the same file.
    CompareFilenames( // NOTE: the function name is missing from the post; this one is assumed
        LPCSTR pszPath1, // a file path
        LPCSTR pszPath2 ) // another file path
    {
        char pszFullPath1[_MAX_PATH];
        char pszFullPath2[_MAX_PATH];
        char *pFilePart1 = NULL;
        char *pFilePart2 = NULL;

        GetFullPathName(pszPath1, _MAX_PATH, pszFullPath1, &pFilePart1);
        GetFullPathName(pszPath2, _MAX_PATH, pszFullPath2, &pFilePart2);

        BOOL bRet = FALSE;
        if(strcmp(pszFullPath1, pszFullPath2) == 0)
            bRet = TRUE;
        return bRet;
    }

  • Another way to distinguish files on Win32

    Posted by Legacy on 11/11/1998 12:00am

    Originally posted by: Josh Baudhuin

    ... that is, without actually comparing them byte-for-byte.
    You can also compare the data structures returned by GetFileInformationByHandle() calls for both files (or rather, for handles to them). The BY_HANDLE_FILE_INFORMATION structure looks like:

    typedef struct _BY_HANDLE_FILE_INFORMATION { // bhfi
        DWORD dwFileAttributes;
        FILETIME ftCreationTime;
        FILETIME ftLastAccessTime;
        FILETIME ftLastWriteTime;
        DWORD dwVolumeSerialNumber;
        DWORD nFileSizeHigh;
        DWORD nFileSizeLow;
        DWORD nNumberOfLinks;
        DWORD nFileIndexHigh;
        DWORD nFileIndexLow;
    } BY_HANDLE_FILE_INFORMATION;

    (from Micro$oft's Platform SDK documentation)

    Just compare the entire structure using memcmp(). The dwVolumeSerialNumber member is a fairly strong guarantee for transparency w/ UNC names, SUBST'd drives, etc.

    One scenario it >might not< handle would be if you had two drives mapped to the same volume by way of different file systems, e.g., NFS and Samba.
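
A narrower variant of this idea, sketched here as an assumption and not as the poster's method: comparing the whole structure with memcmp() also compares fields such as ftLastAccessTime, which could change between the two calls. The Platform SDK documentation describes the volume serial number together with the file index as what identifies a file, so the sketch compares only those three fields. FileIdentity, SameFileIdentity, ToIdentity, OpenForQuery and SameFileOnDisk are hypothetical names; the Win32 part is compiled only on Windows.

```cpp
#include <cassert>
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper type: the fields of BY_HANDLE_FILE_INFORMATION
// that identify a file.
struct FileIdentity {
    std::uint32_t volumeSerial;   // dwVolumeSerialNumber
    std::uint32_t fileIndexHigh;  // nFileIndexHigh
    std::uint32_t fileIndexLow;   // nFileIndexLow
};

// Same file if the volume and both halves of the file index match.
inline bool SameFileIdentity(const FileIdentity& a, const FileIdentity& b) {
    return a.volumeSerial == b.volumeSerial &&
           a.fileIndexHigh == b.fileIndexHigh &&
           a.fileIndexLow == b.fileIndexLow;
}

#ifdef _WIN32
inline FileIdentity ToIdentity(const BY_HANDLE_FILE_INFORMATION& info) {
    FileIdentity id;
    id.volumeSerial  = info.dwVolumeSerialNumber;
    id.fileIndexHigh = info.nFileIndexHigh;
    id.fileIndexLow  = info.nFileIndexLow;
    return id;
}

// Opens a file or directory for metadata queries only (access mask 0).
inline HANDLE OpenForQuery(const char* path) {
    return CreateFileA(path, 0,
                       FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                       NULL, OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
}

// Returns true if both paths can be opened and refer to the same file.
// Both handles stay open until the comparison is done.
inline bool SameFileOnDisk(const char* path1, const char* path2) {
    assert(path1 != NULL && path2 != NULL);
    bool same = false;
    HANDLE h1 = OpenForQuery(path1);
    HANDLE h2 = OpenForQuery(path2);
    BY_HANDLE_FILE_INFORMATION i1, i2;
    if (h1 != INVALID_HANDLE_VALUE && h2 != INVALID_HANDLE_VALUE &&
        GetFileInformationByHandle(h1, &i1) &&
        GetFileInformationByHandle(h2, &i2)) {
        same = SameFileIdentity(ToIdentity(i1), ToIdentity(i2));
    }
    if (h1 != INVALID_HANDLE_VALUE) CloseHandle(h1);
    if (h2 != INVALID_HANDLE_VALUE) CloseHandle(h2);
    return same;
}
#endif
```

Unlike the PIDL approach, this needs both files to exist and to be openable, and the caveat above about volumes reached through different network file systems still applies.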
