Safe file name comparison

Did you ever need to know whether two given file names point to the same file in the filesystem?
Since Windows supports long file names, every file may have two valid names: a short (traditional 8.3) name and a long file name. OK, one could convert the long file name into a short one and compare the two names, but that leads to two problems:

o How does one determine which of the two names is the short one and which is the long one?
o Converting the names is not a reliable fix. GetShortPathName() converts a long name into a short one, but only for files that exist, and the results can still differ for the same file, for example when it is reached through a SUBST'd drive or a UNC name.

Fortunately there is another solution. The Win95/NT shell deals with PIDLs (pointers to item identifier lists) instead. Every file in the filesystem has one unique PIDL, so the idea is to compare the PIDLs of the two file names rather than the names themselves.
The downside is that this requires some knowledge of the (sucking) COM interface mechanism...

Here is how it works:
#include "stdafx.h"
#include <shlobj.h>

// this function compares the PIDLs of two file names.
// NOTE that you cannot compare the names directly (strcmp() for
// instance), because one of the names might appear in
// long file name format and the other in short file name
// format.
// The function returns TRUE, if <pszPath1> and <pszPath2>
// name the same file in the filesystem, otherwise FALSE.
//
// <pszPath1> and <pszPath2> shall be absolute pathnames ...
BOOL CompareFilenames( LPCSTR pszPath1, LPCSTR pszPath2 ) {
	VERIFY(pszPath1 != 0);
	VERIFY(pszPath2 != 0);

	// only a successful CoInitialize() may be balanced by
	// CoUninitialize(), so remember the result
	HRESULT hrInit = CoInitialize(0);

	BOOL bRet = FALSE;

	LPSHELLFOLDER pDesktopFolder = 0;

	if( SUCCEEDED( SHGetDesktopFolder(&pDesktopFolder)) ) {
		// COM-interface always needs unicode strings ...
		OLECHAR	olePath1[MAX_PATH], olePath2[MAX_PATH];
		int nLen1 = MultiByteToWideChar(CP_ACP, MB_PRECOMPOSED, pszPath1, -1, olePath1, MAX_PATH);
		int nLen2 = MultiByteToWideChar(CP_ACP, MB_PRECOMPOSED, pszPath2, -1, olePath2, MAX_PATH);

		// retrieve PIDLs; both start out as null pointers, so that a
		// PIDL obtained before a later failure can still be freed
		LPITEMIDLIST pidl1 = 0, pidl2 = 0;
		DWORD dwAttr1 = 0, dwAttr2 = 0;	// in/out: no attributes requested
		DWORD dummy = 0;
		if( nLen1 > 0 && nLen2 > 0 &&
			SUCCEEDED(pDesktopFolder->ParseDisplayName(0, 0, olePath1, &dummy, &pidl1, &dwAttr1)) &&
			SUCCEEDED(pDesktopFolder->ParseDisplayName(0, 0, olePath2, &dummy, &pidl2, &dwAttr2)) ) {

			// now we can compare the PIDLs; the code part of the
			// result is zero, if both PIDLs name the same item
			HRESULT hRes = pDesktopFolder->CompareIDs(0, pidl1, pidl2);
			if( SUCCEEDED(hRes) && HRESULT_CODE(hRes) == 0 )
				bRet = TRUE;
		}

		// free the PIDLs (do not forget this !) ...
		LPMALLOC pMalloc = 0;
		if( SUCCEEDED(SHGetMalloc(&pMalloc)) ) {
			pMalloc->Free((void *)pidl1);	// Free() accepts a null pointer
			pMalloc->Free((void *)pidl2);
			pMalloc->Release();
		}
		pDesktopFolder->Release();
	}

	if( SUCCEEDED(hrInit) )
		CoUninitialize();

	return bRet;
}



Comments

  • Yet another (easier?) way...

    Posted by Legacy on 11/17/1998 12:00am

    Originally posted by: Ken Sutherland

    Doesn't this routine do the same thing without all the nasty COM stuff? Is there anything wrong with this routine? Are there any cases in which it will fail (like networked drives, etc.)?

    BOOL                      // TRUE if both paths point to the same file.
    CompareFilenames(
        LPCSTR pszPath1,      // a file path
        LPCSTR pszPath2 )     // another file path
    {
        char pszFullPath1[_MAX_PATH];
        char pszFullPath2[_MAX_PATH];
        char *pFilePart1 = NULL;
        char *pFilePart2 = NULL;

        GetFullPathName(pszPath1, _MAX_PATH, pszFullPath1, &pFilePart1);
        GetFullPathName(pszPath2, _MAX_PATH, pszFullPath2, &pFilePart2);

        BOOL bRet = FALSE;
        if(strcmp(pszFullPath1, pszFullPath2) == 0)
            bRet = TRUE;
        return bRet;
    }

  • Another way to distinguish files on Win32

    Posted by Legacy on 11/11/1998 12:00am

    Originally posted by: Josh Baudhuin

    ... that is, without actually comparing them byte-for-byte.

    You can also compare the data structures returned by GetFileInformationByHandle() calls for both files (or rather handles thereof). The BY_HANDLE_FILE_INFORMATION structure looks like:

    typedef struct _BY_HANDLE_FILE_INFORMATION { // bhfi
        DWORD    dwFileAttributes;
        FILETIME ftCreationTime;
        FILETIME ftLastAccessTime;
        FILETIME ftLastWriteTime;
        DWORD    dwVolumeSerialNumber;
        DWORD    nFileSizeHigh;
        DWORD    nFileSizeLow;
        DWORD    nNumberOfLinks;
        DWORD    nFileIndexHigh;
        DWORD    nFileIndexLow;
    } BY_HANDLE_FILE_INFORMATION;

    (from Micro$oft's Platform SDK documentation)

    Just compare the entire structure using memcmp(). The dwVolumeSerialNumber member is a fairly strong guarantee for transparency w/ UNC names, SUBST'd drives, etc.

    One scenario it >might not< handle would be if you had two drives mapped to the same volume by way of different file systems, e.g., NFS and Samba.
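
The handle-based comparison described in this comment could be sketched roughly as follows (this is not the commenter's own code). Instead of memcmp() over the whole structure, the sketch compares only the volume serial number and the file index, the combination that Microsoft's documentation for BY_HANDLE_FILE_INFORMATION describes as identifying a file on a single computer. The names IsSameFile and OpenForQuery are hypothetical. The non-Windows branch is an assumption added only so that the sketch also builds elsewhere; it uses the analogous st_dev/st_ino pair from stat().

```cpp
#include <cassert>

#ifdef _WIN32
#include <windows.h>

// Hypothetical helper: opens 'path' for metadata queries only.
static HANDLE OpenForQuery(const char* path) {
    // Only attribute access is requested; FILE_FLAG_BACKUP_SEMANTICS
    // additionally permits opening directories.
    return CreateFileA(path, FILE_READ_ATTRIBUTES,
        FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
        NULL, OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
}

// Returns true if both paths can be opened and name the same file.
bool IsSameFile(const char* path1, const char* path2) {
    assert(path1 != 0 && path2 != 0);
    bool same = false;
    HANDLE h1 = OpenForQuery(path1);
    HANDLE h2 = OpenForQuery(path2);
    // Both handles stay open during the comparison, so neither
    // file index can be reused in the meantime.
    if (h1 != INVALID_HANDLE_VALUE && h2 != INVALID_HANDLE_VALUE) {
        BY_HANDLE_FILE_INFORMATION a, b;
        if (GetFileInformationByHandle(h1, &a) &&
            GetFileInformationByHandle(h2, &b)) {
            same = a.dwVolumeSerialNumber == b.dwVolumeSerialNumber
                && a.nFileIndexHigh == b.nFileIndexHigh
                && a.nFileIndexLow == b.nFileIndexLow;
        }
    }
    if (h1 != INVALID_HANDLE_VALUE) CloseHandle(h1);
    if (h2 != INVALID_HANDLE_VALUE) CloseHandle(h2);
    return same;
}
#else
#include <sys/stat.h>

// Assumed fallback for non-Windows builds: device and inode numbers
// play the role of volume serial number and file index.
bool IsSameFile(const char* path1, const char* path2) {
    assert(path1 != 0 && path2 != 0);
    struct stat a, b;
    if (stat(path1, &a) != 0 || stat(path2, &b) != 0)
        return false;
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}
#endif
```

A typical call would pass the short and the long name of the same file, for instance IsSameFile("C:\\PROGRA~1", "C:\\Program Files") on a system where PROGRA~1 happens to be the short name of that directory. A path that cannot be opened makes the function return false rather than report an error, which a real implementation may want to distinguish.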
