Introduction
Have you ever wanted to group a number of files into a single
file for run-time read/write access, without having to design a
file format for reaching the files inside it? This technique has
many uses, such as revision/undo storage, dynamic access to
resources, incremental updates, WAD files, etc. Microsoft's
answer to these needs is a technology known as Compound Files
(CF). Note that Compound Files are an implementation of the
ActiveX structured storage model (from the MSDN article
"Containers: Compound Files").
CFs may be viewed as a file system within a file. They allow
you to create files (known as streams) and directories (known as
sub-storages) within a single file. Compound files offer some
advantages of a database (such as transactions with rollback) and
general file system functionality. Files within the CF may be
read and written incrementally, just as they are within a normal
file system.
The Problem
Accessing a CF from application code usually requires a fair
amount of manipulation of the IStorage and IStream interfaces,
which many programmers find daunting. In addition, the interfaces
must be acquired and released at the right times, and mistakes
here cause problems.
Solutions
What is needed is another model for accessing streams and
sub-storages within a CF. A very simple model that every
application programmer is familiar with is the concept of files.
In MFC, files are managed by the CFile class. Using this model,
we can extend the class to accommodate CFs.
This project presents the following solutions.
- An MFC CFile derived class (CStgFile) that allows simple
CFile type access to a stream within a file.
- Methods for creation of CFs (CreateStg()) and the
creation of single level sub-storages (MkStg()).
- An OLE Automation (COM) class ("gstg.core") for
manipulating CFs from scripting languages (and/or a
CDispatch derived MFC class).
- JavaScript examples of copying files in/out of a CF.
There is also code to provide the following external
file-system support.
- An MFC class (CScanDir) that is used to scan a
file-system directory for a file specification and return
the results in a string array. Support for overriding the
default behavior is also provided.
- An OLE Automation (COM) class ("gstg.dir") for
accessing directory information from scripting languages
(and/or a CDispatch derived MFC class).
- JavaScript examples of scanning a directory for files and
sub-directories.
And finally, an example to demonstrate the functionality:
- Copy a sub-directory of files from a file system into a
sub-storage in a CF.
Development Methodology
- The core code (CStgFile, CScanDir) is first developed as
reusable MFC classes.
- They are then "wrapped" with an OLE Automation
(COM) layer that may be used by OLE Automation scripting
engines (VB, VBA, WSH, etc.) and/or other MFC applications
via a CDispatch derived interface (using the TLB).
- Finally, JavaScript test scripts are developed for
exercising the basic functionality of the code before
it is integrated into a more thorough test.
Limitations
To reduce the complexities of illustrating these concepts, the
following limitations were imposed.
- CFs do not use TRANSACTED file semantics. All accesses
are DIRECT.
- The CF implementation is limited to one level of sub-storages.
- Very little error (return code) checking is performed in
the OLE Automation wrappers.
- Methods are not "friendly" to errant
programming practices.
Examples of Usage
MFC Example
To illustrate how simple it is to use the CStgFile MFC class
for copying an external file to a newly created CF, the following
MFC code may be used.
    CFile FileSrc( "tmp.tmp", CFile::modeRead );  // open the source file
    CStgFile FileStg;                             // instantiate the CF wrapper
    FileStg.CreateStg( "tmp.stg" );               // create the storage
    FileStg.Open( "tmp.tmp",                      // create the stream
                  CFile::modeCreate | CFile::modeWrite );

    UINT cB = 0;
    BYTE rgB[512*8];
    // copy all bytes to the stream; Read() returns 0 at end of file
    while( (cB = FileSrc.Read( rgB, sizeof(rgB) )) > 0 )
    {
        FileStg.Write( rgB, cB );
    }

    FileStg.Close();                              // close the stream
    FileStg.CloseStg();                           // close the CF file
Notice that in this example, the single call to CreateStg() is
what redirects subsequent file access into a CF. If this call is
omitted, the object uses the normal CFile methods and works on
ordinary disk files. This may be useful in debugging when you
wish to access the streams as normal files.
JavaScript Example
To perform the same operation in JavaScript, the solution is
even simpler:
    var objStg = WScript.CreateObject( "gstg.core" ); // create object
    objStg.Create( "tmp.stg" );                       // create the CF
    objStg.CopyTo( "tmp.tmp", "tmp.tmp" );            // copy the external file
    objStg.Close();                                   // close the CF
Other Examples
Other script examples are in the scr, scr/stg and scr/dir
sub-directories. These include an example to copy a whole
directory of files from the file system into a sub-storage of a
CF (cp_bmps.js).
Conclusion
Using Compound Files becomes much easier using the CStgFile
class for accessing streams and sub-storages. There are many
other uses and advantages of using Compound Files that are beyond
the scope of this document. Please refer to MSDN for further
reading about OLE Compound Files.
Notes
- In order to run the OLE Automation examples, you must
register the DLL(s).
- In order to run the JavaScript example, you must use the
Windows Scripting Host CScript application. This is
available for download from Microsoft and comes with
Windows 98.
- Use the DFView application from Microsoft to view CFs
created with the CStgFile class.
- These classes were developed with MS Visual C++ V5.0 and
should be compatible with previous releases of the
compiler.
- There are some characters that are considered invalid for
stream names (e.g. '!').
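Since invalid stream names otherwise only show up as failures at run time, a small pre-check can help. The sketch below is not part of this project, and the function name IsValidStreamName is hypothetical. The rules it encodes are an assumption based on my reading of Microsoft's structured storage naming conventions (at most 31 characters, none of the characters \ / : !, and the names "." and ".." reserved); verify them against MSDN before relying on them.

```cpp
#include <cstddef>
#include <string>

// Hypothetical pre-check for a stream or sub-storage name.
// Assumed rules (to be confirmed against MSDN):
//   - the name is not empty and has at most 31 characters
//   - "." and ".." are reserved
//   - the characters \ / : ! are not allowed
bool IsValidStreamName(const std::string& name)
{
    const std::size_t kMaxChars = 31;

    if (name.empty() || name.size() > kMaxChars)
        return false;
    if (name == "." || name == "..")
        return false;
    return name.find_first_of("\\/:!") == std::string::npos;
}
```

A caller copying files from disk could run each file name through such a check and rename or skip the offenders before creating the stream.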
Other Uses
After understanding how CFs store streams of data, other uses
become apparent:
- A Web-Site in a file.
- Resources for localization.
- BLOB type storage insertion/retrieval without the
database overhead.
- Archival of data with direct access.
- Etc.
Future
The class for accessing streams within a CF should be extended
to support N levels of sub-storages. In addition, support for
selecting TRANSACTED access (rather than DIRECT only) should be
included.