Browsing XML/XSLT with HTA/Scripting Runtime

Introduction

The initial idea of this article was to bring you a simplistic example of using the Scripting Runtime Library: "click here and here, blah-blah-blah, thank you." I began to write it because my colleagues and I needed to make our scripts file-system-aware. This ability proved very useful for software prototyping and for building small utilities; of course, it shouldn't be used on the Web (for security and privacy reasons I'll discuss later).

Back to the tool. Prior to writing it, I was deep in XML for several weeks, so what you see here is an XML/XSLT viewer/browser, wrapped in the form of an HTML Application. It helped me a lot when I was learning XML/XSL; now it aids <some other people> in rapid checking and bug tracking of large numbers of XSL templates. I hope it helps you too.

Of course, this little browser (I'll call it "Xbrowser" further on) is in no way a replacement for any enterprise-grade development tool. It is just:

  • an interactive learning tool that illustrates the basics of XML handling with JScript/MSXML—for beginners in XML development; and, maybe
  • an example of using Microsoft Scripting Runtime Object Library—for developers of office tools and solutions; and, of course
  • a simple utility for validating XML documents and viewing XSLT output

A last-minute addition to this article was a JScript batch transformation tool that uses (most of) the techniques described here.

Before using either of these utilities, please read the disclaimers within the script files; by using these utilities, you agree with them.

Requirements

For this piece of code to work properly, you'll need the following code packs:

  • The Common Dialog ActiveX Control provides a standard set of dialog boxes for operations such as opening and saving files, setting print options, and selecting colors and fonts. It is shipped with MS Visual Basic and MS Office 2000/XP products, or can be downloaded from the Microsoft Web site.
  • The Scripting Runtime Object Library is a Microsoft file system management solution, designed for use with scripting languages; it is an integral part of Microsoft Office 2000/XP. The library is also available for download at the Microsoft Web site.
  • Microsoft XML Core Services and/or SDK (versions 3.0 or 4.0). This can be downloaded from the Microsoft Web site.
  • To install some of the previously mentioned packages, you'll probably need a CAB Extraction utility. You can download it from a Microsoft site.

How Everything Works

If you look inside the attached archive, you'll see that "Xbrowser" is no more than an HTML form. The following sections show how to use it and how the code works behind the scenes, step by step.

Folder browsing

Figure 1: Choose a folder where your XML files are located.

This part uses the Shell object, specifically its BrowseForFolder method.

function BrowseFolder()
{
   // Accessing the Shell:
   var objShell = new ActiveXObject("Shell.Application");

   // Calling the browser dialog:
   var objFolder = objShell.BrowseForFolder(0, "Select a folder:",
                                            0);

   if (objFolder == null) return "";

   // Accessing the folder through the FolderItem object:
   var objFolderItem = objFolder.Items().Item();
   var objPath = objFolderItem.Path;

   var foldername = objPath;
   if (foldername.substr(foldername.length-1, 1) != "\\")
      foldername = foldername + "\\";

   // foldername is the actual folder name.

   ...
}
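The trailing-backslash step in BrowseFolder can be separated from the Shell dialog so that it can be tried on its own. The following is a minimal sketch; the helper name ensureTrailingBackslash is an assumption made for illustration and is not part of Xbrowser.

```javascript
// Hypothetical helper (not part of Xbrowser): makes sure a folder path
// ends with a backslash so that a file name can be appended directly.
// An empty string (the "dialog cancelled" value) is returned unchanged.
function ensureTrailingBackslash(path) {
   if (path === "") return "";
   if (path.charAt(path.length - 1) != "\\") {
      return path + "\\";
   }
   return path;
}
```

With this helper, the last lines of BrowseFolder could be reduced to a single call such as ensureTrailingBackslash(objFolderItem.Path).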

File browsing and enumeration

Figure 2: Choose a file.

There are two interesting things here:

  • Scripting.FileSystemObject is the main point of access to the file system. In short:

    FileSystemObject contains:
      Drives collection
      Folders collection
      Files collection
      GetDrive method (access a particular drive)
      GetFolder method (access a particular folder)
      GetFile method (access a particular file)

    Drives collection contains:
      Item property (use to access the drive)
      Count property (number of drives in a system)

    Folders collection contains:
      Item property (use to access the folder)
      Count property (number of folders in a collection)
      Add method (create new folder)

    Folder object contains:
      SubFolders collection (subfolders of a folder, including those with hidden and system file attributes set)
      Files collection (access all files in a folder)

    Files collection contains:
      Item property (use to access the file)
      Count property (number of files in a folder)

    File object contains:
      Name property (file name)
      Size property (file size)
      DateCreated property (file creation date and time)

    FSO has lots of collections, methods, and properties; I've just pointed out the most commonly used.

  • The Enumerator object is a simple iterator, used to cycle through a collection of objects.

    Enumerator object contains:
      item method (returns a reference to the current object in a collection)
      atEnd method (returns true if the iterator has reached the end of the collection)
      moveFirst method (resets the iterator to the first object in a collection)
      moveNext method (moves the iterator to the next object in a collection)

    Typical use:

    var fc = new Enumerator(colFiles);
    for (; !fc.atEnd(); fc.moveNext())
    {
       var objFile = fc.item();
       ...
    }
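The iteration pattern above can be wrapped in a small helper that copies a collection into an ordinary array. This is only a sketch: enumeratorToArray and ArrayEnumerator are hypothetical names, and ArrayEnumerator is a stand-in that mimics the Enumerator methods listed above so the helper can be tried outside Windows Script.

```javascript
// Copies every item of an Enumerator-like object into an array.
// 'en' needs only the atEnd(), moveNext(), and item() methods.
function enumeratorToArray(en) {
   var items = [];
   for (; !en.atEnd(); en.moveNext()) {
      items.push(en.item());
   }
   return items;
}

// Hypothetical stand-in for JScript's Enumerator, backed by an array.
// It exists only so the helper can be exercised without ActiveX.
function ArrayEnumerator(arr) {
   var pos = 0;
   this.atEnd = function () { return pos >= arr.length; };
   this.moveFirst = function () { pos = 0; };
   this.moveNext = function () { pos++; };
   this.item = function () { return arr[pos]; };
}
```

In an HTA, the call would presumably look like enumeratorToArray(new Enumerator(objFolder.Files)), which yields an array of File objects that can then be processed with a plain indexed loop.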
    

Actual code:

// Here goes the Scripting Runtime, FileSystemObject object:
var objFSO = new ActiveXObject("Scripting.FileSystemObject");
// Accessing the folder:
var objFolder = objFSO.GetFolder(curXMLfolder);
// Accessing the files:
var colFiles = objFolder.Files;

var xmlcount = 0, xslcount = 0;

// Cycling through the files, one by one
// (the loop body is skipped if the collection is empty):
for (var fc = new Enumerator(colFiles); !fc.atEnd(); fc.moveNext())
{
   var objFile = fc.item();
   var ftext = objFile.Name;
   // Extension (last three characters), compared case-insensitively:
   var fextn = ftext.substr(ftext.length-3, 3).toLowerCase();

   // Checking the extension:
   if ((fextn == "xml") || (fextn == "rdf"))
   {
      xmlcount = xmlcount + 1;
      // Opening the <SELECT> tag if any XML files exist:
      if (xmlcount == 1)
         xmlsel = "<SELECT id='xmlselection' onchange='refresh()'>";

      // Adding an option (value is quoted so names with spaces work):
      xmlsel = xmlsel + "<OPTION value='" + ftext + "'>" + ftext +
                        "</OPTION>";
   }
}

// Closing the tag once all files have been processed. (Testing
// fc.atEnd() inside the loop body never succeeds, because the loop
// condition guarantees it is false there.)
if (xmlcount > 0) xmlsel = xmlsel + "</SELECT>";
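The extension check and the assembly of the <SELECT> element do not depend on the file system, so they can be kept in separate functions. The sketch below is one possible arrangement; hasExtension, escapeHtml, and buildSelect are hypothetical names, and the escaping step is an addition that the original code does not perform.

```javascript
// Returns true if 'name' ends with one of the extensions in 'exts'
// (lowercase, without the leading dot). The comparison ignores case.
function hasExtension(name, exts) {
   var dot = name.lastIndexOf(".");
   if (dot < 0) return false;
   var ext = name.substr(dot + 1).toLowerCase();
   for (var i = 0; i < exts.length; i++) {
      if (ext == exts[i]) return true;
   }
   return false;
}

// Escapes characters that are special in HTML text and in
// single-quoted attribute values.
function escapeHtml(text) {
   return text.replace(/&/g, "&amp;").replace(/</g, "&lt;")
              .replace(/>/g, "&gt;").replace(/'/g, "&#39;");
}

// Builds a complete <SELECT> element from a list of file names.
// Returns "" when the list is empty.
function buildSelect(id, names) {
   if (names.length == 0) return "";
   var html = "<SELECT id='" + id + "' onchange='refresh()'>";
   for (var i = 0; i < names.length; i++) {
      var n = escapeHtml(names[i]);
      html += "<OPTION value='" + n + "'>" + n + "</OPTION>";
   }
   return html + "</SELECT>";
}
```

Collecting the matching names first and building the markup afterwards means the closing tag is always emitted exactly once, regardless of which file happens to be last in the folder.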

Loading XML from a file

This part is handled by MSXML.

// Creating the new empty DOM tree:
var xml = new ActiveXObject("MSXML2.DOMDOCUMENT.4.0");
// No asynchronous load:
xml.async = false;
// Loading the file from disk:
xml.load(curXMLfolder + xmlselection.value);

Doing the same for the stylesheet:

// Creating the new empty DOM tree:
var xsl = new ActiveXObject("MSXML2.DOMDOCUMENT.4.0");
// No asynchronous load:
xsl.async = false;
// Loading the file from disk:
xsl.load(curXSLfolder + xslselection.value);

Loading XML from a string

Loading XML data from a string is a bit different from loading a file. No files, no options; all you need to do is build a string that contains your XML code. Then, you parse that string with a single call to the loadXML method:

// Defining a string - default stylesheet:
var defsheet="<?xml version=\"1.0\"?>";
...
defsheet=defsheet+"</xsl:stylesheet>";

// String -> DOM:
xsl.loadXML(defsheet);

Here, loadXML is used to load a default stylesheet (hardcoded in a string), which is used when no XSL files are found in the appropriate folder.
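Long XML strings are often easier to maintain when they are built from an array of lines. The sketch below illustrates this with the standard XSLT identity transform; it is an illustrative stylesheet only, not the default stylesheet that Xbrowser actually uses, and the function name buildIdentityStylesheet is hypothetical.

```javascript
// Illustrative stylesheet only: the standard XSLT identity transform,
// NOT the default stylesheet hardcoded in Xbrowser.
function buildIdentityStylesheet() {
   return [
      "<?xml version=\"1.0\"?>",
      "<xsl:stylesheet version=\"1.0\"",
      "   xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">",
      "   <xsl:template match=\"@*|node()\">",
      "      <xsl:copy>",
      "         <xsl:apply-templates select=\"@*|node()\"/>",
      "      </xsl:copy>",
      "   </xsl:template>",
      "</xsl:stylesheet>"
   ].join("\n");
}
```

The returned string could then be passed to loadXML in the same way as defsheet above.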

Validating XML document

     

Figure 3: Review validation result.

The actual validation takes place while the XML document is loading, so the result is available as soon as the load call returns:

...
xml.load(curXMLfolder + xmlselection.value);
// Document is already validated; 'xml.parseError.errorCode'
// contains error code, if any.
...

So all you need to do is check:

if (xml.parseError.errorCode != 0)
{
   // Handle error
}
else
{
   // Proceed - XML is ok.
}
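The "Handle error" branch usually needs a readable message. Besides errorCode, MSXML's parseError object exposes the reason, line, and linepos properties, which the following sketch combines into one string. The function name describeParseError and the message layout are assumptions made for illustration.

```javascript
// Formats a parseError-like object into a one-line message.
// Expects the errorCode, reason, line, and linepos properties.
// Returns "" when there is no error.
function describeParseError(err) {
   if (err.errorCode == 0) return "";
   // The reason text may end with a line break; trim it.
   var reason = String(err.reason).replace(/\s+$/, "");
   return "Error " + err.errorCode + " at line " + err.line +
          ", position " + err.linepos + ": " + reason;
}
```

In the HTA, the error branch could then display describeParseError(xml.parseError) to the user.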

Transforming XML with XSL stylesheet

Once the XML and XSL files are loaded into DOM trees, transforming the XML data with a stylesheet takes a single call:

resultCache = xml.transformNode(xsl.documentElement);
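A stylesheet can be well-formed XML and still fail during the transformation, in which case transformNode raises a run-time error. A minimal sketch of a guarded call follows; safeTransform and the shape of its return value are hypothetical and not taken from Xbrowser.

```javascript
// Runs the transformation and reports failure instead of throwing.
// 'xmlDoc' must provide transformNode(); 'xslDoc' must provide
// documentElement (as MSXML DOM documents do).
function safeTransform(xmlDoc, xslDoc) {
   try {
      var output = xmlDoc.transformNode(xslDoc.documentElement);
      return { ok: true, output: output, error: "" };
   } catch (e) {
      return { ok: false, output: "", error: e.message || String(e) };
   }
}
```

The caller can check the ok flag and either store output in resultCache or show the error text.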

Saving the result of the transformation to a file

Figure 4: Save the XSLT output to a file.

There are two points of interest here: the "Save" dialog and the file creation process itself. For the "Save" dialog to work, you must embed the following ActiveX component in the page:

<object id="cmdlg"
   classid="clsid:F9043C85-F6F2-101A-A3C9-08002B2F49FB"
   codebase="http://activex.microsoft.com/controls/vb6/comdlg32.cab">
</object>

Then, you can use it:

function fileSave()
{
   cmdlg.CancelError = false;
   cmdlg.FilterIndex = 1;
   cmdlg.DialogTitle = "Save file as";
   cmdlg.Filter = "HTML file (*.html)|*.html|XML file (*.xml)|*.xml";

   // Calling the dialog:
   cmdlg.ShowSave();

   return cmdlg.FileName;
}

To save the XSLT output, you simply take resultCache and write it to a file. There is no need to check for errors here, because the "Save..." button isn't shown if either of the two documents (XML or XSL) hasn't passed validation.

function Save()
{
   // Asking for file name:
   var filename = fileSave();

   if (filename != "")
   {
      // Creating the file:
      var objFSO = new ActiveXObject("Scripting.FileSystemObject");
      var objFile = objFSO.CreateTextFile(filename);

      // Writing the XSLT output:
      objFile.Write(resultCache);
      objFile.Close();
   }
}
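CreateTextFile also accepts two optional arguments, overwrite and unicode. The sketch below passes both explicitly and makes sure the file is closed even if the write fails. The saveText name is hypothetical, and the file system object is passed in as a parameter so the function can be tried with a stand-in object. Whether Unicode output is appropriate depends on who will read the file, so treat that choice as an assumption.

```javascript
// Writes 'text' to 'filename' through an object that behaves like
// Scripting.FileSystemObject. Returns false if no name was given.
function saveText(fso, filename, text) {
   if (!filename) return false;
   // CreateTextFile(filename, overwrite, unicode):
   // overwrite an existing file and write Unicode text.
   var file = fso.CreateTextFile(filename, true, true);
   try {
      file.Write(text);
   } finally {
      // Close the file even if Write raised an error.
      file.Close();
   }
   return true;
}
```

In the HTA this would be called as saveText(new ActiveXObject("Scripting.FileSystemObject"), filename, resultCache). Writing in Unicode may avoid errors when the output contains characters outside the system's ANSI code page.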

All Inclusive

Tired of childish examples in tutorial books? Me too, and that is why I've tested Xbrowser's functionality on some real XML data. The accompanying XML file contains the NASDAQ historical price data of Sun Microsystems Inc. I've also written three XSLT stylesheets:

  • Plain. This one shows the original XML data, with IE's color scheme:

    [plain.jpg]

  • Table. This is a simple example of an XSL transformation. The green and red rows, showing increases and decreases of the stock price, illustrate the <xsl:choose> rule:

    [table.jpg]

  • Bar graph. A more complex stylesheet. This one features "for-next"-style loops (implemented using recursive calls of named templates); a search for the maximum in a series of values (using the <xsl:sort> rule); and, of course, an algorithm for building stylish bars:

    [bar.jpg]

Make It Quick

The last little thing included is the transfrm_batch script, a simple tool for performing numerous XSL transformations in one go. The script takes a single file name as an argument:

transfrm_batch.js batch.list

The file passed as an argument can contain an arbitrary number of lines (one transformation per line) in the following format:

<xml_file_name>,<xsl_file_name>,<result_file_name>

Sample batch list:

stock.xml,plain.xsl,result1.html
stock.xml,table.xsl,result2.html
stock.xml,bargraph.xsl,result3.html

Aside from the resulting files, transfrm_batch creates a "transfrm.log" file that contains a transformation log.
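The batch list format is simple enough to parse with a few string operations. The following sketch shows one way to do it; it is not the code of transfrm_batch itself, the function names are hypothetical, and skipping blank or malformed lines and trimming spaces around the names are assumptions.

```javascript
// Removes leading and trailing whitespace.
function trim(s) {
   return s.replace(/^\s+|\s+$/g, "");
}

// Parses one line of the form
//    <xml_file_name>,<xsl_file_name>,<result_file_name>
// Returns null for blank or malformed lines.
function parseBatchLine(line) {
   var parts = trim(line).split(",");
   if (parts.length != 3) return null;
   for (var i = 0; i < 3; i++) {
      parts[i] = trim(parts[i]);
      if (parts[i] == "") return null;
   }
   return { xml: parts[0], xsl: parts[1], result: parts[2] };
}

// Parses the whole batch list text into an array of jobs.
function parseBatchList(text) {
   var lines = text.split(/\r?\n/);
   var jobs = [];
   for (var i = 0; i < lines.length; i++) {
      var job = parseBatchLine(lines[i]);
      if (job) jobs.push(job);
   }
   return jobs;
}
```

Each returned job names the XML file, the stylesheet, and the output file for one transformation, so the main loop only has to load, transform, and save.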

Security

Looking at the previous sections, one can guess: A script that writes arbitrary data to arbitrary files can be a big source of headaches and security problems. Moreover, HTML Applications are not subject to IE's security restrictions (see the appropriate introduction), so third-party (or just erroneous) scripts that use FileSystemObject can be a major security threat.

This dictates two primary uses of Scripting Runtime: "local" (non-Web) utilities and server-side scripting. As MSDN says, "because use of the FSO on the client side raises serious security issues about providing potentially unwelcome access to a client's local file system, this documentation assumes use of the FSO object model to create scripts executed by Internet Web pages on the server side. Since the server side is used, the Internet Explorer default security settings do not allow client-side use of the FileSystemObject object. Overriding those defaults could subject a local computer to unwelcome access to the file system, which could result in total destruction of the file system's integrity, causing loss of data, or worse."

Sometimes, the only option is to turn stand-alone HTAs into corporate Web pages (simply renaming the .hta file to .html and removing the HTA:APPLICATION tag), thus using FSO on the client side. This raises problems with component licensing and execution permissions; furthermore, you must be sure that your intranet is extremely secure. In this case (in other words, if you're returning to the "ordinary Web"), to protect yourself from unexpected behavior while still using the power of advanced scriptable objects (like Scripting.FileSystemObject or MSXML.DOMDocument), please consider the following:

  • Never allow non-secure and unsigned ActiveX components to run without your explicit approval. Set the "Initialize and script ActiveX controls not marked as safe" option in IE's security tab to "Prompt".
  • Never allow Java components to be downloaded and run without your explicit approval. Set Java Virtual Machine security level in IE's security tab to "Medium" or "High".
  • Scripts downloaded from home or the corporate intranet are usually trustworthy, so you may want to set the security level of "Local Intranet" to "Medium" or even "Low", while setting "Internet" security level to "High".
  • Add servers where your scripts reside to the "Trusted sites" zone.

Note: These are just basic rules; you should probably consult with IT professionals to build up your intranet security to the appropriate level.

Alternatives

Other downsides of "Xbrowser" are that it is strictly IE-bound and that it depends on too many external code libraries. This is the flip side of the power that the IE engine provides; however, dependencies can be reduced in a number of ways:

  • XML/XSL parsing/validating

    Sarissa is a JavaScript wrapper for native XML APIs. It can perform XSL transformations, XPath queries, parsing, and validation, and it supports all popular browsers. This can help you build cross-browser XML solutions.

  • File browsing/enumeration

    To eliminate dependencies on Scripting Runtime, shell scripting can be used. For example:

    // Instantiating 'Shell' object
    var objShell = new ActiveXObject("Shell.Application");
    // Accessing a folder
    var objFolder = objShell.NameSpace(curXMLfolder);
    var objFolderItems = objFolder.Items();
    var colFiles = objFolderItems.Count;

    var xmlcount = 0, xslcount = 0;

    if (colFiles != 0)
    {
       for (var iCount = 0; iCount < colFiles; iCount++)
       {
          // Accessing a file
          var objFile = objFolderItems.Item(iCount);

          if (objFile.IsFolder == false)
          {
             var ftext = objFile.Path;
             // File name with extension, taken from the full path
             // (Name may omit the extension, depending on Explorer
             // settings)
             var fname = ftext.substr(ftext.lastIndexOf("\\") + 1);
             var fextn = ftext.substr(ftext.length-3, 3).toLowerCase();

             // File type is...
             if ((fextn == "xml") || (fextn == "rdf"))
             {
                xmlcount = xmlcount + 1;
                if (xmlcount == 1)
                   xmlsel = "<SELECT id='xmlselection' onchange='refresh()'>";

                xmlsel = xmlsel + "<OPTION value='" + fname + "'>" +
                                  fname + "</OPTION>";
             }
          }
       }

       // Closing the tag once all items have been processed
       if (xmlcount > 0) xmlsel = xmlsel + "</SELECT>";
       ...
    }
    

  • I/O

    Unfortunately (or fortunately?), there is no standard alternative to the Scripting Runtime Object Library for file I/O. Other browsers (Mozilla Firefox, Opera, and so forth) stick to the ECMAScript standard (Microsoft's JScript is an implementation of it), which defines no file access, so you can be sure that no page script is capable of tearing your file system apart.

Common Dialog Controls

Scripting

Script Security

XML

Tools

To develop some serious XML applications, you'll need something more than a notepad.

The Officials Speak

Formal technical specifications; written, as usual, in W3C's heavy language.

  • XML: home of XML and XML-based technologies.
  • XML Schema: general info on XML schemas; links to XSD/DTD helper utilities.
  • XSL: home of the Extensible Stylesheet Language. Specs, software news and links, tutorials.

Links

The Web has more papers on XML/XSLT than any ordinary human can read in a lifetime. Here are just a few links:



Downloads
