XPath Support in Browsers

The following article is an excerpt from Professional JavaScript for Web Developers, by Nicholas C. Zakas, published by Wrox Publications. Reprinted with permission of the publisher.

Because XML is used for so many kinds of data, it became necessary to create a means to locate data inside of XML code. The answer to this problem is XPath, which is a small language used specifically to locate a single node or multiple nodes that match a particular pattern.

Introduction to XPath

Every XPath expression has two parts: a context node and a node pattern. The context node provides the context from which the node pattern should begin. The node pattern is a string made up of one or more node selectors.

For instance, consider the following XML document:

<?xml version="1.0"?>
<employees>
    <employee title="Software Engineer">
        <name>Nicholas C. Zakas</name>
    </employee>
    <employee title="Salesperson">
        <name>Jim Smith</name>
    </employee>
</employees>

And consider this XPath expression:

employee/name

If the context node is <employees/>, then the previous XPath expression matches both <name>Nicholas C. Zakas</name> and <name>Jim Smith</name>. In the expression, both employee and name refer to tag names of XML elements in the order in which they appear from the context node; the slash indicates a parent-to-child relationship. In essence, the XPath expression says, "Starting from <employees/>, match any <name/> elements located under any <employee/> element that is a child of the reference node."
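To make the parent-to-child reading of the slash concrete, here is a minimal sketch that walks a plain-object stand-in for the document. It is not a DOM and not how any browser evaluates XPath; the tree layout and the helper name matchChildPath() are assumptions made purely for illustration, and the helper understands only simple slash-separated tag names.

```javascript
// Plain-object stand-in for the <employees/> document (not a real DOM).
var employees = {
    tag: "employees",
    children: [
        { tag: "employee",
          children: [{ tag: "name", text: "Nicholas C. Zakas", children: [] }] },
        { tag: "employee",
          children: [{ tag: "name", text: "Jim Smith", children: [] }] }
    ]
};

// Hypothetical helper: handles only simple paths such as "employee/name".
function matchChildPath(contextNode, path) {
    var current = [contextNode];
    var steps = path.split("/");
    for (var i = 0; i < steps.length; i++) {
        var next = [];
        // Each step keeps the children whose tag name equals the step.
        for (var j = 0; j < current.length; j++) {
            var kids = current[j].children;
            for (var k = 0; k < kids.length; k++) {
                if (kids[k].tag === steps[i]) {
                    next.push(kids[k]);
                }
            }
        }
        current = next;
    }
    return current;
}

// Both <name/> elements match when the context node is <employees/>.
var names = matchChildPath(employees, "employee/name");
```

Each step narrows the current set of nodes to matching children, which is why the same expression yields nothing if the context node has no <employee/> children.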

To select only the first <employee/> element's <name/> element, the XPath expression is the following:

employee[position() = 1]/name

In XPath, square-bracket notation is used to provide more specific information about an element. This example uses the XPath position() function, which returns the element's position within the set of nodes matched by that step (here, the <employee/> children of the context node). The first matching node is in position 1, so comparing position() to 1 matches only the first <employee/> element. Then, the slash and name match the <name/> element under that first <employee/> element.

You can match elements in a variety of ways beyond their names and positions. Suppose you want to select all <employee/> elements with the title attribute equal to "Salesperson". The XPath expression would be the following:

employee[@title = "Salesperson"]

In this expression, the @ symbol is short for attribute.
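The two predicates shown so far behave like filters over the nodes matched by a step. The sketch below mimics them on a plain array standing in for the two <employee/> elements; the object layout and the helper names byPosition() and byAttribute() are hypothetical and exist only to illustrate the idea.

```javascript
// Plain-object stand-in for the two <employee/> elements (not a real DOM).
var aEmployees = [
    { tag: "employee", attrs: { title: "Software Engineer" }, name: "Nicholas C. Zakas" },
    { tag: "employee", attrs: { title: "Salesperson" }, name: "Jim Smith" }
];

// Mimics employee[position() = n]: XPath positions start at 1, array indexes at 0.
function byPosition(aNodes, iPosition) {
    return aNodes.filter(function (oNode, iIndex) {
        return iIndex + 1 === iPosition;
    });
}

// Mimics employee[@name = "value"]: keeps nodes whose attribute has that value.
function byAttribute(aNodes, sName, sValue) {
    return aNodes.filter(function (oNode) {
        return oNode.attrs[sName] === sValue;
    });
}

var aFirst = byPosition(aEmployees, 1);
var aSales = byAttribute(aEmployees, "title", "Salesperson");
```

Note the off-by-one difference: position 1 in XPath corresponds to index 0 in a JavaScript array.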

XPath is a very powerful expression language that can make finding specific nodes within a DOM Document much easier. Because of this, both IE and Firefox include XPath support in their DOM implementations.

If you'd like to learn more about XPath, consider picking up XPath 2.0: Programmer's Reference (Wrox, ISBN 0-7645-6910-4).

XPath support in IE

Microsoft saw fit to build XPath support right into the XML DOM object. Each node has two methods that can be used to retrieve nodes matching an XPath pattern: selectNodes(), which returns a collection of nodes matching a pattern, and selectSingleNode(), which returns the first node that matches a given pattern.

Using the same data as the previous section, you can select all <name/> elements that are children of an <employee/> element by using the following code:

var lstNodes = oXmlDom.documentElement.selectNodes("employee/name");

Because selectNodes() is called as a method of oXmlDom.documentElement, the document element is considered the context node for the XPath expression. The method returns a NodeList containing all elements that match the given pattern, meaning that you can iterate through the elements like so:

for (var i=0; i < lstNodes.length; i++) {
    alert(lstNodes[i]);
}

Even if there are no matches to a given pattern, a NodeList is still returned. If it is empty, its length property is equal to 0.

The result of selectNodes() is a live list. So, if you update the document with another element that matches the XPath expression, that element is automatically added to the NodeList in the appropriate position.

If you want only the first element matching the pattern, then selectSingleNode() is the method to use:

var oElement = oXmlDom.documentElement.selectSingleNode("employee/name");

The selectSingleNode() method returns the matching Element if one is found; otherwise, it returns null.

XPath support in Firefox

As you may have guessed, Firefox supports XPath according to the DOM standard. A DOM Level 3 addition called DOM Level 3 XPath defines interfaces to use for evaluating XPath expressions in the DOM. Unfortunately, this standard is more complicated than Microsoft's fairly straightforward approach.

Although a handful of XPath-related objects exist, the two most important ones are XPathEvaluator and XPathResult. An XPathEvaluator is used to evaluate an XPath expression with a method named, appropriately enough, evaluate().

The evaluate() method takes five arguments: the XPath expression, the context node, a namespace resolver, the type of result to return, and an XPathResult object to fill with the result (usually null). The third argument, the namespace resolver, is necessary only when the XML code uses an XML namespace, and so is typically left as null. The fourth argument, the type of result to return, is one of 10 constant values:

  • XPathResult.ANY_TYPE — Returns the type of data appropriate for the XPath expression
  • XPathResult.ANY_UNORDERED_NODE_TYPE — Returns a node set with only one node, which is any node matching the expression and not necessarily the first one in the document
  • XPathResult.BOOLEAN_TYPE — Returns a Boolean value
  • XPathResult.FIRST_ORDERED_NODE_TYPE — Returns a node set with only one node, which is the first matching node in the document
  • XPathResult.NUMBER_TYPE — Returns a number value
  • XPathResult.ORDERED_NODE_ITERATOR_TYPE — Returns a node set of matching nodes in the order in which they appear in the document. This is the most commonly used result type.
  • XPathResult.ORDERED_NODE_SNAPSHOT_TYPE — Returns a node set snapshot, capturing the nodes outside of the document so that any further document modification doesn't affect the node list. The nodes in the node set are in the same order as they appear in the document.
  • XPathResult.STRING_TYPE — Returns a string value
  • XPathResult.UNORDERED_NODE_ITERATOR_TYPE — Returns a node set of matching nodes, although the order may not match the order of the nodes within the document
  • XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE — Returns a node set snapshot, capturing the nodes outside of the document so that any further document modification doesn't affect the node set. The nodes in the node set are not necessarily in the same order as they appear in the document.
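The namespace resolver argument matters only for namespaced XML, so it is easy to overlook. As a rough sketch, a resolver can be a function that maps a prefix used in the expression to a namespace URI. The prefix hr, the URI http://example.com/hr, and the function name resolveNamespace() are all made up for illustration, and passing a plain function as the resolver assumes a browser that accepts one.

```javascript
// Hypothetical prefix-to-URI table; the URI below is made up for illustration.
var oNamespaces = {
    hr: "http://example.com/hr"
};

// Resolver function: returns the URI for a known prefix, or null otherwise.
function resolveNamespace(sPrefix) {
    return oNamespaces.hasOwnProperty(sPrefix) ? oNamespaces[sPrefix] : null;
}

// In a browser, the resolver would be passed as the third argument, e.g.:
// oEvaluator.evaluate("hr:employee/hr:name", oXmlDom.documentElement,
//         resolveNamespace, XPathResult.ORDERED_NODE_ITERATOR_TYPE, null);
```

The prefix in the expression does not have to match the prefix used in the XML source; only the URI the resolver returns has to match.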

The type of result you specify determines how to retrieve the value of the result. Here's a typical example:

var oEvaluator = new XPathEvaluator();
var oResult = oEvaluator.evaluate("employee/name",
        oXmlDom.documentElement, null,
        XPathResult.ORDERED_NODE_ITERATOR_TYPE, null);
if (oResult != null) {
    var oElement = oResult.iterateNext();
    while (oElement) {
        alert(oElement.tagName);
        oElement = oResult.iterateNext();
    }
}

This example uses the XPathResult.ORDERED_NODE_ITERATOR_TYPE result, which is the most commonly used result type. The evaluate() method returns an XPathResult object; the null check in the example is only a safeguard, because an expression that matches no nodes still produces a result whose first iterateNext() call returns null. If the result is a node iterator, whether ordered or unordered, you call the iterateNext() method repeatedly to retrieve each matching node in the result. When there are no further matching nodes, iterateNext() returns null. Using a node iterator, it's possible to create a selectNodes() method for Firefox:

Element.prototype.selectNodes = function (sXPath) {
    var oEvaluator = new XPathEvaluator();
    var oResult = oEvaluator.evaluate(sXPath, this, null,
            XPathResult.ORDERED_NODE_ITERATOR_TYPE, null);

    // Copy each matching node out of the iterator into a plain array
    var aNodes = new Array();

    if (oResult != null) {
        var oElement = oResult.iterateNext();
        while (oElement) {
            aNodes.push(oElement);
            oElement = oResult.iterateNext();
        }
    }

    return aNodes;
};

The selectNodes() method is added to the Element class to mimic the behavior in IE. When evaluate() is called, it uses the this keyword as the context node (which is also how IE works). Then, a result array (aNodes) is filled with all the matching nodes. You can use this new method like so:

var aNodes = oXmlDom.documentElement.selectNodes("employee/name");
for (var i=0; i < aNodes.length; i++) {
    alert(aNodes[i].xml);
}

If you specify a snapshot result type (either ordered or unordered), you must use the snapshotItem() method and the snapshotLength property, as in the following example:

var oEvaluator = new XPathEvaluator();
var oResult = oEvaluator.evaluate("employee/name",
        oXmlDom.documentElement, null,
        XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
if (oResult != null) {
    for (var i = 0; i < oResult.snapshotLength; i++) {
        alert(oResult.snapshotItem(i).tagName);
    }
}

In this example, snapshotLength returns the number of nodes in the snapshot and snapshotItem() returns the node in a given position in the snapshot (similar to length and item() in a NodeList).
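Because a snapshot exposes only snapshotLength and snapshotItem(), copying it into an array is a short loop. The helper below is a sketch under that assumption; the name snapshotToArray() and the stand-in result object are hypothetical, and the stand-in is there only so the helper can be tried outside a browser.

```javascript
// Copies every node of a snapshot-type result into a plain array.
// Relies only on snapshotLength and snapshotItem().
function snapshotToArray(oResult) {
    var aNodes = [];
    if (oResult) {
        for (var i = 0; i < oResult.snapshotLength; i++) {
            aNodes.push(oResult.snapshotItem(i));
        }
    }
    return aNodes;
}

// Stand-in object with the same two members; a real XPathResult would
// come from evaluate() with one of the snapshot result types.
var oFakeResult = {
    snapshotLength: 2,
    snapshotItem: function (i) {
        return ["first", "second"][i] || null;
    }
};
var aCopied = snapshotToArray(oFakeResult);
```

A snapshot is a reasonable choice when the loop itself may modify the document, since the captured nodes are not affected by later changes.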

The XPathResult.FIRST_ORDERED_NODE_TYPE result returns the first matching node, which is accessible through the singleNodeValue property:

var oEvaluator = new XPathEvaluator();
var oResult = oEvaluator.evaluate("employee/name",
        oXmlDom.documentElement, null,
        XPathResult.FIRST_ORDERED_NODE_TYPE, null);
alert(oResult.singleNodeValue.xml);

As you may have guessed, this code can be used to mimic the IE selectSingleNode() method:

Element.prototype.selectSingleNode = function (sXPath) {
    var oEvaluator = new XPathEvaluator();
    var oResult = oEvaluator.evaluate(sXPath, this, null,
            XPathResult.FIRST_ORDERED_NODE_TYPE, null);
    if (oResult != null) {
        return oResult.singleNodeValue;
    } else {
        return null;
    }
};

This method can then be used the same as the one in IE:

var oNode = oXmlDom.documentElement.selectSingleNode("employee/name");
alert(oNode);

The last group of XPathResult types consists of the Boolean type, number type, and string type. Each of these result types returns a single value using the booleanValue, numberValue, and stringValue properties, respectively. For the Boolean type, the evaluation typically returns true if at least one node matches the XPath expression and returns false otherwise:

var oEvaluator = new XPathEvaluator();
var oResult = oEvaluator.evaluate("employee/name",
        oXmlDom.documentElement, null,
        XPathResult.BOOLEAN_TYPE, null);
alert(oResult.booleanValue);

In this example, if any nodes match "employee/name", the booleanValue property is equal to true.

For the number type, the XPath expression must use an XPath function that returns a number, such as count(), which counts all the nodes that match a given pattern:

var oEvaluator = new XPathEvaluator();
var oResult = oEvaluator.evaluate("count(employee/name)",
        oXmlDom.documentElement, null,
        XPathResult.NUMBER_TYPE, null);
alert(oResult.numberValue);

This code outputs the number of nodes that match "employee/name" (which is 2). If you try using this method without one of the special XPath functions, numberValue is equal to NaN.

For the string type, the evaluate() method finds the first node matching the XPath expression and returns its string value, which for an element is the concatenated text of its descendant text nodes. If no node matches, the result is an empty string. Here's an example:

var oEvaluator = new XPathEvaluator();
var oResult = oEvaluator.evaluate("employee/name",
        oXmlDom.documentElement, null,
        XPathResult.STRING_TYPE, null);
alert(oResult.stringValue);

The previous code outputs "Nicholas C. Zakas", because that is the first text node in the first <name/> element under an <employee/> element.

If you feel like living dangerously, you can use XPathResult.ANY_TYPE. By specifying this result type, evaluate() returns the most appropriate result type based on the XPath expression. Typically, this result type is a Boolean value, number value, string value, or an unordered node iterator. To determine which result type has been returned, use the resultType property:

var oEvaluator = new XPathEvaluator();
var oResult = oEvaluator.evaluate("employee/name",
        oXmlDom.documentElement, null,
        XPathResult.ANY_TYPE, null);
if (oResult != null) {
    switch (oResult.resultType) {
        case XPathResult.STRING_TYPE:
            //handle string type
            break;
        case XPathResult.NUMBER_TYPE:
            //handle number type
            break;
        case XPathResult.BOOLEAN_TYPE:
            //handle boolean type
            break;
        case XPathResult.UNORDERED_NODE_ITERATOR_TYPE:
            //handle unordered node iterator type
            break;
        default:
            //handle other possible result types
    }
}
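One way to tame ANY_TYPE is to funnel every result through a single function that returns a plain JavaScript value. The sketch below assumes the numeric constant values defined by DOM Level 3 XPath (1 for number, 2 for string, 3 for Boolean, 4 and 5 for the unordered and ordered iterators); the RESULT_TYPE table, the helper name unwrapResult(), and the stand-in result objects are hypothetical and serve only to let the code run outside a browser.

```javascript
// Numeric values of the result-type constants per DOM Level 3 XPath; in a
// browser, XPathResult.STRING_TYPE and friends are assumed to hold the same values.
var RESULT_TYPE = {
    NUMBER: 1, STRING: 2, BOOLEAN: 3,
    UNORDERED_ITERATOR: 4, ORDERED_ITERATOR: 5
};

// Hypothetical helper: converts a result into a plain JavaScript value.
function unwrapResult(oResult) {
    switch (oResult.resultType) {
        case RESULT_TYPE.STRING:
            return oResult.stringValue;
        case RESULT_TYPE.NUMBER:
            return oResult.numberValue;
        case RESULT_TYPE.BOOLEAN:
            return oResult.booleanValue;
        case RESULT_TYPE.UNORDERED_ITERATOR:
        case RESULT_TYPE.ORDERED_ITERATOR:
            // Drain the iterator into an array, as selectNodes() does.
            var aNodes = [];
            var oNode = oResult.iterateNext();
            while (oNode) {
                aNodes.push(oNode);
                oNode = oResult.iterateNext();
            }
            return aNodes;
        default:
            throw new Error("Unhandled result type: " + oResult.resultType);
    }
}

// Stand-in results for trying the helper outside a browser.
var sValue = unwrapResult({ resultType: 2, stringValue: "Jim Smith" });
var nValue = unwrapResult({ resultType: 1, numberValue: 2 });
```

Throwing on an unexpected type is a deliberate choice here: a silent default would hide the case where an expression produces a result type the caller never planned for.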

As you can tell, XPath evaluation in Firefox is much more complicated than in IE, but also much more powerful. By using the custom selectNodes() and selectSingleNode() methods, you can perform XPath evaluation in both browsers using the same code.
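If you would rather not extend Element.prototype, a standalone function with feature detection is one possible alternative. This is a sketch, not the approach taken in this excerpt: the name selectNodesCompat() and the stand-in node are hypothetical, and the fallback branch assumes a browser that provides XPathEvaluator and XPathResult.

```javascript
// Hypothetical helper: uses selectNodes() when the node offers it,
// otherwise falls back to the DOM Level 3 XPathEvaluator.
function selectNodesCompat(oContext, sXPath) {
    var aNodes = [];
    if (typeof oContext.selectNodes != "undefined") {
        // IE path: copy the returned list into a plain array.
        var lstNodes = oContext.selectNodes(sXPath);
        for (var i = 0; i < lstNodes.length; i++) {
            aNodes.push(lstNodes[i]);
        }
        return aNodes;
    }
    // Standards path: drain an ordered node iterator.
    var oEvaluator = new XPathEvaluator();
    var oResult = oEvaluator.evaluate(sXPath, oContext, null,
            XPathResult.ORDERED_NODE_ITERATOR_TYPE, null);
    var oNode = oResult.iterateNext();
    while (oNode) {
        aNodes.push(oNode);
        oNode = oResult.iterateNext();
    }
    return aNodes;
}

// Stand-in for a node that offers selectNodes(), for trying the helper
// outside a browser.
var oFakeIeNode = {
    selectNodes: function (sXPath) {
        return ["name-1", "name-2"];
    }
};
var aFound = selectNodesCompat(oFakeIeNode, "employee/name");
```

Either way, the calling code always receives a plain array, so it no longer depends on whether the underlying list is live.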

About the Author

Nicholas C. Zakas is the lead author of Professional Ajax (Wrox, 2006, ISBN: 0-471-77778-1) and the author of Professional JavaScript for Web Developers (Wrox, 2005, ISBN: 0-7645-7908-8). He has worked in web development for more than five years. He has helped develop web solutions in use at some of the largest companies in the world. Nicholas is also an active blogger on JavaScript and Ajax topics at http://www.nczonline.net/.
