Server-Side XML in ASP

By Adam S. Cartwright


With the upcoming release of Internet Explorer 5.0, it is much easier to use XML in Web applications. Here is some information on how to harness the power of the updated XML Document Object Model (DOM) on the server to parse and use XML data in ASP applications.


The Need



The ability to parse and use XML on the server provides developers with a whole new world of functionality. As the widespread use of XML increases, so does the need for manipulating XML on the server. To demonstrate server-side XML in ASP, I will use the syndicated XML version of the Scripting News, a Web site of news and commentary from the cross-platform scripting community ( http://www.scripting.com). I will show how to create a simple ASP page that displays the date of the last issue of the Scripting News and the number of headlines it contains, then I will display all of the current headlines with their corresponding URL links.

The Document Object Model



The updated XML Document Object Model in Internet Explorer 5.0 (IE 5.0) fully supports the programming interfaces described in the W3C Document Object Model Core (Level 1) recommendation. It also includes a number of new methods for supporting related XML technologies, such as XSL, XSL Pattern Matching, namespaces, data types, and schemas. The DOM is in essence an XML parser; the DOM exposes the XML document as a tree structure that is easy to navigate and use.


There are two groups of DOM programming interfaces, as defined by the W3C Core recommendation. The first group defines interfaces that are needed to write applications that use and manipulate XML documents. The second group defines interfaces to assist developers and make it easier to handle XML. This second group of interfaces is for convenience and is not essential for using XML.


Using the DOM on the server in an ASP application is quite easy, but requires that IE 5.0 be installed on the server itself. This is necessary due to the number of supporting components installed with IE. Once IE is installed on the server, all you have to do in ASP is create the DOM object as such:




<% Set objXML = Server.CreateObject(“Microsoft.XMLDOM”) %>







XML on the Server



Once you have created the DOM object on the server, you can build your own XML document or load an existing document. When loading a document, you have the option of loading a string of XML text, or opening a XML document and loading the contents. For our example, we will assume that our server has a local copy of the most recent Scripting News XML document. Before loading the document, you should set the async property of the DOM object to “false.” This tells the DOM object not to perform an asynchronous download of the XML document. This is important, since immediately after we load the document we are going to start using its contents. If the contents are not all loaded at that time, we may get an error when we try to access it.




<%
Set objXML = Server.CreateObject(“Microsoft.XMLDOM”)

objXML.async = False

objXML.Load (Server.MapPath(“mostRecentScriptingNews.xml”))
%>




Let’s look at the actual XML document that we are loading:



<?xml version=”1.0″?>

<!DOCTYPE scriptingNews SYSTEM
“http://www.scripting.com/dtd/scriptingNews.dtd”>
<scriptingNews>
<header>
<copyright>Copyright 1997-1999 UserLand Software, Inc.
</copyright>
<scriptingNewsVersion>1.0</scriptingNewsVersion>

<pubDate>Wed, 03 Mar 1999 08:00:00 GMT</pubDate>
<lastBuildDate>Thu, 04 Mar 1999 03:37:03 GMT</lastBuildDate>
</header>
<item>
<text>Wired: A Linux Car Stereo! Wow.</text>

<link>
<url>http://www.wired.com/news/news/technology/ …
story/18236.html
</url>
<linetext>A Linux Car Stereo</linetext>
</link>
</item>


<item>
<text>According to News.com, Hewlett-Packard will offer
customers storage and computing on a rental basis.
</text>
<link>
<url>http://www.news.com/News/Item/ …
0,4,33202,00.html?st.ne.fd.mdh
</url>

<linetext>According to News.com</linetext>
</link>
</item>
</scriptingNews>




The DOM object exposes a parseError object that contains information about the last parsing error. This object is extremely helpful for debugging and error handling within the ASP page. After loading the XML document, it’s a good idea to check the parseError object for any errors before continuing.



<%
If objXML.parseError.errorCode <> 0 Then
handle the error
End If
%>



Fortunately, the parseError object provides us with a lot of valuable information about the error:








errorCode

























Property

Description
The error code

filepos

The absolute file position in the XML document containing the error

Line

The line number in the XML document where the error occurred

linepos

The character position in the line containing the error

reason

The cause of the error

srcText


The data where the error occurred

URL

The URL of the XML document containing the error


In our Scripting News example, the parseError object takes on even greater meaning since the XML document is referencing a Document Type Definition (DTD) file. In this case, not only must the XML document be well formed, it must also be valid against the DTD in order to be error free. It is good practice to always check the parseError object after loading XML.


Now that we have a well-formed and valid document in our DOM object, let’s look in the document to see what we have. The DOM exposes a number of useful methods to determine exactly what is in an XML document. Because the DOM exposes the contents of the document as a tree of nested nodes (a node consists of an element and any nested subelements), we will actually end up creating a series of node objects in order to manipulate the data. We’re going to use the getElementsByTagName method to get a list of the elements, or nodes, in the document.


Our first goal is to discover the publishing date for our copy of Scripting News. By examining the DTD we know that this information is stored in the pubDate node. A simple way to access the contents of this node is first to create a node list object of all of the nodes within the XML document, then loop through it until we find the pubDate node. Because the DTD dictates that the pubDate node cannot contain any subnodes, we can use the text property to immediately pull out the contents of the node.




<%
Set objXML = Server.CreateObject(“Microsoft.XMLDOM”)
Set objLst = Server.CreateObject(“Microsoft.XMLDOM”)
objXML.async = False
objXML.Load (Server.MapPath(“mostRecentScriptingNews.xml”))
If objXML.parseError.errorCode <> 0 Then
handle the error
End If

Set objLst = objXML.getElementsByTagName(“*”)

For i = 0 to (objLst.length – 1)

If objLst.item(i).nodeName = “pubDate” Then
StrDate = objLst.item(i).text
Exit For
End If

Next
%>



Notice in the above example we passed an “*” to getElementsByTagName. This returned a node list object containing all of the elements, or nodes, in the document. Because we have the DTD and can gain from it the exact position of the pubDate node, we could have addressed it directly using its item number. However, looping through a document, as we did in the above example, is actually quite efficient since the node list is a collection.


Now that we have the publish date, let’s look at how to find the number of headlines in the document. Once again, we can draw from our knowledge of the DTD and recall that the headlines are stored in “item” nodes. There is one item node per headline in the document. We could use another loop, like we did above, and increment a counter each time we encounter an item node. However, there is a better way to retrieve this information, using another method exposed in the DOM.


As in the example above, all we need to do is create a node list object containing all of the item nodes. Then we’ll use the length property to find out how many nodes are in the node list object or, in other words, the number of headlines in the document.




<%
Set objLst = objXML.getElementsByTagName(“item”)

strNoOfHeadlines = objLst.Length
%>



Most likely we would also like to display some of this information on our ASP page. The next example shows how we could list the headlines and their URLs in our ASP page by looping through the node list of headlines.





<%
Set objXML = Server.CreateObject(“Microsoft.XMLDOM”)
Set objLst = Server.CreateObject(“Microsoft.XMLDOM”)
Set objHdl = Server.CreateObject(“Microsoft.XMLDOM”)

objXML.async = False
objXML.Load (Server.MapPath(“mostRecentScriptingNews.xml”))

If objXML.parseError.errorCode <> 0 Then
handle the error
End If

Set objLst = objXML.getElementsByTagName(“item”)

noOfHeadlines = objLst.length
%>

<HTML><BODY>
<H1>Scripting News Headlines</H1>

<%
For i = 0 To (noOfHeadlines – 1)

Set objHdl = objLst.item(i)

Response.Write(“<a href=””” & _
objHdl.childNodes(1).childNodes(0).text & _
“””>” & objHdl.childNodes(0).text & _
“</a><br>”)

Next
%>

</BODY></HTML>







Conclusion



With a little information about the structure of the XML document and by harnessing the power of the DOM, you can easily parse the XML document on the server in ASP and send whatever results you like to the client. This example is browser neutral and would work in almost all Web browsers.


Next time I will discuss how to use XSL on the server side to display complex XML documents on the client.


About the Author


Adam S. Cartwright is a software consultant in Denver, Colorado, specializing in advanced Internet technologies such as XML, ADSI, DHTML, and ASP. Adam will spoke on XML technologies at the 1999 Spring Professional ASP Developers Conference (http://asp99.nextmeeting.com).

More by Author

Get the Free Newsletter!

Subscribe to Developer Insider for top news, trends & analysis

Must Read