Generating Custom XML from SQL 2005

There are a number of ways to programmatically generate custom XML from a database. Before reading Professional ASP.NET 2.0 Special Edition, you probably used FOR XML AUTO to generate fairly basic XML and then modified the XML in post processing to meet your needs. This was formerly a very common pattern. FOR XML AUTO was fantastically easy; and FOR XML EXPLICIT, a more explicit way to generate XML, was very nearly impossible to use.

SQL Server 2005 adds the new PATH method to FOR XML that makes arbitrary XML creation available to mere mortals. SQL 2005's XML support features very intuitive syntax and very clean namespace handling.

Here is an example of a query that returns custom XML. The WITH XMLNAMESPACS commands at the start of the query set the stage by defining a default namespace and using column-style name aliasing to associate namespaces with namespace prefixes. In this example addr: is the prefix for urn:hanselman.com/northwind/address.

use Northwind;
WITH XMLNAMESPACES (
 DEFAULT 'urn:hanselman.com/northwind'
 , 'urn:hanselman.com/northwind/address' as "addr"
)
SELECT CustomerID as "@ID", 
       CompanyName,
       Address as "addr:Address/addr:Street",
       City as "addr:Address/addr:City",
       Region as "addr:Address/addr:Region",
       PostalCode as "addr:Address/addr:Zip",
       Country as "addr:Address/addr:Country",
       ContactName as "Contact/Name",
       ContactTitle as "Contact/Title",
       Phone as "Contact/Phone", 
       Fax as "Contact/Fax"
FROM Customers
FOR XML PATH('Customer'), ROOT('Customers'), ELEMENTS XSINIL

The aliases using the AS keyword declaratively describe the elements and their nesting relationships, whereas the PATH keyword defines an element for the Customers table. The ROOT keyword defines the root node of the document.

The ELEMENTS keyword, along with XSINIL, describes how you handle null. Without these keywords, no XML element is created for a row's column that contains null; this absence of data in the database causes the omission of data in the resulting XML document. When the ELMENTS XSINIL combination is present, an element outputs using an explicit xsi:nil syntax such as <addr:Region xsi:nil="true" />.

When you run the example, SQL 2005 outputs an XML document like the one that follows. Note the namespaces and prefixes are just as you defined them.

<Customers xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
           xmlns:addr="urn:hanselman.com/northwind/address" 
           xmlns="urn:hanselman.com/northwind">
  <Customer ID="ALFKI">
    <CompanyName>Alfreds Futterkiste</CompanyName>
    <addr:Address>
      <addr:Street>Obere Str. 57</addr:Street>
      <addr:City>Berlin</addr:City>
      <addr:Region xsi:nil="true" />
      <addr:Zip>12209</addr:Zip>
      <addr:Country>Germany</addr:Country>
    </addr:Address>
    <Contact>
      <Name>Maria Anders</Name>
      <Title>Sales Representative</Title>
      <Phone>030-0074321</Phone>
      <Fax>030-0076545</Fax>
    </Contact>
  </Customer>
...the rest of the document removed for brevity...

The resulting XML can now be manipulated using an XmlReader or any of the techniques covered in Chapter 13, "Working with XML" of Professional ASP.NET 2.0 Special Edition (Wrox, 2006, ISBN: 0470041781).

Generating Custom XML from SQL 2005

Adding a Column of Untyped XML

SQL Server can produce XML from a query, and it now can also store XML in a single column. Because XML is a first-class data type within SQL Server 2005, adding a new column to the Customers table of the Northwind Database is straightforward. You can use any SQL Server management tool you like. I use the SQL Server Management Studio Express, a free download that can be used with any SQL SKU (including the free SQL Express 2005). Bring up your Query Analyzer or Management Studio Express and, with the Northwind database selected, execute the following query.

use Northwind;
BEGIN TRANSACTION
GO
ALTER TABLE dbo.Customers ADD
 Notes xml NULL
GO
COMMIT

Note the xml type keyword after Notes in the preceding example. If an XML Schema were already added to the database, you could add this new column and associate it with a named Schema Collection all at once using this syntax.

use Northwind;
BEGIN TRANSACTION
GO
ALTER TABLE dbo.Customers ADD
 Notes xml(DOCUMENT dbo.NorthwindCollection)
 GO
COMMIT

Here the word DOCUMENT indicates that the column will contain a complete XML document. Use CONTENT to store fragments of XML that don't contain a root note. You haven't added a schema yet, so that's the next step. So far, you've added a Notes column to the Customers table that can be populated with prose. For example, a customer service representative could use it to describe interactions she's had with the customer, entering text into a theoretical management system.

Adding an XML Schema

Although the user could store untyped XML data in the Notes field, you should really include some constraints on what's allowed. XML data can be stored typed or untyped, as a fragment with no root note or as a document. Because you want store Customer interaction data entered and viewed from a Web site, ostensibly containing prose, XHTML is a good choice.

XML data is validated by XML Schemas, as I say earlier in the chapter. However, SQL Server 2005 is a database, not a file system. It needs to store the schemas you want to reference in a location it can get to. You add a schema or schemas to SQL Server 2005 using queries formed like this:

CREATE XML SCHEMA COLLECTION 
	YourCollection AS 'your complete xml schema here'

You'll be using the XHTML 1.0 Strict schema located on the W3C Web site shown here: w3.org/TR/xhtml1-schema/#xhtml1-strict. Copy the entire schema to a file, or download the schema directly from w3.org/2002/08/xhtml/xhtml1-strict.xsd.

When executing your query, you include the entire XSD inline in your schema. However, you should watch for few things. First, escape any single quotes so that ' becomes ''-that is, two single quotes, NOT one double-using a search and replace. Second, because SQL 2005 uses the MSXML6 XML parser to parse its XML, take into consideration a limitation in that parser. MSXML6 already has the xml: namespace prefix and associated namespace hard-coded internally, so you should remove the line from your schema that contains that namespace. This little oddity is documented, but buried within MSDN at http://msdn2.microsoft.com/ms177489(en-US,SQL.90).aspx and only applies to a few predefined schemas like this one that uses the xml: prefix and/or the http://www.w3.org/XML/1998/namespace namespace. In the fragment that follows, I've boldfaced the line you need to remove.

Use Northwind;
CREATE XML SCHEMA COLLECTION NorthwindCollection AS
'<?xml version="1.0" encoding="UTF-8"?>
<xs:schema version="1.0" xml:lang="en"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.w3.org/1999/xhtml"
    xmlns="http://www.w3.org/1999/xhtml"
    xmlns:xml="http://www.w3.org/XML/1998/namespace"
    elementFormDefault="qualified">
...the rest of the schema has been omitted for brevity...
</xs:schema>';

Instead, you want to execute a query like this, noting the single quote and semicolon at the very end.

Use Northwind;
CREATE XML SCHEMA COLLECTION NorthwindCollection AS
'<?xml version="1.0" encoding="UTF-8"?>
<xs:schema version="1.0" xml:lang="en"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.w3.org/1999/xhtml"
    xmlns="http://www.w3.org/1999/xhtml"
    elementFormDefault="qualified">
...the rest of the schema has been omitted for brevity...
</xs:schema>';

You may get a few schema validation warnings when you execute this query because of the complexity of the XHTML schema, but you can ignore them. Figure 1 shows the new NorthwindCollection schemas added to the Northwind database.

[customXml1.jpg]
Figure 1

Although Figure 1 shows the NorthwindCollection within the Object Explorer, you can also confirm that your schema has been added correctly using SQL, as shown in the example that follows:

Use Northwind;
SELECT XSN.name
FROM    sys.xml_schema_collections XSC 
    JOIN sys.xml_schema_namespaces XSN ON 
        (XSC.xml_collection_id = XSN.xml_collection_id)

The output of the query is something like the following. You can see that the XHTML namespace appears at the end along with the schemas that already existed in the system.

http://www.w3.org/2001/XMLSchema
http://schemas.microsoft.com/sqlserver/2004/sqltypes
http://www.w3.org/XML/1998/namespace
http://www.w3.org/1999/xhtml

Next you should associate the new column with the new schema collection. Using the Management Studio, you create one composite script that automates this process. In this case, however, you can continue to take it step by step so you see what's happening underneath.

Associating a XML Typed Column with a Schema

You can use the Microsoft SLServer Management Studio Express to associate the NorthwindCollection with the new Notes column. Open the Customers table of the Northwind Database and, within its Column collection, right-click and select Modify. Select the Notes column as shown in Figure 2. Within the Notes column's property page, open the XML Type Specification property and select the NorthwindCollection from the Schema Collection dropdown. Also, set the Is XML Document property to Yes.

At this point, you can save your table and a change script is generated and executed. If you want to see and save the change script after making a modification but before saving the changes, right-click in the grid and select Generate Change Script. Click the Save toolbar button or Ctrl-S to commit your changes to the Customers Table.

[customXml2.jpg]
Figure 2

Now that you've added a new Notes column and associated it with an XHTML schema, you're ready to add some data to an existing row.

Inserting XML Data into an XML Column

You start by adding some data to a row within the Northwind database's Customer table. Add some notes to the famous first row, the customer named Alfreds Futterkiste, specifically CustomerID ALFKI. You add or update data to a column with the XML type just as you add to any datatype. For example, you can try an UPDATE query.

Use Northwind;
UPDATE Customers 
SET Notes = N'<HTML></HTML>' WHERE CustomerID = 'ALFKI';

Upon executing this query, you get this result:

Msg 6913, Level 16, State 1, Line 1
XML Validation: Declaration 
not found for element 'HTML'. Location: /*:HTML[1]

What's this? Oh, yes, you associated a schema with the XML column so the included document must conform, in this case, to the XHTML specification. Now, try again with a valid XHTML document that includes a correct namespace.

Use Northwind;
UPDATE Customers SET Notes = N'
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
 <head>
  <title>Notes about Alfreds</title>
 </head>
 <body>
  <p>He is a nice enough fellow.</p>
 </body>
</html>'
WHERE CustomerID = 'ALFKI';

Execute this corrected query and you see a success message.

(1 row(s) affected)

After you've typed a column as XML and associated an XML Schema, SQL Server 2005 will allow only XML documents that validate. The data can be retrieved from SQL Server using standard System.Data techniques. It can be pulled out of DataReaders or DataSet and manipulated with XmlReaders or as an XmlDocument.

This article is adapted from Professional ASP.NET 2.0 Special Edition (Wrox, 2006, ISBN: 0-7645-7135-4), from Chapter 13, "Working with XML," by Scott Hanselman.

Copyright 2007 by WROX. All rights reserved. Reproduced here by permission of the publisher.



About the Author

Scott Hanselman

Scott Hanselman is a co-author of ASP.NET 2.0 MVP Hacks and Tips (Wrox, 2006, ISBN 0-7645-9766-3) along with fellow Microsoft MVPs (Microsoft Most Valuable Professionals) David Yack, Joe Mayo, Fredrik Normén, Dan Wahlin, J. Ambrose Little, Jonathan Goodyear. Scott is the chief architect at the Corillian Corporation, an eFinance enabler. He has over 13 years experience developing software in C, C++, VB, COM, and certainly VB.NET and C#. Scott is proud to be both a Microsoft RD (regional director) as well as an MVP for both ASP.NET and Solutions architecture.

Comments

  • There are no comments yet. Be the first to comment!

Leave a Comment
  • Your email address will not be published. All fields are required.

Top White Papers and Webcasts

  • On-demand Event Event Date: October 29, 2014 It's well understood how critical version control is for code. However, its importance to DevOps isn't always recognized. The 2014 DevOps Survey of Practice shows that one of the key predictors of DevOps success is putting all production environment artifacts into version control. In this webcast, Gene Kim discusses these survey findings and shares woeful tales of artifact management gone wrong! Gene also shares examples of how high-performing DevOps …

  • On-demand Event Event Date: December 18, 2014 The Internet of Things (IoT) incorporates physical devices into business processes using predictive analytics. While it relies heavily on existing Internet technologies, it differs by including physical devices, specialized protocols, physical analytics, and a unique partner network. To capture the real business value of IoT, the industry must move beyond customized projects to general patterns and platforms. Check out this webcast and join industry experts as …

Most Popular Programming Stories

More for Developers

RSS Feeds