Using Open XML Schema with .NET

A key tenet of service-oriented architecture is that applications should communicate in a decoupled fashion, using messaging as a communication pattern. Decoupling the various components of a system using messages rather than relying on strongly-typed objects enables your system components to evolve and scale with much less effort. In theory, small extensions to messages can be propagated throughout a system, without requiring that all components of the system be recompiled.

Visual Studio .NET simplifies the use of XML messaging in your applications. For example, it encourages you to use XML Web services by greatly reducing the number of knobs you’re required to tweak to create or consume a web service. Visual Studio also makes it very easy to map .NET classes to XML schema, as well as serialize objects to and from XML.

In fact, serializing an instance of a class into an XML document using C# requires just a few lines of code. The following function will map any XML-serializable class into a MemoryStream that contains an XML document:

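// Requires: using System.IO; using System.Xml.Serialization;
Stream SerializeThingToXmlStream(object thing)
{
   MemoryStream ms          = new MemoryStream();
   XmlSerializer serializer = new XmlSerializer(thing.GetType());
   serializer.Serialize(ms, thing);
   // Rewind so the caller reads the document from the beginning.
   ms.Seek(0, SeekOrigin.Begin);
   return ms;
}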
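
As a usage sketch, the function above can be paired with a matching deserialization helper to round-trip an object through XML. The Note class, the DeserializeThingFromXmlStream helper, and the RoundTripDemo wrapper are hypothetical names introduced only for this illustration; they are not part of the original listing.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical type for illustration only. XmlSerializer needs a public
// class with a public parameterless constructor.
public class Note
{
    public string Text { get; set; } = string.Empty;
}

public static class RoundTripDemo
{
    // Same shape as the function above: serialize, then rewind the stream.
    public static Stream SerializeThingToXmlStream(object thing)
    {
        MemoryStream ms = new MemoryStream();
        XmlSerializer serializer = new XmlSerializer(thing.GetType());
        serializer.Serialize(ms, thing);
        ms.Seek(0, SeekOrigin.Begin);
        return ms;
    }

    // Hypothetical counterpart: read the stream back into an object.
    public static object DeserializeThingFromXmlStream(Stream stream, Type type)
    {
        XmlSerializer serializer = new XmlSerializer(type);
        return serializer.Deserialize(stream);
    }

    public static void Main()
    {
        Note original = new Note { Text = "hello" };
        using (Stream xml = SerializeThingToXmlStream(original))
        {
            Note copy = (Note)DeserializeThingFromXmlStream(xml, typeof(Note));
            Console.WriteLine(copy.Text); // prints "hello"
        }
    }
}
```

The caller owns the returned stream, so the sketch wraps it in a using block.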

Although similar code can be used to reverse this process and easily reconstitute an object from an XML document, this default usage pattern does not take advantage of the flexibility that’s a core part of XML. One of the advantages of using XML is the ability to define open elements where extension can occur. By defining open elements in your XML documents and XML schemas, you can take advantage of two complementary programming models in your systems:

  • Strongly-typed programming languages, such as C#, Visual Basic, and Eiffel, inside individual applications and components
  • Dynamically-typed XML message documents that are structured as needed, and can evolve as necessary

Although strongly typed programming languages are the ideal tools for building reliable components and applications, modern systems have a need to communicate using messages that are flexible and can evolve over time. For this reason, it’s desirable to decouple the system components from their messages. The decoupled architecture often promoted by SOA advocates enables a system to evolve by focusing on communicating with messages rather than tightly coupled objects.

To examine how you can manage the decoupling process, let’s start with an example that doesn’t use open schema elements, resulting in an inflexible relationship between XML documents and associated .NET classes.

Consider a simple XML document that has some minimal information about an entry in an order-tracking system.

<?xml version="1.0"?>
<Shipment>
   <OrderNo>1234567</OrderNo>
   <Location>
      <Addr>One Acme Way</Addr>
      <City>BoogieTown</City>
      <State>CA</State>
   </Location>
</Shipment>

This simple document tracks the order number and some minimal (actually incomplete) address information. An example of a C# class that maps to this XML document is shown below.

[XmlRoot(Namespace=App.TargetNamespace)]
public class Shipment
{
   string   _orderNo = string.Empty;
   Location _location = new Location();
   public string OrderNo
   {
      get { return _orderNo; }
      set { _orderNo = value; }
   }
   public Location Location
   {
      get { return _location; }
      set { _location = value; }
   }
}

[XmlRoot(Namespace=App.TargetNamespace)]
public class Location
{
   string _addr  = string.Empty;
   string _city  = string.Empty;
   string _state = string.Empty;
   public string Addr
   {
      get { return _addr; }
      set { _addr = value; }
   }
   public string City
   {
      get { return _city; }
      set { _city = value; }
   }
   public string State
   {
      get { return _state; }
      set { _state = value; }
   }
}

The Shipment class above is similar to the class that the tools included with Visual Studio will create when asked to create classes that serialize into specific XML documents.

There’s an interesting aspect to the relationship between XML documents and CLR classes. By default, the .NET Framework performs no schema validation, and no schema is even required to exist. In fact, when deserializing an XML document into a CLR object (such as an instance of my earlier Shipment class), the .NET Framework makes only a best-effort attempt to map data from the document into the object.
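
A minimal sketch of this best-effort mapping, under a few assumptions: the classes are trimmed-down versions of Shipment and Location that use auto-properties, the XmlRoot namespace is omitted so the document needs no xmlns attribute, and BestEffortDemo and Parse are hypothetical names. The document passed in has no Location element at all, yet deserialization completes without complaint and simply leaves the default Location in place.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Simplified stand-ins for the article's classes (no XML namespace assumed).
public class Location
{
    public string Addr { get; set; } = string.Empty;
    public string City { get; set; } = string.Empty;
    public string State { get; set; } = string.Empty;
}

public class Shipment
{
    public string OrderNo { get; set; } = string.Empty;
    public Location Location { get; set; } = new Location();
}

public static class BestEffortDemo
{
    // Deserialize from a string; no schema is consulted at any point.
    public static Shipment Parse(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Shipment));
        using (StringReader reader = new StringReader(xml))
        {
            return (Shipment)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        // The <Location> element is missing entirely.
        Shipment s = Parse("<Shipment><OrderNo>1234567</OrderNo></Shipment>");

        Console.WriteLine(s.OrderNo);                // the element that was present
        Console.WriteLine(s.Location.City.Length);   // default Location, empty City
    }
}
```

Members that the document does not mention keep whatever values the constructor gave them, so a missing element is indistinguishable from one that was never expected.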

This best-effort approach causes some behavior that you should be aware of. Consider an XML fragment that is very similar to my first Shipment XML document, except that the Location node includes an extra element, named Zip, as shown below:

<?xml version="1.0"?>
<Shipment xmlns:xsd="http://www.w3.org/2001/XMLSchema">
   <OrderNo>1234567</OrderNo>
   <Location>
      <Addr>One Acme Way</Addr>
      <City>BoogieTown</City>
      <State>CA</State>
      <Zip>92653</Zip>
   </Location>
</Shipment>

A common way to deserialize an XML document is to use a function like this one, which deserializes an XML file at a specified path:

object DeserializeXmlFile(string path, Type type)
{
   using(FileStream fs = new FileStream(path, FileMode.Open))
   {
      XmlSerializer serializer = new XmlSerializer(type);
      return serializer.Deserialize(fs);
   }
}

So what happens when you deserialize an XML document with extra elements? If you come from the land of strongly typed languages, you might be surprised to learn that no errors or warnings are generated at runtime, although the XmlSerializer class will generate events when unexpected elements and attributes are encountered. Code that handles the UnknownElement and UnknownAttribute events is shown in the following code snippet.

object DeserializeXmlFile(string path, Type type)
{
   using(FileStream fs = new FileStream(path, FileMode.Open))
   {
      XmlSerializer serializer = new XmlSerializer(type);
      serializer.UnknownAttribute +=
                 new XmlAttributeEventHandler(UnknownAttribute);
      serializer.UnknownElement +=
                 new XmlElementEventHandler(UnknownElement);
      return serializer.Deserialize(fs);
   }
}

void UnknownAttribute(object sender, XmlAttributeEventArgs e)
{
   Trace.WriteLine("Unknown Attribute: " + e.Attr.OuterXml);
}

void UnknownElement(object sender, XmlElementEventArgs e)
{
   Trace.WriteLine("Unknown Element: " + e.Element.Name);
}

Although this type of code is useful during debugging for logging that an unexpected element or attribute was encountered, in practice it’s difficult to recover this data for further use in a message-processing pipeline.
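
To make the difficulty concrete, here is a sketch of what an UnknownElement handler actually receives. It assumes simplified Shipment and Location classes without an XML namespace, and UnknownElementDemo and Parse are hypothetical names. The handler collects each stray XmlElement in a list, which sits outside the deserialized object graph.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Simplified stand-ins for the article's classes (no XML namespace assumed).
public class Location
{
    public string Addr { get; set; } = string.Empty;
    public string City { get; set; } = string.Empty;
    public string State { get; set; } = string.Empty;
}

public class Shipment
{
    public string OrderNo { get; set; } = string.Empty;
    public Location Location { get; set; } = new Location();
}

public static class UnknownElementDemo
{
    // Deserialize a Shipment and append every unexpected element to 'unknown'.
    public static Shipment Parse(string xml, List<XmlElement> unknown)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Shipment));
        serializer.UnknownElement += (sender, e) => unknown.Add(e.Element);
        using (StringReader reader = new StringReader(xml))
        {
            return (Shipment)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        string xml =
            "<Shipment><OrderNo>1234567</OrderNo><Location>" +
            "<Addr>One Acme Way</Addr><City>BoogieTown</City>" +
            "<State>CA</State><Zip>92653</Zip></Location></Shipment>";

        List<XmlElement> unknown = new List<XmlElement>();
        Shipment shipment = Parse(xml, unknown);

        // The known fields are mapped as usual.
        Console.WriteLine(shipment.Location.State);
        // The extra element survives only in the side list.
        foreach (XmlElement element in unknown)
        {
            Console.WriteLine(element.Name + " = " + element.InnerText);
        }
    }
}
```

The collected elements are detached from the Shipment object, so any code further down the pipeline would need to carry the list alongside the object and work out where each element belongs when the message is written out again.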
