Castor (0.9.5)
Castor is one of the first tools of its type. It provides XML and DB data binding for the Java framework. Before you start, I should point out that I have only investigated the XML Data Binding functionality.
Castor accepts XSD files and generates the source code for a Java library that allows serialization to and from XML. The generated code relies on Xerces as its XML parser.
The generated code is split: one class is generated to hold the information contained in an XML element, and a second class is generated to handle the marshaling to and from XML. This split has the advantage of keeping the memory footprint small, but it does cause a large number of classes to be generated!
Castor is attractive to developers because it is Open Source, and seems pretty well thought out. It is, however, far from complete; there are a number of classes and areas of functionality missing. Because of this, it copes well with simple schemas, but is unable to cope with the complexities of many of the real-world standards.
Limitations include:
- No support for extending elements (extension and restriction)
- No support for substitution groups
- No support for namespaces
- Patchy support for primitive data types (name, token, ENTITY, and the like are not supported)
- Patchy support for facets
In Summary: Castor is a good choice when working with simple schemas; it has a strong following and is still being improved. It is, however, still a work in progress, sporting an awkward command line interface. Eventually, it may be a smart choice, but at the moment it is still maturing. It is useful on small schemas that you have direct control over. Try to use this with an externally controlled schema, and you're asking for trouble in the long run, when they add in features that Castor just can't deal with.
JAXB
JAXB is provided by Sun as part of their web development toolkit (I believe this makes it free, but I am uncertain of the exact terms of their license; check Sun's end user license).
JAXB is something of a half-hearted attempt, but is capable of generating Java code for very basic schemas. However, give it any real work and it chokes.
Limitations include:
- No support for extending elements (extension and restriction)
- No support for external elements (any, anyType)
- No support for enumerated elements
- Collections are not strongly typed
- Does not work with Web Logic
In Summary: If the product you select has to be free, use Castor. Otherwise, there is little reason to use JAXB.
Liquid Technologies
The Liquid Technologies solution is one of the most complete products available, with almost complete support for XSD, XDR, and DTD schemas. It also has the advantage of generating code for a number of platforms and languages: C#, Java, VB6, and C++ (for Win32, Linux, HP, and Solaris). It also produces a full set of documentation (CHM and HTML) for the generated class library, making development simpler.
This product is ideal for dealing with both industry standard and hand-made schemas. It is the only system currently available that is reliable enough to use on an evolving industry standard because it is the only generator on the market that supports so much of the XSD standard.
Limitations include:
- No validation on restrictions
- No validation on unions
In Summary: This is the most complete system on the market (at the time of writing), making it ideal for user defined and industry standard schemas. The code generated is clean and easy to use, and the generated documentation is a bonus.
This is, however, a commercial product, and as such needs paying for ($495). That said, it should pay for itself within the week!
Xsd.exe (1.0.3705.0)
Microsoft have taken a very minimal approach to their generator. The classes (C# or VB.Net) that are generated just contain public member variables that map to the XSD's attributes and elements, no accessors, no methods, nothing. The XML serialization is carried out by external libraries that come with the .NET framework. These classes use the attribute data (the stuff in square brackets) that is declared within each of the generated classes.
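To make the "public member variables plus attribute data" idea concrete, here is a hand-written sketch of the kind of class involved. It is not actual Xsd.exe output: the ParentSeq class and the XsdStyleDemo helper are names I have made up for illustration, and the code uses syntax from later versions of C# than the tool reviewed here.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hand-written approximation of an Xsd.exe-style class (an assumption, not
// real tool output): public fields only, with attributes that tell
// XmlSerializer how each field maps to XML.
[XmlRoot("ParentSeq")]
public class ParentSeq
{
    [XmlElement("FirstChild")]
    public string FirstChild;

    [XmlElement("SecondChild")]
    public string SecondChild;

    [XmlElement("ThirdChild")]
    public string ThirdChild;
}

public static class XsdStyleDemo
{
    // The class has no ToXml method of its own; the framework's
    // XmlSerializer reads the attributes and does the work.
    public static string ToXml(ParentSeq value)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(ParentSeq));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, value);
            return writer.ToString();
        }
    }

    public static ParentSeq FromXml(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(ParentSeq));
        using (StringReader reader = new StringReader(xml))
        {
            return (ParentSeq)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        ParentSeq elm = new ParentSeq();
        elm.FirstChild = "Some Data 1";
        elm.SecondChild = "Some Data 2";
        elm.ThirdChild = "Some Data 3";

        string xml = ToXml(elm);
        Console.WriteLine(xml);

        // Round trip: read the XML back into a new object
        Console.WriteLine(FromXml(xml).SecondChild);
    }
}
```

Note that nothing in the class itself knows about the schema; all of the mapping lives in the attributes.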
The generated classes allow many features of the XSD standard to be supported; however, they provide practically no validation. They are quite happy to read in almost any XML, and will try to fit it into the generated objects. This is a significant limitation, but it can be mitigated by first reading the data into a validating XML DOM (validating against the schema), before loading it into the generated object model; this, of course, costs CPU cycles. Furthermore, it is all too easy to populate objects that are themselves not valid against the schema; again, this can be addressed by validating the output XML against the schema by using a validating DOM parser.
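The validate-first approach described above might look something like the following sketch. It assumes the XmlReaderSettings validation API that arrived in later versions of the .NET framework than the one reviewed here; the ParentSeq class, the inline schema, and the ValidatingLoader name are illustrative stand-ins rather than generated code.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

// Illustrative stand-in for a generated class
[XmlRoot("ParentSeq")]
public class ParentSeq
{
    public string FirstChild;
    public string SecondChild;
    public string ThirdChild;
}

public static class ValidatingLoader
{
    // Inline copy of a simple sequence schema (kept in a string so the
    // sketch is self-contained)
    const string Xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='ParentSeq'>
    <xs:complexType>
      <xs:sequence>
        <xs:element name='FirstChild' type='xs:string'/>
        <xs:element name='SecondChild' type='xs:string'/>
        <xs:element name='ThirdChild' type='xs:string'/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>";

    public static ParentSeq Load(string xml)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas.Add(null, XmlReader.Create(new StringReader(Xsd)));

        // Step 1: load into a DOM through a validating reader. With no
        // ValidationEventHandler attached, the first schema error throws
        // XmlSchemaValidationException.
        XmlDocument doc = new XmlDocument();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            doc.Load(reader);
        }

        // Step 2: only now hand the (valid) DOM to the serializer
        XmlSerializer serializer = new XmlSerializer(typeof(ParentSeq));
        using (XmlNodeReader nodeReader = new XmlNodeReader(doc.DocumentElement))
        {
            return (ParentSeq)serializer.Deserialize(nodeReader);
        }
    }

    public static void Main()
    {
        string good = "<ParentSeq><FirstChild>1</FirstChild><SecondChild>2</SecondChild><ThirdChild>3</ThirdChild></ParentSeq>";
        string bad = "<ParentSeq><SecondChild>2</SecondChild><FirstChild>1</FirstChild><ThirdChild>3</ThirdChild></ParentSeq>";

        Console.WriteLine(Load(good).SecondChild);
        try
        {
            Load(bad);
        }
        catch (XmlSchemaValidationException ex)
        {
            Console.WriteLine("Rejected: " + ex.Message);
        }
    }
}
```

The cost is the one already mentioned: the document is parsed into a DOM and then walked a second time by the serializer.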
Other significant shortfalls in this generator include:
- A lack of support for namespaces
- A lack of support for substitution groups
- A lack of support for nested groups (sequences containing choices and so on)
- Optional items with default values can't be excluded from the output XML
- All classes are generated into the same file; larger XSDs can become a little unmanageable
- Choices are represented as an untyped item that can be cast to the appropriate type. When the type is ambiguous, an ItemsElementName property is created; its enumeration type is named ItemsChoiceTypeX, making it difficult to pick the correct one for an element
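The choice handling in the last point is easier to judge with an example in front of you. The sketch below is hand-written in the style the XmlSerializer expects, not real Xsd.exe output; ParentChoice, ItemChoiceType, and ChoiceDemo are illustrative names, and it covers the single-item case (Item and ItemElementName) rather than the repeating Items form.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Enumeration naming which alternative of the choice is in use
[XmlType(IncludeInSchema = false)]
public enum ItemChoiceType
{
    FirstChild,
    SecondChild,
    ThirdChild
}

[XmlRoot("ParentChoice")]
public class ParentChoice
{
    // All three alternatives are strings, so the value alone cannot say
    // which element it belongs to; the identifier field below decides.
    [XmlElement("FirstChild", typeof(string))]
    [XmlElement("SecondChild", typeof(string))]
    [XmlElement("ThirdChild", typeof(string))]
    [XmlChoiceIdentifier("ItemElementName")]
    public string Item;

    [XmlIgnore]
    public ItemChoiceType ItemElementName;
}

public static class ChoiceDemo
{
    public static string ToXml(ParentChoice value)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(ParentChoice));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, value);
            return writer.ToString();
        }
    }

    public static ParentChoice FromXml(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(ParentChoice));
        using (StringReader reader = new StringReader(xml))
        {
            return (ParentChoice)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        ParentChoice elm = new ParentChoice();
        elm.Item = "Some Data 2";
        elm.ItemElementName = ItemChoiceType.SecondChild;

        string xml = ToXml(elm);
        Console.WriteLine(xml);

        // On the way back in, the serializer fills in the identifier
        Console.WriteLine(FromXml(xml).ItemElementName);
    }
}
```

With one choice in a class this is manageable; with several, each gets its own numbered enumeration, which is where picking the right one becomes awkward.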
In summary, the Microsoft offering is an elegant solution to the problem; the decision to avoid validation should make its code generation robust. However, it is still unable to cope with a large number of XSD features that occur in many real-world schemas. This makes it unfit for use on large, complex schemas, especially if you don't have the ability to change the schema yourself. It is, however, free and simple to use, making it ideal for small projects in which you get to describe your own schemas.
In Detail
As you have already seen, using the code generated from a data binding tool can greatly reduce the amount and complexity of the code you have to write when dealing with XML. If you have a complex schema and are not an XSD expert, the benefits are clear.
In this next section, you will look at the code produced for a number of XSD constructs. I have chosen C# as the language because it is simpler to read; however, the output for Java, C++, and VB6 takes the same form, and the code required to use the generated classes is almost identical (once the syntax of the languages is taken into account).
Elements Examined
- Sequence
- Choice
- Primitive and Complex Types
- Cardinality
- Extension
Sequence
A sequence describes an element whose child elements must all appear (if mandatory), and must appear in the order in which they are declared.
Sample XSD
<?xml version="1.0" encoding="UTF-8" ?>
<Xs:schema xmlns:Xs="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified"
attributeFormDefault="unqualified">
<Xs:element name="ParentSeq">
<Xs:complexType>
<Xs:sequence>
<Xs:element name="FirstChild" type="Xs:string" />
<Xs:element name="SecondChild" type="Xs:string" />
<Xs:element name="ThirdChild" type="Xs:string" />
</Xs:sequence>
</Xs:complexType>
</Xs:element>
</Xs:schema>
[Sequence.gif]
Code
// create an instance of the class to load the XML file into
SequenceLib.ParentSeq elm = new SequenceLib.ParentSeq();
// Set data into element
elm.ThirdChild = "Some Data 3";
elm.FirstChild = "Some Data 1";
elm.SecondChild = "Some Data 2";
// Let's see what we've got
Trace.WriteLine(elm.ToXml());
Xml Created
<?xml version="1.0"?>
<!--Created by Liquid XML Data Binding Libraries
(www.liquid-technologies.com) for simon-->
<ParentSeq xmlns:Xs="http://www.w3.org/2001/XMLSchema-instance">
<FirstChild>Some Data 1</FirstChild>
<SecondChild>Some Data 2</SecondChild>
<ThirdChild>Some Data 3</ThirdChild>
</ParentSeq>
Notes
- The order in which the child elements are set does not matter. They will appear in the correct order in the output XML.
- If the child elements are not in the correct order when an XML file is read in, an exception is raised.
- The all element works in a similar way to sequence. Its child elements are written out in the order they were defined, but when they are read in, they can be in any order.
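The difference between sequence and all on the reading side can be checked without any generated code, using a plain validating reader. This is only a sketch: it relies on the standard .NET schema validation classes rather than the generated library shown above, and SequenceOrderDemo and its helper methods are names invented for the example.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class SequenceOrderDemo
{
    // Builds the sample schema with either "sequence" or "all" as the
    // compositor, so the same document can be checked against both.
    static string Schema(string compositor)
    {
        return @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='ParentSeq'>
    <xs:complexType>
      <xs:COMPOSITOR>
        <xs:element name='FirstChild' type='xs:string'/>
        <xs:element name='SecondChild' type='xs:string'/>
        <xs:element name='ThirdChild' type='xs:string'/>
      </xs:COMPOSITOR>
    </xs:complexType>
  </xs:element>
</xs:schema>".Replace("COMPOSITOR", compositor);
    }

    // Returns true if the document validates against the chosen schema
    public static bool IsValid(string compositor, string xml)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas.Add(null, XmlReader.Create(new StringReader(Schema(compositor))));
        try
        {
            using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
            {
                while (reader.Read()) { } // reading drives the validation
            }
            return true;
        }
        catch (XmlSchemaValidationException)
        {
            return false;
        }
    }

    public static void Main()
    {
        string outOfOrder = "<ParentSeq><ThirdChild>3</ThirdChild><FirstChild>1</FirstChild><SecondChild>2</SecondChild></ParentSeq>";

        Console.WriteLine(IsValid("sequence", outOfOrder)); // order matters
        Console.WriteLine(IsValid("all", outOfOrder));      // order is free
    }
}
```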
Choice
A choice describes an element, and defines that only one of the child elements can appear.
Sample XSD
<?xml version="1.0" encoding="UTF-8" ?>
<Xs:schema xmlns:Xs="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified"
attributeFormDefault="unqualified">
<Xs:element name="ParentChoice">
<Xs:complexType>
<Xs:choice>
<Xs:element name="FirstChild" type="Xs:string" />
<Xs:element name="SecondChild" type="Xs:string" />
<Xs:element name="ThirdChild" type="Xs:string" />
</Xs:choice>
</Xs:complexType>
</Xs:element>
</Xs:schema>
[Choice.gif]
Code—reading from a file
// create an instance of the class to load the XML file into
choiceLib.ParentChoice elm = new choiceLib.ParentChoice();
elm.FromXmlFile("c:\\Choice.xml");
// we can find out which child element is selected by using
// ChoiceSelectedElement
if (elm.ChoiceSelectedElement == "SecondChild")
{
Trace.Write("The second child element was present and has the value "
+ elm.SecondChild);
}
// or by looking at the IsValid flags
Debug.Assert(elm.IsValidFirstChild == false);
Debug.Assert(elm.IsValidSecondChild == true);
Debug.Assert(elm.IsValidThirdChild == false);
Note
If more than one child element is selected in the XML, FromXmlFile will raise an exception.
Primitive and Complex Types
You've now covered how the basic constructs (all/sequence/choice) are represented. However, the child items used so far have all been of type string. This section explores other types, and shows how more complex child elements can be manipulated.
Sample XSD
<?xml version="1.0" encoding="UTF-8" ?>
<Xs:schema xmlns:Xs="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified"
attributeFormDefault="unqualified">
<Xs:element name="RootElm">
<Xs:complexType>
<Xs:sequence>
<Xs:element name="StringType" type="Xs:string" />
<Xs:element name="intType" type="Xs:int" />
<Xs:element name="ComplexType">
<Xs:complexType>
<Xs:sequence>
<Xs:element name="DateType"
type="Xs:dateTime" />
<Xs:element name="Base64Type"
type="Xs:base64Binary" />
</Xs:sequence>
</Xs:complexType>
</Xs:element>
</Xs:sequence>
</Xs:complexType>
</Xs:element>
</Xs:schema>
[Types.gif]
Classes Created
public class RootElm : LiquidTechnologies.LtXmlLib3.XmlObjectBase
{
public String StringType { get... set...}
public Int32 IntType { get... set...}
public TypesLib.ComplexType ComplexType { get... set...}
}
public class ComplexType : LiquidTechnologies.LtXmlLib3.XmlObjectBase
{
public LiquidTechnologies.LtXmlLib3.XmlDateTime DateType { get... set...}
public LiquidTechnologies.LtXmlLib3.BinaryData Base64Type { get... set...}
}
Code
// create an instance of the class to load the XML file into
TypesLib.RootElm elm = new TypesLib.RootElm();
// set data into the element
elm.StringType = "Test String value";
elm.IntType = 5;
// and the child element
elm.ComplexType.DateType.SetDateTime(2004, 4, 26, 10, 41, 35);
elm.ComplexType.Base64Type.SetData("075BCD15",
BinaryData.Encoding.Hex);
// Let's look at the XML we produced.
Trace.WriteLine(elm.ToXml());
XML Produced
<?xml version="1.0"?>
<!--Created by Liquid XML Data Binding Libraries
(www.liquid-technologies.com) for simon-->
<RootElm xmlns:Xs="http://www.w3.org/2001/XMLSchema-instance">
<StringType>Test String value</StringType>
<intType>5</intType>
<ComplexType>
<DateType>2004-04-26T10:41:35</DateType>
<Base64Type>cLXcUQ==</Base64Type>
</ComplexType>
</RootElm>
Note
If the ComplexType held within the RootElm was optional, you would have to create and assign an object to elm.ComplexType before using it (see next item).
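If you want to see where the text forms of the primitive types in the XML come from, the framework's own conversion helpers produce the same kind of output. The sketch below uses XmlConvert and Convert from the .NET class library, not the generated library's XmlDateTime or BinaryData classes; LexicalForms is an illustrative name, and the byte values are arbitrary sample data.

```csharp
using System;
using System.Xml;

public static class LexicalForms
{
    // xs:int is written as plain decimal digits
    public static string FormatInt(int value)
    {
        return XmlConvert.ToString(value);
    }

    // xs:dateTime uses the ISO 8601 layout with a literal 'T' separator
    public static string FormatDateTime(DateTime value)
    {
        return XmlConvert.ToString(value, "yyyy-MM-dd'T'HH:mm:ss");
    }

    // xs:base64Binary is the base64 encoding of the raw bytes
    public static string FormatBinary(byte[] data)
    {
        return Convert.ToBase64String(data);
    }

    public static void Main()
    {
        Console.WriteLine(FormatInt(5));
        Console.WriteLine(FormatDateTime(new DateTime(2004, 4, 26, 10, 41, 35)));
        Console.WriteLine(FormatBinary(new byte[] { 1, 2, 3 }));

        // And back again from the text forms
        int number = XmlConvert.ToInt32("5");
        DateTime when = XmlConvert.ToDateTime("2004-04-26T10:41:35",
            XmlDateTimeSerializationMode.RoundtripKind);
        byte[] raw = Convert.FromBase64String("AQID");
        Console.WriteLine(number + " " + when.Hour + " " + raw.Length);
    }
}
```

A generated library does this conversion for you; the point is only that the types in the class map onto well-defined text forms in the XML.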