The W3C (World Wide Web Consortium, http://www.w3.org) published the XML 1.0 specification on February 10, 1998. The XML 1.1 specification followed six years later, on February 4, 2004. In those six years, XML has taken the industry by storm and has become the standard way to describe and exchange data. The current development platforms, .NET and J2EE, support XML natively. Modern enterprise applications, whether a SQL Server or Oracle database, a BizTalk Server, an Office suite, or any of thousands of others, support XML to varying degrees. You will be hard pressed to find an application that does not support or use XML.
The first article explained the fundamentals and power of XPath queries, which allow you to search and navigate your XML documents easily. This article looks at the fundamentals and power of XSD schemas. The following articles look at XSL Transformations and then at how well these three standards are supported by the .NET Framework and which namespaces and types are the most important. This series is not intended as a comprehensive description of all the .NET types around XML. The goal is rather to provide a good introduction so that you understand the XML capabilities of the .NET Framework and can start leveraging them in your current .NET projects.
The Sample XML Document for the Series of Articles
This series of articles assumes that you are familiar with XML itself. The sample XML document used throughout the articles is a list of employees. Each employee must have a first name, last name, phone number, and e-mail address, and can optionally have a job title and a Web address.
<JobTitle>Sr. Enterprise Architect</JobTitle>
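As a minimal sketch of a document matching that description, the listing below shows two employees. The root element name Employees, the ID values, and all names, phone numbers, and addresses are invented for illustration; only the element names are the ones used throughout this article. The second employee omits the optional WebAddress and JobTitle elements.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical sample: the root name, the IDs, and all values are invented -->
<Employees>
  <Employee ID="1">
    <FirstName>Jane</FirstName>
    <LastName>Doe</LastName>
    <PhoneNumber>555-0100</PhoneNumber>
    <EmailAddress>jane.doe@example.com</EmailAddress>
    <WebAddress>http://www.example.com/jane</WebAddress>
    <JobTitle>Sr. Enterprise Architect</JobTitle>
  </Employee>
  <!-- Only the four mandatory elements are present here -->
  <Employee ID="2">
    <FirstName>John</FirstName>
    <LastName>Smith</LastName>
    <PhoneNumber>555-0101</PhoneNumber>
    <EmailAddress>john.smith@example.com</EmailAddress>
  </Employee>
</Employees>
```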
The Fundamentals of XSD Schemas
It is very easy to create XML documents, whether programmatically or manually through an XML editor like XML Spy, Stylus Studio, or Visual Studio .NET 2003. But very often, when processing an XML document, you want to know that it conforms to a certain structure: the structure your application understands. That is where XSD schemas come into play. XSD schemas are the successor of DTDs (Document Type Definitions), the difference being that XSD itself uses XML syntax. An XSD schema allows you to declare the structure of an XML document: which elements and attributes are allowed, whether an element is mandatory or optional, whether there can be more than one instance of an element, and so forth. You then can use the XSD schema to validate the XML document, that is, to check whether the XML document conforms to the structure described by the schema. The XML describes the data and the XSD schema describes the structure of the data. Version 1.0 of the XSD standard was released in May 2001 and consists of three parts: the primer at http://www.w3.org/TR/xmlschema-0/, the structures at http://www.w3.org/TR/xmlschema-1/, and the data types at http://www.w3.org/TR/xmlschema-2/. A working draft of the requirements for XSD 1.1 can be found at http://www.w3.org/TR/2003/WD-xmlschema-11-req-20030121/.
When you create your XSD schema, you do two things. First, you declare elements and attributes. Declaring means you associate an element or attribute name with a set of constraints; for example, an element with the name FirstName is of the string type and only one element of that name is allowed. Second, you define new simple or complex types. XSD has a set of standard types such as string, boolean, integer, date, and so forth. The .NET Framework maps these XSD data types to its own .NET data types. In our sample XML document, the Employee is a complex type. Think in terms of data structures. In your application code, you would define a new structure called Employee that contains the members FirstName, LastName, PhoneNumber, EmailAddress, WebAddress, and JobTitle. In XSD schemas, you do exactly the same. You define a complex type (named EmployeeType in the schema below) and then declare all the elements this type has, plus the constraints for each element; for example, the FirstName element is of the string type. See the XSD schema below for our sample XML document:
<xs:choice minOccurs="1" maxOccurs="unbounded">
  <xs:element name="Employee" type="EmployeeType"/>
</xs:choice>
<xs:element name="FirstName" type="xs:string" minOccurs="1" maxOccurs="1"/>
<xs:element name="LastName" type="xs:string" minOccurs="1" maxOccurs="1"/>
<xs:element name="PhoneNumber" type="xs:string" minOccurs="1" maxOccurs="1"/>
<xs:element name="EmailAddress" type="xs:string" minOccurs="1" maxOccurs="1"/>
<xs:element name="WebAddress" type="xs:string" minOccurs="0" maxOccurs="1"/>
<xs:element name="JobTitle" type="xs:string" minOccurs="0" maxOccurs="1"/>
<xs:attribute name="ID" form="unqualified" type="xs:string"/>
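The listing above shows only the individual declarations. As a sketch of how they could fit together into one complete, well-formed schema, consider the following. The root element name Employees, the use of xs:sequence inside EmployeeType, and the minOccurs and maxOccurs values (derived from the description of mandatory and optional elements) are assumptions, not necessarily what the original schema used.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Assumed root element: a list holding one or more Employee elements -->
  <xs:element name="Employees">
    <xs:complexType>
      <xs:choice minOccurs="1" maxOccurs="unbounded">
        <xs:element name="Employee" type="EmployeeType"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>

  <!-- Complex type describing a single employee -->
  <xs:complexType name="EmployeeType">
    <!-- Assumed compositor: child elements must appear in this order -->
    <xs:sequence>
      <!-- Mandatory elements -->
      <xs:element name="FirstName" type="xs:string" minOccurs="1" maxOccurs="1"/>
      <xs:element name="LastName" type="xs:string" minOccurs="1" maxOccurs="1"/>
      <xs:element name="PhoneNumber" type="xs:string" minOccurs="1" maxOccurs="1"/>
      <xs:element name="EmailAddress" type="xs:string" minOccurs="1" maxOccurs="1"/>
      <!-- Optional elements -->
      <xs:element name="WebAddress" type="xs:string" minOccurs="0" maxOccurs="1"/>
      <xs:element name="JobTitle" type="xs:string" minOccurs="0" maxOccurs="1"/>
    </xs:sequence>
    <xs:attribute name="ID" form="unqualified" type="xs:string"/>
  </xs:complexType>
</xs:schema>
```

Defining EmployeeType as a named global type, rather than nesting it inside the Employee declaration, keeps the type reusable by other element declarations.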
Let's first look at the XSD elements and attributes, meaning the XML elements and attributes you use in your XSD schema, that declare an element or attribute. The W3C provides an XSD schema that describes all the valid XSD element and attribute names. It can be found at http://www.w3.org/2001/XMLSchema.xsd.
|XSD item|Description|
|---|---|
|element|Used to declare an element. Can have any of the attributes listed below to describe the element you are declaring.|
|attribute|Used to declare an attribute. Can have any of the attributes listed below to describe the attribute you are declaring, unless otherwise specified.|
|name (attribute)|Specifies the name of the XML element or attribute.|
|type (attribute)|Specifies the type of the XML element or attribute. XSD comes with a number of simple data types like string, integer, date, and so on. Each .NET data type can be mapped to an XSD data type. Refer to your MSDN library for a complete list of the XSD types (search for "XML Data Types Reference", including the double quotes so that the whole phrase is searched rather than the individual words).|
|minOccurs (attribute)|Describes the minimum number of occurrences of the element (not allowed for attributes). A value of zero means that you can omit this element. Any other value means the element must be present at least that many times; for example, a value of one makes the element mandatory.|
|maxOccurs (attribute)|Describes the maximum number of occurrences of the element (not allowed for attributes). Setting this value to zero un-declares the element, meaning no element of this name is allowed. Setting it to the value "unbounded" means an unlimited number of elements is allowed. Any other value means the element is not allowed to be present more often than specified.|
|default (attribute)|Specifies the default value of the element or attribute. This can only be used for simple data types or text-only data types. The "default" and "fixed" attributes are mutually exclusive.|
|fixed (attribute)|Specifies the predetermined and unchangeable value of an element or attribute. This can only be used for simple data types or text-only data types. The "default" and "fixed" attributes are mutually exclusive.|
|ref (attribute)|References a global element or attribute declared elsewhere in this or any other referenced XSD schema. This allows you to use that element or attribute under a complex type without having to repeat its declaration (the name, type, default, and so on). Occurrence constraints such as minOccurs and maxOccurs are set on the reference itself, because a global declaration cannot carry them. You can reference only global declarations, not an element or attribute declared locally inside another complex type.|
|form (attribute)|If set to "unqualified", this element or attribute is not namespace-qualified in the XML document. If set to "qualified", this element or attribute must be qualified with the target namespace of the schema. If not specified, the default from the schema element applies (elementFormDefault and attributeFormDefault).|
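To make the table more concrete, the following sketch uses several of these attributes together in one small schema. The element and attribute names (Country, Office, City, Floor, Code) and their values are hypothetical and are not part of the employee sample.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Global element declaration with a default value; reusable through ref -->
  <xs:element name="Country" type="xs:string" default="USA"/>

  <xs:element name="Office">
    <xs:complexType>
      <xs:sequence>
        <!-- Mandatory: must appear exactly once -->
        <xs:element name="City" type="xs:string" minOccurs="1" maxOccurs="1"/>
        <!-- Optional: may be omitted or repeated without limit -->
        <xs:element name="Floor" type="xs:integer" minOccurs="0" maxOccurs="unbounded"/>
        <!-- Reuses the global Country declaration instead of repeating it -->
        <xs:element ref="Country" minOccurs="0"/>
      </xs:sequence>
      <!-- fixed: if the attribute is present, it must have exactly this value -->
      <xs:attribute name="Code" type="xs:string" fixed="HQ" form="unqualified"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Note that Country is declared at the top level of the schema, which is what makes it a global declaration that ref can point to.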
This is not a complete list, but these are the main XSD elements and attributes you use to declare elements or attributes. Refer to the XSD standard for a complete reference. Now, let's look at the XSD elements you use to define new types. You can define simple types and complex types. A simple type takes a base type and applies some restrictions to it.
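As a small illustration of that last point, the sketch below defines a simple type by restricting the built-in xs:string type. The type name PhoneNumberType and the pattern are hypothetical; the schema for the employee sample declares PhoneNumber as a plain xs:string.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical simple type: a string limited to the form 555-0100 -->
  <xs:simpleType name="PhoneNumberType">
    <xs:restriction base="xs:string">
      <!-- Three digits, a hyphen, then four digits -->
      <xs:pattern value="[0-9]{3}-[0-9]{4}"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- An element declared with the new type instead of xs:string -->
  <xs:element name="PhoneNumber" type="PhoneNumberType"/>
</xs:schema>
```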