Understanding XSD Schema

By Ramesh Balaji

Those who deal with data transfer or document exchange within or across organizations with heterogeneous platforms will certainly accept and appreciate the need and power of XML. I am not going to delve into the merits of XML. I will, however, address a simple but powerful schema concept called XSD or XML Schema Definition.

  • What is XSD Schema?
  • What are the advantages of XSD Schema?
  • What is important in XSD Schema?

What Is a Schema?

A schema is a "Structure", and the actual document or data that is represented through the schema is called "Document Instance". Those who are familiar with relational databases can map a schema to a Table Structure and a Document Instance to a record in a Table. And those who are familiar with object-oriented technology can map a schema to a Class Definition and map a Document Instance to an Object Instance.

The structure of an XML document can be defined using any of the following:

  • Document Type Definition (DTD)
  • XML Schema Definition (XSD)
  • XML-Data Reduced (XDR), which is proprietary to Microsoft

We are specifically going to work with XML Schema Definitions (XSD).

What Is XSD?

XSD provides the syntax and defines the way in which elements and attributes can be represented in an XML document. It also lets the schema author require that a given XML document follow a specific format and use specific data types.

XSD is a W3C Recommendation, which makes it a standard way of defining the structure of an XML document. For the latest information on XSD, please refer to the W3C site (www.w3.org).

Advantages of XSD

So what is the benefit of this XSD Schema?

  • An XSD Schema is itself an XML document, so unlike DTDs there is no separate syntax to learn.

  • XSD Schema supports inheritance: one type can extend or restrict another type. This is a great feature because it provides the opportunity for reusability.

  • XSD Schema provides the ability to define your own data types, derived from the existing data types.

  • XSD Schema provides the ability to specify data types for both elements and attributes.
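As a small illustration of the last point, the fragment below sketches how a data type can be attached to an attribute as well as to an element. The "Consultant" element and its "id" attribute are hypothetical names invented for this sketch; they are not part of the case study that follows.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical element used only to show typed elements and attributes -->
  <xsd:element name="Consultant">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Element content typed as a string -->
        <xsd:element name="Name" type="xsd:string"/>
      </xsd:sequence>
      <!-- Attribute value must be a positive whole number -->
      <xsd:attribute name="id" type="xsd:positiveInteger" use="required"/>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Under this sketch, a document such as <Consultant id="7"><Name>ED HARRIS</Name></Consultant> would be accepted, while id="abc" would be rejected.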

Case Study

ABC Corp., a fictitious software consultancy firm that employs around 25 people, has been asked by its payroll company to submit employee information for its Full Time and Part Time consultants electronically, in XML format, to expedite payroll processing.

The payroll company told ABC Corp. that the following information will be needed for each Full Time and Part Time employee.

Employee Information

SSN
Name
DateOfBirth
EmployeeType
Salary

Here is the actual XML document for the above information. Because the schema that describes it declares no target namespace, the document points to the schema with xsi:noNamespaceSchemaLocation.

    <?xml version="1.0"?>
    <Employees xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xsi:noNamespaceSchemaLocation="http://www.abccorp.com/employee.xsd">
      <Employee>
        <SSN>737333333</SSN>
        <Name>ED HARRIS</Name>
        <DateOfBirth>1960-01-01</DateOfBirth>
        <EmployeeType>FULLTIME</EmployeeType>
        <Salary>4000</Salary>
      </Employee>
    </Employees>

Here is the XML Schema for the above Employee Information. The Employees root element from the document above wraps the repeating Employee element.

    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <xsd:element name="Employees">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="Employee" minOccurs="0" maxOccurs="unbounded">
              <xsd:complexType>
                <xsd:sequence>
                  <xsd:element name="SSN" type="xsd:string"/>
                  <xsd:element name="Name" type="xsd:string"/>
                  <xsd:element name="DateOfBirth" type="xsd:date"/>
                  <xsd:element name="EmployeeType" type="xsd:string"/>
                  <xsd:element name="Salary" type="xsd:int"/>
                </xsd:sequence>
              </xsd:complexType>
            </xsd:element>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
    </xsd:schema>

Let's examine each line to understand the XSD Schema.

Schema Declaration

For an XSD Schema, the root element is <schema>. The XSD namespace declaration is provided on the <schema> element to tell the XML parser that the document is an XSD Schema.

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

The namespace URI identifies the W3C Recommendation version of XSD. The "xsd:" prefix is bound to that namespace to mark elements and types that belong to XSD itself. The prefix is only a convention, and any prefix can be used as long as it is bound to the same namespace URI.
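As a quick sketch of that last point, the declaration below uses "xs:" instead of "xsd:". The prefix was chosen arbitrarily for this example; what matters is the namespace URI it is bound to.

```xml
<!-- Same schema namespace, different prefix -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Name" type="xs:string"/>
</xs:schema>
```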

Element Declaration

"Element" is the important part of the schema because it specifies the kind of information. The following element declaration in our example is going to deal with Employee information.

<xsd:element name="Employee" minOccurs="0" maxOccurs="unbounded">

An "element" declaration in XSD can carry the following attributes. Attribute names are case-sensitive.

name: This attribute specifies the name of an element, "Employee" in our example.

type: This attribute refers to a Simple Type or Complex Type, both of which are explained later in this article.

minOccurs: This attribute specifies the minimum number of times the element must occur. The default is "1", which means the element is required; a value of "0" makes it optional.

Assume that the minOccurs attribute carries a value of "1". This would mean the "Employee" element must be specified at least once in the XML document.

maxOccurs: This attribute specifies the maximum number of times the element may occur in an XML document. The default is also "1". Assume that the maxOccurs attribute carries a value of "2". This would mean the "Employee" element must NOT be specified more than twice.

To summarize, let's say minOccurs is "1" and maxOccurs is "2" for the "Employee" element. This means there should be at least one instance of the "Employee" element in the XML document, but the total number of instances of the "Employee" element shouldn't exceed two. Both attributes are allowed only on elements declared inside a content model such as <xsd:sequence>, not on elements declared directly under <xsd:schema>.

If you pass three instances of the "Employee" element in the XML document, a validating XML parser will report an error.

To allow the "Employee" element to be specified an unlimited number of times in an XML document, specify the "unbounded" value in the maxOccurs attribute.

The following example states that the "Employee" element can occur an unlimited number of times in an XML document.

<xsd:element name="Employee" minOccurs="0" maxOccurs="unbounded">
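To make the bounded case concrete, here is a minimal sketch of a schema that allows one or two "Employee" elements. To keep it short, "Employee" is simplified to a plain string here, which is an assumption of this sketch and not the structure used in the case study.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="Employees">
    <xsd:complexType>
      <xsd:sequence>
        <!-- At least one and at most two Employee elements -->
        <xsd:element name="Employee" type="xsd:string"
                     minOccurs="1" maxOccurs="2"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

A matching document could look like the following; "JANE DOE" is a made-up second employee.

```xml
<Employees>
  <Employee>ED HARRIS</Employee>
  <Employee>JANE DOE</Employee>
  <!-- A third Employee element here would exceed maxOccurs="2" and fail validation -->
</Employees>
```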

Complex Type

An XSD Schema element can be of the following types:

  • Simple Type
  • Complex Type

In an XSD Schema, if an element contains one or more child elements, or if an element contains attributes, then the element type should be "Complex Type".

    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="SSN" type="xsd:string"/>
        <xsd:element name="Name" type="xsd:string"/>
        <xsd:element name="DateOfBirth" type="xsd:date"/>
        <xsd:element name="EmployeeType" type="xsd:string"/>
        <xsd:element name="Salary" type="xsd:int"/>
      </xsd:sequence>
    </xsd:complexType>

The Employee element has SSN, Name, Date of Birth, Salary and Employee type, which are specified as child elements. As a result Employee Element must be defined as a Complex Type because there are one or more elements under Employee element.

xsd:sequence

The <xsd:sequence> specifies the order in which the elements need to appear in the XML document. This means the elements SSN, Name, DateOfBirth, EmployeeType and Salary should appear in the same order in the XML document. If the order is changed, then the XML parser will throw an error.
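As an illustration, the fragment below contains every required child element, yet it would be rejected against the Employee definition because Name appears before SSN.

```xml
<!-- Rejected: the sequence requires SSN first, then Name -->
<Employee>
  <Name>ED HARRIS</Name>
  <SSN>737333333</SSN>
  <DateOfBirth>1960-01-01</DateOfBirth>
  <EmployeeType>FULLTIME</EmployeeType>
  <Salary>4000</Salary>
</Employee>
```

When the order of child elements genuinely does not matter, XSD also offers <xsd:all> as an alternative to <xsd:sequence>; this article sticks with <xsd:sequence>.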

Simple Type

In an XSD Schema, an element that contains only text, with no child elements or attributes, has a Simple Type. You define your own Simple Type when you want to create a user-defined type from a given base data type.

Before going further into Simple Types, I would like to mention that XSD provides a wide variety of base data types that can be used in a schema. A complete description of the data types is beyond the scope of this article.

I would suggest visiting the following Web sites to learn more about data types.

  • http://www.w3.org
  • http://msdn.microsoft.com (search for XSDs)

Some of the base data types, which we used in the "Employee" element examples, are:

  • xsd:string
  • xsd:int
  • xsd:date

Knowing that Simple Type Elements can specify User-defined data types in XML Schema, the real question is how do we know where to use a specific Simple Type?

Let's take a look at the schema again.

    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="SSN" type="xsd:string"/>
        <xsd:element name="Name" type="xsd:string"/>
        <xsd:element name="DateOfBirth" type="xsd:date"/>
        <xsd:element name="EmployeeType" type="xsd:string"/>
        <xsd:element name="Salary" type="xsd:int"/>
      </xsd:sequence>
    </xsd:complexType>

Assume the Payroll processing company wants the social security number of the employees formatted as "123-11-1233".

For this we will create a new data type called "ssnumber".

The following is the code to accomplish the above requirement.

    <xsd:simpleType name="ssnumber">
      <xsd:restriction base="xsd:string">
        <xsd:length value="11"/>
        <xsd:pattern value="\d{3}\-\d{2}\-\d{4}"/>
      </xsd:restriction>
    </xsd:simpleType>

To start with, we should provide the name of the Simple Type, which is "ssnumber".

The restriction base specifies the base data type from which we derive the user-defined data type, which is the "string" data type in the above example.

To restrict the social security number to 11 characters, we require the length value to be "11".

To ensure the social security number appears in the "123-11-1233" format, the pattern is specified in the following format.

<xsd:pattern value="\d{3}\-\d{2}\-\d{4}"/>

To explain the pattern,

\d matches a single digit, and the number in braces says how many times it must repeat. So \d{3} specifies that the value must start with three digits, followed by two digits after the first "-", and finally four digits after the second "-".
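A few sample values show how the "ssnumber" restrictions behave. An XSD pattern has to match the whole value, not just part of it. The values below are illustrative only.

```xml
<!-- Accepted: three digits, hyphen, two digits, hyphen, four digits -->
<SSN>123-11-1233</SSN>

<!-- Rejected: no hyphens, so neither the pattern nor the length of 11 is met -->
<SSN>737333333</SSN>

<!-- Rejected: 11 characters long, but letters are not matched by \d -->
<SSN>ABC-11-1233</SSN>
```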

Incorporating Simple Types into Schema

Now that we know what Simple Type means, let us learn how to effectively incorporate Simple Type into an XSD Schema.

First of all, Simple Types can be global or local.

Let's first look at global usage of Simple Type Element "ssnumber".

    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <xsd:element name="Employees">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="Employee" minOccurs="0" maxOccurs="unbounded">
              <xsd:complexType>
                <xsd:sequence>
                  <xsd:element name="SSN" type="ssnumber"/>
                  <xsd:element name="Name" type="xsd:string"/>
                  <xsd:element name="DateOfBirth" type="xsd:date"/>
                  <xsd:element name="EmployeeType" type="xsd:string"/>
                  <xsd:element name="Salary" type="xsd:int"/>
                </xsd:sequence>
              </xsd:complexType>
            </xsd:element>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
      <xsd:simpleType name="ssnumber">
        <xsd:restriction base="xsd:string">
          <xsd:length value="11"/>
          <xsd:pattern value="\d{3}\-\d{2}\-\d{4}"/>
        </xsd:restriction>
      </xsd:simpleType>
    </xsd:schema>

In the above example, the child element "SSN" of the parent element "Employee" is given the user-defined data type "ssnumber". Because "ssnumber" is our own type rather than one built into XSD, it is referenced without the "xsd:" prefix.

The Simple Type "ssnumber" is declared outside the Employee element. This means that if we define another element inside the <xsd:schema>, for example "Employer", that element can also make use of the "ssnumber" data type if needed.

Let's examine a different case where the Simple Type element "ssnumber" will be specific to Employee, and it is not going to be needed elsewhere at all. Then the following schema structure accomplishes this.

    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <xsd:element name="Employees">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="Employee" minOccurs="0" maxOccurs="unbounded">
              <xsd:complexType>
                <xsd:sequence>
                  <xsd:element name="SSN">
                    <xsd:simpleType>
                      <xsd:restriction base="xsd:string">
                        <xsd:length value="11"/>
                        <xsd:pattern value="\d{3}\-\d{2}\-\d{4}"/>
                      </xsd:restriction>
                    </xsd:simpleType>
                  </xsd:element>
                  <xsd:element name="Name" type="xsd:string"/>
                  <xsd:element name="DateOfBirth" type="xsd:date"/>
                  <xsd:element name="EmployeeType" type="xsd:string"/>
                  <xsd:element name="Salary" type="xsd:int"/>
                </xsd:sequence>
              </xsd:complexType>
            </xsd:element>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
    </xsd:schema>

In the above example, all restrictions such as length, pattern and base data type are declared inline. The unnamed <xsd:simpleType> is nested inside the SSN element declaration, which therefore carries no type attribute of its own.

Understanding Schema Flexibility

So far we have seen how to create a schema by:

a. Declaring and Using Complex Element Type
b. Declaring and Using Simple Element Type
c. Understanding Global scope and Local scope of a given element

I will now explain the flexibility XSD Schema can provide by extending our schema example.

Let's tighten the validation so that the EmployeeType element accepts only the values "fulltime" or "parttime".

To provide the validation, we should create a Simple Type called "emptype".

    <xsd:simpleType name="emptype">
      <xsd:restriction base="xsd:string">
        <xsd:enumeration value="fulltime"/>
        <xsd:enumeration value="parttime"/>
      </xsd:restriction>
    </xsd:simpleType>

The above definition creates a Simple Type named "emptype" with "string" as its base data type. Each allowed value is listed in an <xsd:enumeration> facet, so an element of this type must contain exactly one of the two listed values. Enumeration matching is case-sensitive: the value "FULLTIME" used in the sample document earlier would have to be written as "fulltime" to validate.
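The following sample values sketch how an element of type "emptype" would be checked. The value "contract" is a made-up example of a value outside the list.

```xml
<!-- Accepted: exactly one of the enumerated values -->
<EmployeeType>parttime</EmployeeType>

<!-- Rejected: differs in case from the enumerated value -->
<EmployeeType>FULLTIME</EmployeeType>

<!-- Rejected: not in the enumeration list -->
<EmployeeType>contract</EmployeeType>
```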

Let us incorporate the "emptype" Simple Type into our main Employee schema.

    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <xsd:element name="Employees">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="Employee" minOccurs="0" maxOccurs="unbounded">
              <xsd:complexType>
                <xsd:sequence>
                  <xsd:element name="SSN" type="ssnumber"/>
                  <xsd:element name="Name" type="xsd:string"/>
                  <xsd:element name="DateOfBirth" type="xsd:date"/>
                  <xsd:element name="EmployeeType" type="emptype"/>
                  <xsd:element name="Salary" type="xsd:int"/>
                </xsd:sequence>
              </xsd:complexType>
            </xsd:element>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
      <xsd:simpleType name="ssnumber">
        <xsd:restriction base="xsd:string">
          <xsd:length value="11"/>
          <xsd:pattern value="\d{3}\-\d{2}\-\d{4}"/>
        </xsd:restriction>
      </xsd:simpleType>
      <xsd:simpleType name="emptype">
        <xsd:restriction base="xsd:string">
          <xsd:enumeration value="fulltime"/>
          <xsd:enumeration value="parttime"/>
        </xsd:restriction>
      </xsd:simpleType>
    </xsd:schema>

In the above example, the child element "EmployeeType" of the parent element "Employee" is given the user-defined data type "emptype".

The Simple Type "emptype" is declared outside the Employee element, which makes it global to all elements that fall under <xsd:schema>.

So we have defined two Simple Types. One is "emptype", which restricts the value to one of two enumerated values, "fulltime" or "parttime". The other is "ssnumber", which restricts the length of the value to 11 and requires it to follow the "111-11-1111" pattern.

The terms "enumeration", "length" and "pattern" used above are referred to as Facets.

In XSD Schema, facets provide the flexibility to add restriction for a given base data type. In our examples above, the base data type specified is "string".

Similarly, facets are available for other data types like "int". A few facets available for the "int" data type are enumeration, minExclusive, minInclusive, and pattern.
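As a sketch of facets applied to "int", the type below limits a value to a range. The type name "salaryamount" and the bounds are assumptions made up for this example; the payroll company in the case study has not stated any such limits.

```xml
<!-- Hypothetical type: a whole number from 0 up to, but not including, 1000000 -->
<xsd:simpleType name="salaryamount">
  <xsd:restriction base="xsd:int">
    <xsd:minInclusive value="0"/>
    <xsd:maxExclusive value="1000000"/>
  </xsd:restriction>
</xsd:simpleType>
```

The Salary element could then be declared with type="salaryamount" in the same way that SSN uses "ssnumber".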

Let's consider a new requirement. The payroll company wants to process tax information, so they want the employee information which is listed above plus the employee's state and zip code information. All the information should be under the separate header "EmployeeTax".

The above requirement compels us to restructure the schema.

First we will break down the requirement to make it simple.

a. Payroll Company wants all the information listed under "Employee" element.
b. Payroll Company wants state and zip of the given Employee.
c. Payroll company wants (a) and (b) in a separate header "EmployeeTax"

Fortunately, XSD Schema supports object-oriented principles like an inheritance hierarchy. This principle comes in handy for our requirement.

The following structure can quite easily accomplish the above requirement.

    <xsd:complexType name="EmployeeTax">
      <xsd:complexContent>
        <xsd:extension base="Employee">
          <xsd:sequence>
            <xsd:element name="state" type="xsd:string"/>
            <xsd:element name="zipcode" type="xsd:string"/>
          </xsd:sequence>
        </xsd:extension>
      </xsd:complexContent>
    </xsd:complexType>

In the above code, we started with the name of the Complex Type, which is "EmployeeTax". Occurrence attributes such as minOccurs and maxOccurs do not belong on a type definition; they go on the element that uses the type.

The following line is very important. It tells the XML parser that EmployeeTax is derived from another Complex Type.

    <xsd:complexContent>

To refresh our memory, in Simple Type definitions we used a restriction base, which mapped to a base data type. In the same way, we need to specify an extension base for the complex content. The extension base must be a named Complex Type, so the content model of "Employee" has to be declared as a named type (<xsd:complexType name="Employee">) rather than anonymously inside an element. Because it is user-defined, it is referenced without the "xsd:" prefix.

    <xsd:extension base="Employee">

Once the extension base has been defined, EmployeeTax automatically inherits all the child elements of "Employee", and the elements declared inside the extension are appended after them.

As part of the business requirement, the state and zip code elements are specified which completes the total structure for EmployeeTax Element.
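Putting the pieces together, a complete schema for this requirement might look like the sketch below. It assumes the Employee content model is declared as a named Complex Type so that it can serve as an extension base. The "EmployeeTaxes" root element is a hypothetical name introduced only to give the repeating "EmployeeTax" element a parent.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Base type: the Employee content model as a named Complex Type -->
  <xsd:complexType name="Employee">
    <xsd:sequence>
      <xsd:element name="SSN" type="xsd:string"/>
      <xsd:element name="Name" type="xsd:string"/>
      <xsd:element name="DateOfBirth" type="xsd:date"/>
      <xsd:element name="EmployeeType" type="xsd:string"/>
      <xsd:element name="Salary" type="xsd:int"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- Derived type: everything in Employee, followed by state and zipcode -->
  <xsd:complexType name="EmployeeTax">
    <xsd:complexContent>
      <xsd:extension base="Employee">
        <xsd:sequence>
          <xsd:element name="state" type="xsd:string"/>
          <xsd:element name="zipcode" type="xsd:string"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>

  <!-- Hypothetical root element that holds any number of EmployeeTax entries -->
  <xsd:element name="EmployeeTaxes">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="EmployeeTax" type="EmployeeTax"
                     minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

In a matching document, each EmployeeTax element would contain SSN, Name, DateOfBirth, EmployeeType and Salary in that order, followed by state and zipcode.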

Referencing External Schema

This feature is very useful in situations where one schema has functionality that another schema wants to utilize.

Take a case where the payroll company for ABC Corp. needs some information about the Employers, such as EmployerID and PrimaryContact in a separate XML document.

Assume EmployerID format is the same as Employee SSN format. Since we have already defined the validation for Employee SSN, there exists a valid case for using the Employee schema.

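The first step in using a different schema is to "include" the schema.

To include a schema, the included schema must either declare the same target namespace as the including schema or declare no target namespace at all.

    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                xmlns:abc="http://www.abccorp.com"
                targetNamespace="http://www.abccorp.com">
      <xsd:include schemaLocation="employee.xsd"/>
      <xsd:element name="Employers">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="Employer" minOccurs="0" maxOccurs="unbounded">
              <xsd:complexType>
                <xsd:sequence>
                  <xsd:element name="EmployerID" type="abc:ssnumber"/>
                  <xsd:element name="PrimaryContact" type="xsd:string"/>
                </xsd:sequence>
              </xsd:complexType>
            </xsd:element>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
    </xsd:schema>

Please note the include statement, which references the schema location. The schemaLocation value is resolved relative to the including schema, so make sure the employee.xsd file can be found there. Because employee.xsd declares no target namespace, its definitions are treated as belonging to the target namespace of the including schema, which is why "ssnumber" is referenced through the "abc:" prefix bound to that namespace. As before, the repeating "Employer" element is wrapped in a root element, here called "Employers".

Once included, the "EmployerID" element uses the global "ssnumber" type in the same manner as if it had been declared within the document itself. An additional PrimaryContact element, which is specific to the "Employer" element, is then defined after the EmployerID element.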
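For reference, here is a sketch of a document that such an Employer schema would accept. It assumes the schema has the target namespace http://www.abccorp.com, declares a global "Employers" element with locally declared "Employer", "EmployerID" and "PrimaryContact" children, leaves elementFormDefault at its default, and is saved as employer.xsd. The file name and the sample values are hypothetical.

```xml
<?xml version="1.0"?>
<!-- employer.xsd is an assumed file name for the Employer schema -->
<abc:Employers xmlns:abc="http://www.abccorp.com"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xsi:schemaLocation="http://www.abccorp.com employer.xsd">
  <!-- Locally declared elements stay unqualified under the default elementFormDefault -->
  <Employer>
    <EmployerID>123-11-1233</EmployerID>
    <PrimaryContact>ED HARRIS</PrimaryContact>
  </Employer>
</abc:Employers>
```

Note that xsi:schemaLocation takes pairs of values: a namespace URI followed by the location of the schema for that namespace.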

Annotations

Comments have always been considered a good coding convention. XSD Schema provides a commenting feature through the <annotation> element. As with all XSD element names, the names are case-sensitive.

The <annotation> element has two child elements.

  • <documentation>
  • <appinfo>

The <documentation> element holds human-readable text that explains the purpose and functionality of the schema component it annotates.

The <appinfo> element holds information intended for the tools and applications that process the schema.

The following schema describes the usage of the Annotation element.

    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <xsd:element name="Employees">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="Employee" minOccurs="0" maxOccurs="unbounded">
              <xsd:complexType>
                <xsd:sequence>
                  <xsd:element name="SSN" type="ssnumber">
                    <xsd:annotation>
                      <xsd:documentation>The SSN number identifies each employee of ABC CORP</xsd:documentation>
                      <xsd:appinfo>Employee Payroll Info</xsd:appinfo>
                    </xsd:annotation>
                  </xsd:element>
                  <xsd:element name="Name" type="xsd:string"/>
                  <xsd:element name="DateOfBirth" type="xsd:date"/>
                  <xsd:element name="EmployeeType" type="emptype"/>
                  <xsd:element name="Salary" type="xsd:int"/>
                </xsd:sequence>
              </xsd:complexType>
            </xsd:element>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
      <xsd:simpleType name="ssnumber">
        <xsd:restriction base="xsd:string">
          <xsd:length value="11"/>
          <xsd:pattern value="\d{3}\-\d{2}\-\d{4}"/>
        </xsd:restriction>
      </xsd:simpleType>
      <xsd:simpleType name="emptype">
        <xsd:restriction base="xsd:string">
          <xsd:enumeration value="fulltime"/>
          <xsd:enumeration value="parttime"/>
        </xsd:restriction>
      </xsd:simpleType>
    </xsd:schema>

Conclusion

I have tried to cover a few interesting features of XSD Schema. There is a whole lot more information about XSD Schema at:

www.w3c.org

http://msdn.microsoft.com (search for XSDs)

Happy Reading.

About the Author

Ramesh Balaji develops business applications using ASP, VB and SQL Server. He's a frequent contributor to www.4guysfromrolla.com. When not programming, he enjoys spending time with his kids. He can be reached at iambramesh@yahoo.com.



Comments

  • Great work

    Posted by Manjula on 07/28/2014 12:38am

    Simply great work with short sentences

  • small correction

    Posted by engineer on 06/12/2014 08:13am

    This article helped but The default value of minoccurs is 1. http://www.w3schools.com/schema/schema_complex_indicators.asp

  • Great Knowledge base

    Posted by Shailaja on 03/24/2014 02:58am

    Very Clean & clear information. Thanks a lot.

  • java developer

    Posted by Dinesh on 02/17/2014 04:37am

    thanks, helped me to refresh my memory.

  • Great Contribution

    Posted by Zeeshan on 12/06/2013 12:59pm

    This is 1 of the best knowledge sharing example I have ever seen. Great work !

  • Senior Test Analyst

    Posted by Manna on 03/06/2013 09:56pm

    Thanks a Ton Ramesh, This description helps a lot in clearing most of the doubts in XSD. Keep up the good work.

  • awesome explanation

    Posted by sam on 12/08/2012 11:29am

    thanks Ramesh,this was indeed one of the most helpful articles on xsd i found so far...kudos to u for this excellent work..
