XML Schema
Introduction
An XML schema is a description of a type of XML document, typically expressed in
terms of constraints on the structure and content of documents of that type, above
and beyond the basic syntactical constraints imposed by XML itself.
These constraints are generally expressed using some combination of:
1. grammatical rules governing the order of elements,
2. Boolean predicates that the content must satisfy,
3. data types governing the content of elements and attributes, and
4. more specialized rules such as uniqueness and referential integrity constraints.
Internal DTD
A DTD (Document Type Definition) is one way to express a schema.
If the DTD is declared inside the XML file, it must be wrapped in a DOCTYPE
declaration with the following syntax:
<!DOCTYPE root-element [element-declarations]>
Example XML document with an internal DTD:
<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Tulsi</to>
<from>Giri</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend</body>
</note>
The DTD is interpreted like this:
!DOCTYPE note defines that the root element of this document is note.
!ELEMENT note defines that the note element must contain four elements, in this order:
"to, from, heading, body".
!ELEMENT to defines the to element to be of type "#PCDATA".
!ELEMENT from defines the from element to be of type "#PCDATA".
!ELEMENT heading defines the heading element to be of type "#PCDATA".
!ELEMENT body defines the body element to be of type "#PCDATA".
PCDATA means parsed character data: text that the XML parser examines for markup
and entity references.
External DTD
Private external DTDs are identified by the keyword SYSTEM and are intended
for use by a single author or group of authors. The syntax is:
<!DOCTYPE root-element SYSTEM "DTD location">
Example
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
<to>Tulsi</to>
<from>Giri</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
And this is the file "note.dtd" which contains the DTD:
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
Public External DTD
Public external DTDs are identified by the keyword PUBLIC and are intended for
broad use.
The syntax is:
<!DOCTYPE root-element PUBLIC "DTD_name" "DTD_location">
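As a concrete illustration, XHTML 1.0 Strict documents reference a public DTD published by the W3C. The first quoted string is the public identifier and the second is the location of the DTD. The document body here is only a minimal sketch written for this example:

```xml
<?xml version="1.0"?>
<!-- PUBLIC is followed by the public identifier, then the DTD location -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Public DTD example</title></head>
  <body><p>Validated against a public DTD.</p></body>
</html>
```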
Defining Elements
Elements are the main building blocks of XML documents.
In a DTD, elements are declared with an ELEMENT declaration with the following syntax.
<!ELEMENT element-name category>
Or
<!ELEMENT element-name (element-content)>
Empty elements are declared with the category keyword EMPTY. Its syntax is:
<!ELEMENT element-name EMPTY>
For example:
<!ELEMENT br EMPTY>
Elements with only parsed character data are declared with #PCDATA inside
parentheses. Its syntax is:
<!ELEMENT element-name (#PCDATA)>
For example:
<!ELEMENT from (#PCDATA)>
When children are declared in a sequence separated by commas, the children must appear in the same
sequence in the document.
Elements declared with the category keyword ANY can contain any combination of
parsable data. Its syntax is:
<!ELEMENT element-name ANY>
For example:
<!ELEMENT note ANY>
Elements with one or more children are declared with the names of the child elements
inside parentheses. Its syntax is:
<!ELEMENT element-name (child1, child2,…)>
For example:
<!ELEMENT note (to,from,body)>
To declare exactly one occurrence of a child element, write the child name with no suffix.
Its syntax is:
<!ELEMENT element-name (child-name)>
For example:
<!ELEMENT note (message)>
The child element "message" must occur once, and only once, inside the "note" element.
To declare a minimum of one occurrence of a child element, append + to the child name.
Its syntax is:
<!ELEMENT element-name (child-name+)>
For example:
<!ELEMENT note (message+)>
We can use * in place of + to declare zero or more occurrences of an element.
We can also declare either/or content.
For example:
<!ELEMENT note (to,from,header,(message|body))>
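The occurrence and either/or rules can be combined in one declaration. The sketch below is an illustrative variant of the note DTD (the content model and the sample values are assumptions made for this example, not part of the original notes): "to" must occur at least once, "header" may occur any number of times, and exactly one of "message" or "body" must close the note.

```xml
<?xml version="1.0"?>
<!DOCTYPE note [
<!-- to+ : one or more; header* : zero or more; (message|body) : exactly one of the two -->
<!ELEMENT note (to+,from,header*,(message|body))>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT header (#PCDATA)>
<!ELEMENT message (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Tulsi</to>
<to>Amit</to>
<from>Giri</from>
<body>Don't forget me this weekend</body>
</note>
```

This document is valid: it has two "to" elements, no "header", and a "body" in place of a "message". Adding both a "message" and a "body" would make it invalid.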
XML Schema
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3schools.com"
xmlns="http://www.w3schools.com"
elementFormDefault="qualified">
...
...
</xs:schema>
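An instance document can point a validator at such a schema. The sketch below assumes, purely for illustration, that the schema is saved as "note.xsd" and declares a "note" element with the four children used earlier; neither assumption comes from the original notes.

```xml
<?xml version="1.0"?>
<!-- The default namespace matches the schema's targetNamespace. -->
<!-- xsi:schemaLocation pairs that namespace with a (hypothetical) schema file. -->
<note xmlns="http://www.w3schools.com"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3schools.com note.xsd">
  <to>Tulsi</to>
  <from>Giri</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend</body>
</note>
```

Because the schema sets elementFormDefault="qualified", the child elements must also be in the target namespace, which the default namespace declaration takes care of here.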
Description
The syntax for defining a simple element is:
<xs:element name="xxx" type="yyy"/>
where xxx is the name of the element and yyy is the data type of the element.
XML Schema has a lot of built-in data types. Some are:
xs:string
xs:decimal
xs:integer
xs:boolean
xs:date
xs:time
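A few of these built-in types in use might look like the fragment below. The element names and values are hypothetical, and the declarations are assumed to sit inside an xs:schema element.

```xml
<!-- Declarations (placed inside xs:schema) -->
<xs:element name="lastname" type="xs:string"/>
<xs:element name="age" type="xs:integer"/>
<xs:element name="price" type="xs:decimal"/>
<xs:element name="member" type="xs:boolean"/>

<!-- Instance elements that conform to the declarations above -->
<lastname>Basan</lastname>
<age>36</age>
<price>19.99</price>
<member>true</member>
```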
Default and Fixed Values for Simple Elements:
Simple elements may have a default value OR a fixed value specified.
A default value is automatically assigned to the element when no other value is
specified.
Example:
<xs:element name="color" type="xs:string" default="red"/>
A fixed value is also automatically assigned to the element, and you cannot
specify another value.
Example:
<xs:element name="color" type="xs:string" fixed="red"/>
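To make the difference concrete, the fragment below sketches how instance elements would be treated under each of the two "color" declarations. The sample values are illustrative only.

```xml
<!-- With default="red": an empty element is treated as having the value "red" -->
<color/>
<!-- With default="red": an explicit value overrides the default -->
<color>blue</color>

<!-- With fixed="red": an empty element or the value "red" is valid -->
<color>red</color>
<!-- With fixed="red": any other value fails validation -->
<color>blue</color>
```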
XSD Attributes:
When an XML element or attribute has a data type defined, it puts restrictions on
the element's or attribute's content.
If an XML element is of type "xs:date" and contains a string like "Hello World",
the element will not validate.
With XML Schemas, you can also add your own restrictions to your XML
elements and attributes. These restrictions are called facets.
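As a small sketch of the data type restriction described above, consider a hypothetical "start" element declared as xs:date (the element name is an assumption for this example):

```xml
<!-- Declaration (placed inside xs:schema) -->
<xs:element name="start" type="xs:date"/>

<!-- Valid: xs:date values use the form YYYY-MM-DD -->
<start>2012-02-13</start>
<!-- Invalid: the content is not a date -->
<start>Hello World</start>
```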
XSD Restrictions/ Facets:
The following example defines an element called "age" with a restriction. The
value of age cannot be lower than 0 or greater than 120:
<xs:element name="age">
<xs:simpleType>
<xs:restriction base="xs:integer">
<xs:minInclusive value="0"/>
<xs:maxInclusive value="120"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
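Checked against the "age" declaration above, instance elements would behave roughly as follows. Both bounds are inclusive, so 0 and 120 themselves are accepted.

```xml
<!-- Valid: within the inclusive range 0 to 120 -->
<age>0</age>
<age>36</age>
<age>120</age>

<!-- Invalid: outside the range -->
<age>-1</age>
<age>121</age>
<!-- Invalid: not an xs:integer -->
<age>36.5</age>
```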
Restrictions on a Set of Values
To limit the content of an XML element to a set of acceptable values, we would use the
enumeration constraint.
The example below defines an element called "car" with a restriction.
The only acceptable values are: Audi, Golf, BMW:
<xs:element name="car">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:enumeration value="Audi"/>
<xs:enumeration value="Golf"/>
<xs:enumeration value="BMW"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
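The same restriction can also be written as a named simple type so that several elements can share it. In this sketch the type name "carType" is an assumption, and the unprefixed reference relies on the schema's default namespace being its target namespace, as in the schema skeleton shown earlier.

```xml
<!-- The element refers to a named type instead of defining one inline -->
<xs:element name="car" type="carType"/>

<!-- Reusable type: any element declared with type="carType" accepts only these values -->
<xs:simpleType name="carType">
  <xs:restriction base="xs:string">
    <xs:enumeration value="Audi"/>
    <xs:enumeration value="Golf"/>
    <xs:enumeration value="BMW"/>
  </xs:restriction>
</xs:simpleType>
```

Enumeration values are compared exactly, so <car>Audi</car> is valid while <car>audi</car> is not.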
Restrictions on a Series of Values
To limit the content of an XML element to a particular series of numbers or letters,
we would use the pattern constraint.
The example below defines an element called "letter" with a restriction. The only
acceptable value is ONE of the LOWERCASE letters from a to z:
<xs:element name="letter">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[a-z]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The next example defines an element called "initials" with a restriction. The only
acceptable value is THREE of the UPPERCASE letters from a to z:
<xs:element name="initials">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[A-Z][A-Z][A-Z]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The next example defines an element called "zipcode" with a restriction. The only
acceptable value is FIVE digits in a sequence, and each digit must be in a range
from 0 to 9:
<xs:element name="zipcode">
<xs:simpleType>
<xs:restriction base="xs:integer">
<xs:pattern value="[0-9][0-9][0-9][0-9][0-9]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
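Repeated character classes can be shortened with a quantifier in braces, which XML Schema regular expressions support. The sketch below restates the "initials" and "zipcode" patterns in that form and is intended to accept the same values.

```xml
<!-- [A-Z]{3} is equivalent to [A-Z][A-Z][A-Z] -->
<xs:element name="initials">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:pattern value="[A-Z]{3}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>

<!-- [0-9]{5} is equivalent to five consecutive [0-9] classes -->
<xs:element name="zipcode">
  <xs:simpleType>
    <xs:restriction base="xs:integer">
      <xs:pattern value="[0-9]{5}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
```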
Restrictions on Length
To limit the length of a value in an element, we would use the length, maxLength,
and minLength constraints.
This example defines an element called "password" with a restriction. The value
must be exactly eight characters:
<xs:element name="password">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:length value="8"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
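Restriction on Length
This example defines another element called "password" with a restriction. The value
must be at least five characters and at most eight characters: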
<xs:element name="password">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:minLength value="5"/>
<xs:maxLength value="8"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
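Against the minLength/maxLength declaration above, instance values would be treated as sketched here. The sample passwords are placeholders chosen only for their length.

```xml
<!-- Valid: 5 characters (the minimum) -->
<password>abcde</password>
<!-- Valid: 8 characters (the maximum) -->
<password>abcdefgh</password>

<!-- Invalid: 4 characters, shorter than minLength -->
<password>abcd</password>
<!-- Invalid: 9 characters, longer than maxLength -->
<password>abcdefghi</password>
```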
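Complex Element
A complex element contains other elements and/or attributes. The "person" element below
contains two child elements, "firstname" and "lastname", which must appear in that order: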
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
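An instance element that conforms to this "person" declaration could look like the following; the name is a sample value.

```xml
<!-- firstname must come before lastname because of xs:sequence -->
<person>
  <firstname>Amit</firstname>
  <lastname>Basan</lastname>
</person>
```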
XSD Text only Elements
A complex text-only element can contain text and attributes.
This type contains only simple content (text and attributes), so we add a simpleContent element
around the content.
When using simple content, you must define an extension OR a restriction within the simpleContent
element:
<xs:element name="somename">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="basetype">
....
....
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
Here is an example of an XML element, "shoesize", that contains text-only:
<shoesize country="france">35</shoesize>
The following example declares a complexType, "shoesize". The content is defined
as an integer value, and the "shoesize" element also contains an attribute named
"country":
<xs:element name="shoesize">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="xs:integer">
<xs:attribute name="country" type="xs:string" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
XSD Mixed Content (elements that contain other elements and text)
A mixed complex type element can contain attributes, elements, and text.
Consider an XML element, "ordernote", that contains both text and other elements:
<ordernote>
Dear Mr.<name>Amit Basan</name>.
Your gift order for the birthday with order id
<orderid>9999</orderid>
will be shipped on <shipdate>2012-02-13</shipdate>.
</ordernote>
The following schema declares the "ordernote" element:
<xs:element name="ordernote">
<xs:complexType mixed="true">
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="orderid" type="xs:positiveInteger"/>
<xs:element name="shipdate" type="xs:date"/>
</xs:sequence>
</xs:complexType>
</xs:element>
XSD Indicator
XSD indicators control how elements are to be used in documents.
There are seven indicators:
1. Order indicators:
all
choice
sequence
2. Occurrence indicators:
maxOccurs
minOccurs
3. Group indicators:
group name
attributeGroup name
All Indicator
The <all> indicator specifies that the child elements can appear in any order, and
that each child element must occur only once:
<xs:element name="person">
<xs:complexType>
<xs:all>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:all>
</xs:complexType>
</xs:element>
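Under this declaration both of the following instance elements would be valid, since the order of the children does not matter. The names are sample values.

```xml
<!-- Valid: firstname first -->
<person>
  <firstname>Amit</firstname>
  <lastname>Basan</lastname>
</person>

<!-- Also valid: lastname first -->
<person>
  <lastname>Basan</lastname>
  <firstname>Amit</firstname>
</person>
```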
Choice Indicator
The <choice> indicator specifies that either one child element or another can occur:
<xs:element name="person">
<xs:complexType>
<xs:choice>
<xs:element name="employee" type="employee"/>
<xs:element name="member" type="member"/>
</xs:choice>
</xs:complexType>
</xs:element>
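The types "employee" and "member" referenced above are not defined in this section. As a self-contained sketch, the variant below assumes that both children hold plain text (xs:string); that assumption is made only so the example is complete.

```xml
<!-- Simplified declaration: both alternatives are assumed to be xs:string -->
<xs:element name="person">
  <xs:complexType>
    <xs:choice>
      <xs:element name="employee" type="xs:string"/>
      <xs:element name="member" type="xs:string"/>
    </xs:choice>
  </xs:complexType>
</xs:element>

<!-- Valid: exactly one of the two alternatives is present -->
<person>
  <employee>Amit Basan</employee>
</person>

<!-- Invalid: both alternatives are present -->
<person>
  <employee>Amit Basan</employee>
  <member>Amit Basan</member>
</person>
```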
Sequence Indicator
The <sequence> indicator specifies that the child elements must appear in a
specific order:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
Occurrence indicators
Occurrence indicators are used to define how often an element can occur.
Note: For all "Order" and "Group" indicators (any, all, choice, sequence, group
name, and group reference) the default value for maxOccurs and minOccurs is 1.
maxOccurs Indicator
The maxOccurs indicator (an attribute of the element declaration) specifies the maximum
number of times an element can occur:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="full_name" type="xs:string"/>
<xs:element name="child_name" type="xs:string" maxOccurs="10"/>
</xs:sequence>
</xs:complexType>
</xs:element>
The example above indicates that the "child_name" element can occur a minimum of one
time (minOccurs defaults to 1) and a maximum of ten times in the "person" element.
minOccurs Indicator
The minOccurs indicator (also an attribute) specifies the minimum number of times an element can occur:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="full_name" type="xs:string"/>
<xs:element name="child_name" type="xs:string"
maxOccurs="10" minOccurs="0"/>
</xs:sequence>
</xs:complexType>
</xs:element>
The example above indicates that the "child_name" element can occur a minimum of zero times
and a maximum of ten times in the "person" element.
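When there should be no upper limit, maxOccurs accepts the special value "unbounded". The sketch below adapts the "person" example accordingly; the instance names are sample values.

```xml
<!-- child_name may be absent or repeat any number of times -->
<xs:element name="person">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="full_name" type="xs:string"/>
      <xs:element name="child_name" type="xs:string"
                  minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<!-- Valid instance: one full_name followed by two child_name elements -->
<person>
  <full_name>Amit Basan</full_name>
  <child_name>Tulsi</child_name>
  <child_name>Giri</child_name>
</person>
```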