CSC570 - Chapter 7 - XSD PDF


CSC570 – XML

PROGRAMMING

XML Schemas

• “Schemas” is a general term--DTDs are a form of XML schemas


• According to the dictionary, a schema is “a structured
framework or plan”
• When we say “XML Schemas,” we usually mean the W3C XML
Schema Language
• This is also known as “XML Schema Definition” language,
or XSD
• “XSD” is frequently used because it’s short
• DTDs, XML Schemas, and RELAX NG are all XML schema languages
Why XML Schemas?

• DTDs provide a very weak specification language


• You can’t put any restrictions on text content
• You have very little control over mixed content (text plus elements)
• You have little control over ordering of elements
• DTDs are written in a strange (non-XML) format
• You need separate parsers for DTDs and XML
• The XML Schema Definition language solves these problems
• XSD gives you much more control over structure and content
• XSD is written in XML
Why not XML schemas?

• DTDs have been around longer than XSD


• Therefore they are more widely used
• Also, more tools support them
• XSD is very verbose, even by XML standards
• More advanced XML Schema instructions can be non-intuitive and
confusing
XML Schemas Support Data Types

• One of the greatest strengths of XML Schemas is the support for data
types.
• It is easier to describe allowable document content
• It is easier to validate the correctness of data
• It is easier to define data facets (restrictions on data)
• It is easier to define data patterns (data formats)
• It is easier to convert data between different data types
Referring to a schema
• To refer to a DTD in an XML document, the reference goes before the root
element:
• <?xml version="1.0"?>
<!DOCTYPE rootElement SYSTEM "url">
<rootElement> ... </rootElement>
• To refer to an XML Schema in an XML document, the reference goes in the
root element:
• <?xml version="1.0"?>
<rootElement
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="url.xsd">
...
</rootElement>
• The xmlns:xsi declaration (the XML Schema Instance namespace) is required
• xsi:noNamespaceSchemaLocation is where your XML Schema definition can be found
Example
<?xml version="1.0" encoding="UTF-8"?>

<addresses
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="test.xsd">
<address>
<name>Joe Tester</name>
<street>Baker Street 5</street>
</address>
</addresses>
The XSD document
• The file extension is .xsd
• The root element is <schema>
• The XSD starts like this:
<?xml version="1.0"?>
<xs:schema
xmlns:xs="http://www.w3.org/2001/XMLSchema">
Example
<schema>

• The <schema> element may have attributes:


• xmlns:xs="http://www.w3.org/2001/XMLSchema"
• This is necessary to specify where all our XSD tags are
defined
• elementFormDefault="qualified"
• This means that locally declared elements must be namespace-qualified
in the XML instance document
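A complete schema file built from these pieces might look like the sketch below. It is a guess at the test.xsd file referenced by the addresses example earlier: the structure (one or more address elements, each with a name and a street) is an assumption, and it relies on xs:complexType, xs:sequence, and maxOccurs, which are introduced later in these slides.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical test.xsd for the <addresses> example; the structure is an assumption -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="addresses">
    <xs:complexType>
      <xs:sequence>
        <!-- one or more <address> children -->
        <xs:element name="address" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="name" type="xs:string"/>
              <xs:element name="street" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

elementFormDefault is left out here because the sketch has no target namespace, which matches the xsi:noNamespaceSchemaLocation reference used in the instance document.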
“Simple” and “complex” elements

• A “simple” element is one that contains text and


nothing else
• A simple element cannot have attributes
• A simple element cannot contain other elements
• Whether a simple element may be empty depends on its type
(an xs:string may be empty, an xs:integer may not)
• However, the text can be of many different types,
and may have various restrictions applied to it
“Simple” and “complex” elements

• If an element isn’t simple, it’s “complex”


• A complex element may have attributes
• A complex element may be empty, or it may
contain text, other elements, or both text and
other elements
Defining a simple element
• A simple element is defined as
<xs:element name="elementname" type="type"/>

where:
• name is the name of the element
• the most common values for type are
xs:boolean xs:integer
xs:date xs:string
xs:decimal xs:time
• Other attributes a simple element may have:
• default="default value" -- used if no other value is specified
• fixed="value" -- no other value may be specified
Simple element

• Here are some XML elements:


<lastname>Refsnes</lastname>
<age>36</age>
<dateborn>1970-03-27</dateborn>

• And here are the corresponding simple element definitions:


<xs:element name="lastname" type="xs:string"/>
<xs:element name="age" type="xs:integer"/>
<xs:element name="dateborn" type="xs:date"/>
Default and Fixed Values for Simple Elements
• In the following example the default value is "red":
<xs:element name="color" type="xs:string" default="red"/>

• In the following example the fixed value is "red":


<xs:element name="color" type="xs:string" fixed="red"/>
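To make the difference concrete, the fragments below sketch how a validator would treat a few instance elements under each of the two declarations above. They are illustrative fragments, not a single well-formed document.

```xml
<!-- With default="red" on the color declaration: -->
<color/>               <!-- valid; treated as "red" -->
<color>blue</color>    <!-- valid; an explicit value overrides the default -->

<!-- With fixed="red" on the color declaration: -->
<color/>               <!-- valid; treated as "red" -->
<color>red</color>     <!-- valid; matches the fixed value -->
<color>blue</color>    <!-- invalid; differs from the fixed value -->
```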
Exercise 1

Write the Schema for the following statement:

a) <name>Lily</name>

b) <price>19.90</price>
Defining an attribute
• If an element has attributes, it is considered to be of a complex type.
• Attributes themselves are always declared as simple types
• An attribute is defined as
<xs:attribute name="name" type="type" />
where:
• name and type are the same as for xs:element
• Other attributes an xs:attribute declaration may have:
• default="default value" -- used if no other value is specified
• fixed="value" -- no other value may be specified
• use="optional" -- the attribute is not required (default)
• use="required" -- the attribute must be present
• Remember that attributes are always simple types
Defining an attribute
• Here is an XML element with an attribute:
<lastname lang="EN">Smith</lastname>

• And here is the corresponding attribute definition:


<xs:attribute name="lang" type="xs:string"/>
Default and Fixed Values for Attributes
• In the following example the default value is "EN":
<xs:attribute name="lang" type="xs:string" default="EN"/>

• In the following example the fixed value is "EN":


<xs:attribute name="lang" type="xs:string" fixed="EN"/>
Optional and Required Attributes
• To specify that the attribute is required, use the "use" attribute:
<xs:attribute name="lang" type="xs:string" use="required"/>
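An xs:attribute declaration does not stand alone; it sits inside the complex type of the element that carries the attribute. As a sketch, the lastname element from the earlier slide could be declared as follows, using the simpleContent and extension constructs that these slides cover under "Text-Only Elements".

```xml
<xs:element name="lastname">
  <xs:complexType>
    <xs:simpleContent>
      <!-- the element's text is still a plain string -->
      <xs:extension base="xs:string">
        <!-- lang must be present on every <lastname> -->
        <xs:attribute name="lang" type="xs:string" use="required"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>
```

With this declaration, <lastname lang="EN">Smith</lastname> is valid, while <lastname>Smith</lastname> is rejected because the required attribute is missing.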
Exercise 2

Write the Schema for the following statement:

a) <name scientific_name="Lilium">Lily</name>

b) <price currency="RM">19.90</price>
Restrictions, or “facets”
• The general form for putting a restriction on a text value is:
• <xs:element name="name">
<xs:simpleType>
<xs:restriction base="type">
... the restrictions ...
</xs:restriction>
</xs:simpleType>
</xs:element>
Restrictions, or “facets”
• For example:

<xs:element name="shoe_size">
<xs:simpleType>
<xs:restriction base="xs:integer">
<xs:minInclusive value="3"/>
<xs:maxInclusive value="10"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
Restrictions on numbers

Restriction -- Details
• minInclusive -- number must be ≥ the given value
• minExclusive -- number must be > the given value
• maxInclusive -- number must be ≤ the given value
• maxExclusive -- number must be < the given value
• totalDigits -- number must have no more than value digits in total
• fractionDigits -- number must have no more than value digits after the
decimal point
Example Restriction on Number

• Restrict on the value.


• The following example defines an element called "age"
with a restriction. The value of age cannot be lower than
0 or greater than 120:
<xs:element name="age">
<xs:simpleType>
<xs:restriction base="xs:integer">
<xs:minInclusive value="0"/>
<xs:maxInclusive value="120"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
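The digit-counting facets combine with the range facets in the same way. The sketch below uses a hypothetical element name, amount, to show a non-negative decimal with at most six digits in total and at most two after the decimal point.

```xml
<!-- "amount" is a hypothetical element used only for illustration -->
<xs:element name="amount">
  <xs:simpleType>
    <xs:restriction base="xs:decimal">
      <xs:minInclusive value="0"/>
      <xs:totalDigits value="6"/>
      <xs:fractionDigits value="2"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
```

Under this definition 1234.56 is valid, while 19.905 (three fraction digits), 1234567 (seven digits), and -1 (below the minimum) are all rejected.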
Restrictions on strings
• length -- the string must contain exactly value characters
• minLength -- the string must contain at least value characters
• maxLength -- the string must contain no more than value characters
• pattern -- the value is a regular expression that the string must match
• whiteSpace -- not really a “restriction”--tells what to do with whitespace
• value="preserve" Keep all whitespace
• value="replace" Change all whitespace characters to
spaces
• value="collapse" Remove leading and trailing whitespace,
and replace all sequences of whitespace with
a single space
Example Restriction on String

• Restrict on the value using pattern.

<letter>a</letter>

<choice>x</choice>

<car>Audi</car>
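The slide shows only the instance elements, so the definitions below are assumptions about what restrictions were intended: a single lowercase letter, one of the letters x, y, or z, and one of a short list of car names (Golf and BMW are invented alternatives). XSD patterns must match the whole value, so no anchors such as ^ or $ are needed.

```xml
<!-- exactly one lowercase letter, e.g. <letter>a</letter> -->
<xs:element name="letter">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:pattern value="[a-z]"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>

<!-- exactly one of x, y, or z, e.g. <choice>x</choice> -->
<xs:element name="choice">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:pattern value="[xyz]"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>

<!-- one of a fixed list of names, e.g. <car>Audi</car> -->
<xs:element name="car">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:pattern value="Audi|Golf|BMW"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
```

A fixed list like the car names can also be written with the enumeration facet, one xs:enumeration element per allowed value.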
Example Restriction on String

• Restrict on the value using pattern.

<letter>aAbB</letter>

<gender>male</gender>

<password>an1si23A</password>
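As before, only the instance elements survive on this slide, so the patterns below are guesses that happen to accept the three sample values: repeated lowercase/uppercase letter pairs, a choice between two words (female is an assumed second option), and exactly eight letters or digits.

```xml
<!-- one or more lowercase/uppercase pairs, e.g. <letter>aAbB</letter> -->
<xs:element name="letter">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:pattern value="([a-z][A-Z])+"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>

<!-- one of two words, e.g. <gender>male</gender> -->
<xs:element name="gender">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:pattern value="male|female"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>

<!-- exactly eight letters or digits, e.g. <password>an1si23A</password> -->
<xs:element name="password">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:pattern value="[a-zA-Z0-9]{8}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
```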
Complex elements
• A complex element contains other elements and/or
attributes.
• There are 4 kinds of complex elements:
• empty elements
• elements that contain only other elements
• elements that contain only text
• elements that contain both other elements and
text
• Note: Each of these elements may contain
attributes as well!
Examples of Complex Elements
• A complex XML element, "product", which is empty:
<product pid="1345"/>

• A complex XML element, "employee", which contains only other elements:
<employee>
<firstname>John</firstname>
<lastname>Smith</lastname>
</employee>

• A complex XML element, "food", which contains only text:
<food type="dessert">Ice cream</food>

• A complex XML element, "description", which contains both elements and text:
<description>It happened on <date lang="norwegian">03.03.99</date> .... </description>
Complex elements
• A complex element is defined as
<xs:element name="elementname">
<xs:complexType>
... information about the complex type...
</xs:complexType>
</xs:element>
• <xs:sequence> indicates that elements must occur in
this order
Complex elements
• Given the following:
<employee>
<firstname>John</firstname>
<lastname>Smith</lastname>
</employee>
• The "employee" element can be declared directly by naming the
element, like this:
<xs:element name="employee">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
Complex elements - sequence
• We’ve already seen an example of a complex type whose elements must
occur in a specific order:

<xs:element name="employee">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
Complex elements - all
• We can also declare a complex type without specifying the order.
• xs:all allows elements to appear in any order
<xs:element name="employee">
<xs:complexType>
<xs:all>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:all>
</xs:complexType>
</xs:element>
• Despite the name, the members of an xs:all group can occur
once or not at all
• Use minOccurs="n" and maxOccurs="n" to specify how many times
an element may occur (default value is 1)
• In this context, n may only be 0 or 1
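As a sketch of an optional member, the employee declaration could be extended with a hypothetical nickname child that may be left out. The children can still appear in any order.

```xml
<xs:element name="employee">
  <xs:complexType>
    <xs:all>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname" type="xs:string"/>
      <!-- hypothetical optional child: may appear once or be omitted -->
      <xs:element name="nickname" type="xs:string" minOccurs="0"/>
    </xs:all>
  </xs:complexType>
</xs:element>
```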
Elements Only

• An "elements-only" complex type describes an element that contains only
other elements.
• An XML element, "person", that contains only other elements:
<person>
<firstname>John</firstname>
<lastname>Smith</lastname>
</person>
• You can define the "person" element in a schema, like this:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
Elements Only: Example
<student>
<firstname>Nur Naili</firstname>
<lastname>Ahmad</lastname>
<nickname>Naili</nickname>
<marks>95</marks>
</student>
Elements Only: Example
<xs:element name = 'student'>
<xs:complexType>
<xs:sequence>
<xs:element name = "firstname" type = "xs:string"/>
<xs:element name = "lastname" type = "xs:string"/>
<xs:element name = "nickname" type = "xs:string"/>
<xs:element name = "marks" type = "xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
Text-Only Elements

• A complex text-only element can contain text and attributes.


• This type contains only simple content (text and attributes), therefore we
add a simpleContent element around the content.
• When using simple content, you must define an extension OR a restriction
within the simpleContent element, like this:

<xs:element name="somename">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="basetype">
....
....
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
Text-Only Elements

• OR
<xs:element name="somename">
<xs:complexType>
<xs:simpleContent>
<xs:restriction base="basetype">
....
....
</xs:restriction>
</xs:simpleContent>
</xs:complexType>
</xs:element>

• Tip: Use the extension/restriction element to expand or to limit the


base simple type for the element.
Text-Only Elements

• Here is an example of an XML element, "shoesize", that contains text-only:

<shoesize country="france">35</shoesize>

• The following example declares a complexType, "shoesize". The content is


defined as an integer value, and the "shoesize" element also contains an
attribute named "country":

<xs:element name="shoesize">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="xs:integer">
<xs:attribute name="country" type="xs:string" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
Text-Only Elements
<marks grade = "A" >90</marks>

<xs:element name = "marks">


<xs:complexType>
<xs:simpleContent>
<xs:extension base = "xs:integer">
<xs:attribute name = "grade" type = "xs:string" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
Empty elements

• Empty elements are complex


• To define a type with no content, we must define a type that allows
elements in its content, but we do not actually declare any elements,
like this:
<xs:element name="product">
<xs:complexType>
<xs:complexContent>
<xs:restriction base="xs:anyType">
<xs:attribute name="prodid" type="xs:positiveInteger"/>
</xs:restriction>
</xs:complexContent>
</xs:complexType>
</xs:element>
Empty elements

• However, it is possible to declare the "product" element more


compactly, like this:
<xs:element name="product">
<xs:complexType>
<xs:attribute name="prodid" type="xs:positiveInteger"/>
</xs:complexType>
</xs:element>
Empty elements: Example
<student id = "101" />

<xs:element name = "student">


<xs:complexType>
<xs:attribute name = "id" type = "xs:positiveInteger"/>
</xs:complexType>
</xs:element>
Empty elements: Example

<xs:element name = "student">


<xs:complexType>
<xs:complexContent>
<xs:restriction base = "xs:anyType">
<xs:attribute name = "id" type = "xs:positiveInteger"/>
</xs:restriction>
</xs:complexContent>
</xs:complexType>
</xs:element>
Mixed elements
• A mixed complex type element can contain attributes,
elements, and text.
• We add mixed="true" to the xs:complexType element
• The text itself is not declared in the schema; it may appear anywhere
between the child elements and is not checked by the validator
Mixed elements
• An XML element, "letter", that contains both text and other elements:
<letter>
Dear Mr.<name>John Smith</name>.
Your order <orderid>1032</orderid>
will be shipped on <shipdate>2001-07-13</shipdate>.
</letter>
• The following schema declares the "letter" element:
<xs:element name="letter">
<xs:complexType mixed="true">
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="orderid" type="xs:positiveInteger"/>
<xs:element name="shipdate" type="xs:date"/>
</xs:sequence>
</xs:complexType>
</xs:element>
Mixed elements: Example
<student id = "101">
Dear <firstname>Nur Naili</firstname>
<lastname>Ahmad</lastname>
<nickname>Naili</nickname>
<marks>95</marks>
</student>
Mixed elements: Example

<xs:element name = 'student'>


<xs:complexType mixed = "true">
<xs:sequence>
<xs:element name = "firstname" type = "xs:string"/>
<xs:element name = "lastname" type = "xs:string"/>
<xs:element name = "nickname" type = "xs:string"/>
<xs:element name = "marks" type = "xs:string"/>
</xs:sequence>
<xs:attribute name = 'id' type = 'xs:positiveInteger'/>
</xs:complexType>
</xs:element>
Global and local element
• Elements declared at the “top level” of a <schema> are available for use
throughout the schema
• The order of global declarations in a <schema> does not determine the order
of elements in the XML data document
• Elements declared within a xs:complexType are local to that type
• Thus, in

<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="firstName" type="xs:string" />
<xs:element name="lastName" type="xs:string" />
</xs:sequence>
</xs:complexType>
</xs:element>

the elements firstName and lastName are only locally declared
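For contrast, a global declaration can be reused from inside a complex type with the ref attribute. The sketch below is a variation on the person example in which lastName has been moved to the top level; this rearrangement is an illustration, not part of the original slides.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- global: declared directly under xs:schema, so it can be referenced anywhere -->
  <xs:element name="lastName" type="xs:string"/>

  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <!-- local: exists only inside person -->
        <xs:element name="firstName" type="xs:string"/>
        <!-- reuse of the global declaration -->
        <xs:element ref="lastName"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

One side effect of making an element global is that it may also appear as the root element of an instance document.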


Group

• Element Group
<xs:group name="groupname">
...
</xs:group>

• An element group must be declared at the top level of the schema
so that other declarations can refer to it with ref.
Group element example
<xs:group name="persongroup">
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
<xs:element name="birthday" type="xs:date"/>
</xs:sequence>
</xs:group>
<xs:element name="person" type="personinfo"/>
<xs:complexType name="personinfo">
<xs:sequence>
<xs:group ref="persongroup"/>
<xs:element name="country" type="xs:string"/>
</xs:sequence>
</xs:complexType>
Group Attribute

• Attribute Group
<xs:attributeGroup name="groupname">
...
</xs:attributeGroup>

• As with an element group, an attribute group must be declared at the
top level of the schema so that other declarations can refer to it.
Group Attribute Example

e.g. <person firstname="" lastname="" birthday=""/>

<xs:attributeGroup name="personattrgroup">
<xs:attribute name="firstname" type="xs:string"/>
<xs:attribute name="lastname" type="xs:string"/>
<xs:attribute name="birthday" type="xs:date"/>
</xs:attributeGroup>
<xs:element name="person">
<xs:complexType>
<xs:attributeGroup ref="personattrgroup"/>
</xs:complexType>
</xs:element>
Predefined string types
• Recall that a simple element is defined as:
<xs:element name="name" type="type" />

• Here are a few of the possible string types:


• xs:string -- a string
• xs:normalizedString -- a string that doesn’t contain tabs,
newlines, or carriage returns
• xs:token -- a string that doesn’t contain any whitespace
other than single spaces

• Allowable restrictions on strings:


• enumeration, length, maxLength, minLength, pattern,
whiteSpace
Predefined date and time types

• xs:date -- A date in the format YYYY-MM-DD, for example, 2002-11-05
• xs:time -- A time in the format hh:mm:ss (hours, minutes, seconds)
• xs:dateTime -- Format is YYYY-MM-DDThh:mm:ss
• Allowable restrictions on dates and times:
• enumeration, minInclusive, minExclusive, maxInclusive,
maxExclusive, pattern, whiteSpace
Predefined numeric types
• Here are some of the predefined numeric types:
xs:decimal xs:positiveInteger
xs:byte xs:negativeInteger
xs:short xs:nonPositiveInteger
xs:int xs:nonNegativeInteger
xs:long

• Allowable restrictions on numeric types:


• enumeration, minInclusive, minExclusive, maxInclusive,
maxExclusive, fractionDigits, totalDigits, pattern, whiteSpace
Q&A

The End
