XML Schema

Introduction
• An XML schema is a description of a type of XML document, expressed in terms of
constraints on the structure and content of documents of that type, above and beyond
the basic syntactic constraints imposed by XML itself.
• These constraints are generally expressed using some combination of:
1. grammatical rules governing the order of elements,
2. Boolean predicates that the content must satisfy,
3. data types governing the content of elements and attributes, and
4. more specialized rules such as uniqueness and referential integrity constraints.
Schema
• Technically, a schema is an abstract collection of metadata consisting of a set of
schema components: chiefly element and attribute declarations and complex and
simple type definitions.
• XML schema languages:
  ◦ DTD
  ◦ XML Schema
Document Type Definitions
• A DTD is an approach for defining the structure of an XML document.
• It is an XML schema language whose purpose is to define the legal building blocks of
an XML document.
• A DTD defines the document structure with a list of legal elements and attributes.
• With a DTD, each of your XML files can carry a description of its own format.
• With a DTD, independent groups of people can agree to use a standard DTD for
interchanging data.
• Your application can use a standard DTD to verify that the data you receive from
the outside world is valid.
Document Type Declarations
• A Document Type Declaration associates a DTD with an XML document.
• Document Type Declarations appear in the syntactic fragment doctypedecl near
the start of an XML document.
• The declaration establishes that the document is an instance of the type defined by
the referenced DTD.
• A Document Type Declaration can supply the DTD in two parts:
  ◦ an optional external subset
  ◦ an optional internal subset
• If the DTD is declared inside the XML file, it must be wrapped in a DOCTYPE
declaration with the following syntax:
<!DOCTYPE root-element [element-declarations]>
Example XML document with an internal DTD:
<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Tulsi</to>
<from>Giri</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend</body>
</note> 
The DTD is interpreted like this:
• !DOCTYPE note defines that the root element of this document is note.
• !ELEMENT note defines that the note element contains four elements, in this order:
"to,from,heading,body".
• !ELEMENT to defines the to element to be of type "#PCDATA".
• !ELEMENT from defines the from element to be of type "#PCDATA".
• !ELEMENT heading defines the heading element to be of type "#PCDATA".
• !ELEMENT body defines the body element to be of type "#PCDATA".
• PCDATA stands for parsed character data: text that will be parsed by the XML
parser.
External DTD
• If the DTD is declared in an external file, the XML document refers to it from its
DOCTYPE declaration.
• The DTD is kept in a separate file, and a reference to its location is placed in the
document.
• External DTDs are easy to apply to multiple documents.
• External DTDs are of two types: private and public.

Private external DTD
• Private external DTDs are identified by the keyword SYSTEM and are intended
for use by a single author or group of authors.
• The syntax is:
<!DOCTYPE root-element SYSTEM "DTD location">
Example:
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
<to>Tulsi</to>
<from>Giri</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

And this is the file "note.dtd" which contains the DTD:
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)> 
Public External DTD
• Public external DTDs are identified by the keyword PUBLIC and are intended for
broad use.
• The syntax is:
<!DOCTYPE root_element PUBLIC "DTD_name" "DTD_location">
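As an illustration of the PUBLIC form, the declaration used by XHTML 1.0 Strict documents names the DTD with a public identifier and then gives its location. It is a well-known real-world declaration, quoted here only as an example of the syntax:

```xml
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
```

Here html is the root element, the first quoted string is the DTD name, and the second is the DTD location.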
Defining Elements
• Elements are the main building blocks of XML documents.
• In a DTD, elements are declared with an ELEMENT declaration in one of the following forms:
<!ELEMENT element-name category>
or
<!ELEMENT element-name (element-content)>
• Empty elements are declared with the category keyword EMPTY. The syntax is:
<!ELEMENT element-name EMPTY>
For example, <!ELEMENT br EMPTY>
• Elements with only parsed character data are declared with #PCDATA inside
parentheses. The syntax is:
<!ELEMENT element-name (#PCDATA)>
For example, <!ELEMENT from (#PCDATA)>
• Elements declared with the category keyword ANY can contain any combination of
parsable data. The syntax is:
<!ELEMENT element-name ANY>
For example, <!ELEMENT note ANY>
• Elements with one or more children are declared with the names of the child elements
inside parentheses. The syntax is:
<!ELEMENT element-name (child1,child2,…)>
For example, <!ELEMENT note (to,from,body)>
• When children are declared in a sequence separated by commas, the children must appear
in the same sequence in the document.
• To declare exactly one occurrence of a child element, the syntax is:
<!ELEMENT element-name (child-name)>
For example, <!ELEMENT note (message)>
The child element "message" must occur once, and only once, inside the "note" element.
• To declare a minimum of one occurrence of a child element, the syntax is:
<!ELEMENT element-name (child-name+)>
For example, <!ELEMENT note (message+)>
• Use * in place of + to declare zero or more occurrences of a child element.
• Either/or content can also be declared. For example:
<!ELEMENT note (to,from,header,(message|body))>
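The occurrence and either/or rules above can be combined in a single declaration. The following sketch is an illustration rather than part of the original note example: the element names follow the earlier examples, the second recipient is made up, and the ? operator (zero or one occurrence), which the list above does not cover, is standard DTD syntax.

```xml
<?xml version="1.0"?>
<!DOCTYPE note [
<!-- one or more "to", exactly one "from", an optional "header",
     then either a "message" or a "body" -->
<!ELEMENT note (to+,from,header?,(message|body))>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT header (#PCDATA)>
<!ELEMENT message (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Tulsi</to>
<to>Amit</to>
<from>Giri</from>
<body>Don't forget me this weekend!</body>
</note>
```

A validating parser would be expected to reject this document if "from" appeared before "to", or if both "message" and "body" were present.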
XML Schema
• XML Schema is an XML schema language that is an alternative to DTD.
• Unlike DTDs, XML Schemas have support for data types and namespaces.
• The XML Schema language is also referred to as XML Schema Definition (XSD).
• An XML Schema:
  ◦ defines elements that can appear in a document
  ◦ defines attributes that can appear in a document
  ◦ defines which elements are child elements
  ◦ defines the order of child elements
  ◦ defines the number of child elements
  ◦ defines whether an element is empty or can include text
  ◦ defines data types for elements and attributes
  ◦ defines default and fixed values for elements and attributes
Why is XML Schema better than DTD?
• XML Schemas are extensible to future additions
• XML Schemas are richer and more powerful than DTDs
• XML Schemas are written in XML
• XML Schemas support data types
• XML Schemas support namespaces
The <schema> Element:
• The <schema> element is the root element of every XML Schema.
• Syntax:
<?xml version="1.0"?>
<xs:schema>
…..
</xs:schema>
• The <schema> element may contain some attributes.
Example:

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3schools.com"
xmlns="http://www.w3schools.com"
elementFormDefault="qualified">
...
...
</xs:schema>
Description
• The code fragment xmlns:xs="http://www.w3.org/2001/XMLSchema" indicates
that the elements and data types used in the schema come from the
"http://www.w3.org/2001/XMLSchema" namespace.
• It also specifies that the elements and data types that come from the
"http://www.w3.org/2001/XMLSchema" namespace should be prefixed with xs:.
• The code fragment targetNamespace="http://www.w3schools.com" indicates that the
elements defined by this schema (note, to, from, heading, body) come from the
"http://www.w3schools.com" namespace.
• The code fragment xmlns="http://www.w3schools.com" indicates that the default
namespace is "http://www.w3schools.com".
• The code fragment elementFormDefault="qualified" indicates that any elements
used by the XML instance document which were declared in this schema must be
namespace qualified.
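To make the effect of elementFormDefault="qualified" concrete, here is a sketch of an instance document that uses an explicit prefix instead of a default namespace. The prefix n is an arbitrary choice made for this illustration; any prefix bound to the target namespace would do.

```xml
<?xml version="1.0"?>
<n:note xmlns:n="http://www.w3schools.com">
  <n:to>Tulsi</n:to>
  <n:from>Giri</n:from>
  <n:heading>Reminder</n:heading>
  <n:body>Don't forget me this weekend!</n:body>
</n:note>
```

Every element carries the prefix because, with qualified element forms, the child elements belong to the target namespace just as the root element does. An unprefixed <to> here would be in no namespace and would not match the schema.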
Referencing a Schema in an XML Document:
• For example, consider the following "note.xml" file, which has a reference to the
"note.xsd" schema:
<?xml version="1.0"?>
<note
xmlns="http://www.w3schools.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3schools.com note.xsd">
<to>Tulsi</to>
<from>Giri</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
• And this is the file "note.xsd" which contains the schema:
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3schools.com"
xmlns="http://www.w3schools.com"
elementFormDefault="qualified">
<xs:element name="note">
 <xs:complexType>
  <xs:sequence>
   <xs:element name="to" type="xs:string"/>
   <xs:element name="from" type="xs:string"/>
   <xs:element name="heading" type="xs:string"/>
   <xs:element name="body" type="xs:string"/>
  </xs:sequence>
 </xs:complexType>
</xs:element>
</xs:schema>
XSD Simple Type
• Simple types cover simple elements and attributes.
XSD Simple Elements:
• A simple element is an XML element that can contain only text.
• It cannot contain any other elements or attributes.
• The text can be of many different types.
• It can be one of the types included in the XML Schema definition (boolean,
string, date, etc.), or it can be a custom type that you define yourself.
• You can also add restrictions (facets) to a data type in order to limit its content, or
you can require the data to match a specific pattern.
• The syntax for defining a simple element is:
<xs:element name="xxx" type="yyy"/>
where xxx is the name of the element and yyy is the data type of the element.
• XML Schema has many built-in data types. Some are:
  ◦ xs:string
  ◦ xs:decimal
  ◦ xs:integer
  ◦ xs:boolean
  ◦ xs:date
  ◦ xs:time
Default and Fixed Values for Simple Elements:
• Simple elements may have a default value OR a fixed value specified.
• A default value is automatically assigned to the element when no other value is
specified. In the following example the default value is "red":
<xs:element name="color" type="xs:string" default="red"/>
• A fixed value is also automatically assigned to the element, and you cannot
specify another value. In the following example the fixed value is "red":
<xs:element name="color" type="xs:string" fixed="red"/>
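The following instance fragments sketch how a validator is expected to treat the two declarations above. For elements, the default or fixed value comes into play when the element is present but empty. The comments describe the expected outcome, not the output of a particular tool.

```xml
<!-- With default="red" -->
<color></color>      <!-- empty, so the value is taken to be "red" -->
<color>blue</color>  <!-- an explicit value overrides the default -->

<!-- With fixed="red" -->
<color></color>      <!-- empty, so the value is taken to be "red" -->
<color>red</color>   <!-- allowed: matches the fixed value -->
<color>blue</color>  <!-- not valid: differs from the fixed value -->
```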
XSD Attributes:
• Attributes are associated with complex elements: if an element has attributes, it is
considered to be of a complex type.
• Simple elements cannot have attributes.
• The attribute itself, however, is always declared as a simple type.
• The syntax for defining an attribute is:
<xs:attribute name="xxx" type="yyy"/>
where xxx is the name of the attribute and yyy specifies the data type of the attribute.
Default and Fixed Values for Attributes:
• Attributes may have a default value OR a fixed value specified.
• A default value is automatically assigned to the attribute when no other value is
specified.
• In the following example the default value is "EN":
<xs:attribute name="lang" type="xs:string" default="EN"/>
• A fixed value is also automatically assigned to the attribute, and you cannot
specify another value.
• In the following example the fixed value is "EN":
<xs:attribute name="lang" type="xs:string" fixed="EN"/>
Optional and Required Attributes:
• Attributes are optional by default.
• To specify that the attribute is required, use the "use" attribute:
<xs:attribute name="lang" type="xs:string" use="required"/>
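An attribute declaration is placed inside the complex type of the element that carries it, after the content model. The sketch below reuses the lang attribute; the element name "greeting" and its child "text" are made up for this illustration.

```xml
<xs:element name="greeting">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="text" type="xs:string"/>
    </xs:sequence>
    <!-- attribute declarations follow the sequence -->
    <xs:attribute name="lang" type="xs:string" use="required"/>
  </xs:complexType>
</xs:element>

<!-- valid instance -->
<greeting lang="EN"><text>Hello</text></greeting>
<!-- not valid: the required lang attribute is missing -->
<greeting><text>Hello</text></greeting>
```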
Restrictions on Content:
• When an XML element or attribute has a data type defined, it puts restrictions on
the element's or attribute's content.
• If an XML element is of type "xs:date" and contains a string like "Hello World",
the element will not validate.
• With XML Schemas, you can also add your own restrictions to your XML
elements and attributes. These restrictions are called facets.
XSD Restrictions/Facets:
• The following example defines an element called "age" with a restriction. The
value of age cannot be lower than 0 or greater than 120:
<xs:element name="age">
<xs:simpleType>
<xs:restriction base="xs:integer">
<xs:minInclusive value="0"/>
<xs:maxInclusive value="120"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
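A restriction can also be given a name so that several elements share it. This sketch assumes the type name "ageType" and a second element "retirementAge", neither of which appears in the original example. It also assumes that the schema either has no target namespace or uses its target namespace as the default namespace, as in the earlier <schema> example, so that the unprefixed type reference resolves. Named types are declared directly under the <schema> element.

```xml
<xs:simpleType name="ageType">
  <xs:restriction base="xs:integer">
    <xs:minInclusive value="0"/>
    <xs:maxInclusive value="120"/>
  </xs:restriction>
</xs:simpleType>

<!-- both elements reuse the same 0..120 restriction -->
<xs:element name="age" type="ageType"/>
<xs:element name="retirementAge" type="ageType"/>
```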
Restrictions on a Set of Values
• To limit the content of an XML element to a set of acceptable values, we would use the
enumeration constraint.
• The example below defines an element called "car" with a restriction.
• The only acceptable values are: Audi, Golf, BMW:
<xs:element name="car">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:enumeration value="Audi"/>
<xs:enumeration value="Golf"/>
<xs:enumeration value="BMW"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
Restrictions on a Series of Values
• To limit the content of an XML element to a defined series of numbers or letters,
we would use the pattern constraint.
• The example below defines an element called "letter" with a restriction. The only
acceptable value is ONE of the LOWERCASE letters from a to z:
<xs:element name="letter">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[a-z]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
• The next example defines an element called "initials" with a restriction. The only
acceptable value is THREE of the UPPERCASE letters from A to Z:
<xs:element name="initials">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[A-Z][A-Z][A-Z]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
• The next example defines an element called "zipcode" with a restriction. The only
acceptable value is FIVE digits in a sequence, and each digit must be in a range
from 0 to 9:
<xs:element name="zipcode">
<xs:simpleType>
<xs:restriction base="xs:integer">
<xs:pattern value="[0-9][0-9][0-9][0-9][0-9]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
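Repeated character classes can be written more compactly with a quantifier. The sketch below is an alternative formulation, not the original example: it uses {5} to require exactly five digits and takes xs:string as the base type, on the assumption that a zip code is treated as an identifier rather than a number.

```xml
<xs:element name="zipcode">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <!-- exactly five characters, each a digit from 0 to 9 -->
      <xs:pattern value="[0-9]{5}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
```

Because a pattern must match the whole value, 44600 would be accepted under this declaration, while 4460 and 44600-1 would be rejected.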
Restrictions on Whitespace Characters
• To specify how whitespace characters should be handled, we would use the
whiteSpace constraint.
• This example defines an element called "address" with a restriction. The whiteSpace
constraint is set to "preserve", which means that the XML processor WILL NOT
remove any white space characters:
<xs:element name="address">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:whiteSpace value="preserve"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
• This example also defines an element called "address" with a restriction.
• The whiteSpace constraint is set to "replace", which means that the XML processor
WILL REPLACE all white space characters (line feeds, tabs, spaces, and carriage
returns) with spaces:
<xs:element name="address">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:whiteSpace value="replace"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
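Besides "preserve" and "replace", the whiteSpace constraint accepts a third value, "collapse". A sketch in the same style as the examples above:

```xml
<xs:element name="address">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:whiteSpace value="collapse"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
```

With "collapse", the processor replaces white space characters as it does for "replace", then removes leading and trailing spaces and reduces each run of spaces to a single space.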
Restrictions on Length:
• To limit the length of a value in an element, we would use the length, maxLength,
and minLength constraints.
• This example defines an element called "password" with a restriction. The value
must be exactly eight characters:
<xs:element name="password">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:length value="8"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
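• This example defines another element called "password" with a restriction. The value
must be a minimum of five characters and a maximum of eight characters:
<xs:element name="password">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:minLength value="5"/>
<xs:maxLength value="8"/>
</xs:restriction>
</xs:simpleType>
</xs:element>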
Complex Element
• A complex element is an XML element that contains other elements and/or
attributes.
• There are four kinds of complex elements:
  ◦ empty elements
  ◦ elements that contain only other elements
  ◦ elements that contain only text
  ◦ elements that contain both other elements and text
Types of XSD Elements
1. XSD Empty Elements
2. XSD Elements only
3. XSD Text only Elements
4. XSD Mixed Content (elements that contain other elements and text)
XSD Empty Elements
• An empty complex element cannot have contents, only attributes.
• Consider an empty XML element:
<product prodid="1345" />
• The "product" element above has no content at all. To define a type with no
content, we must define a type that allows elements in its content, but we do not
actually declare any elements:
<xs:element name="product">
<xs:complexType>
 <xs:complexContent>
  <xs:restriction base="xs:anyType">
   <xs:attribute name="prodid" type="xs:positiveInteger"/>
  </xs:restriction>
 </xs:complexContent>
</xs:complexType>
</xs:element>
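The same empty element can be declared more compactly by leaving out the complexContent and restriction wrappers and declaring only the attribute. This is a sketch of that shorter form:

```xml
<xs:element name="product">
  <xs:complexType>
    <!-- no child elements are declared, so the element must stay empty -->
    <xs:attribute name="prodid" type="xs:positiveInteger"/>
  </xs:complexType>
</xs:element>
```

Because no child elements are declared and the type is not mixed, <product prodid="1345"/> is valid, while <product prodid="1345">text</product> is not.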
XSD Elements only
• An "elements-only" complex type describes an element that contains only other
elements.
• Consider an XML element "person" that contains only other elements:
<person>
<firstname>Amit</firstname>
<lastname>Shrestha</lastname>
</person>
• The "person" element can be defined in a schema like this:
<xs:element name="person">
<xs:complexType>
 <xs:sequence>
 <xs:element name="firstname" type="xs:string"/>
 <xs:element name="lastname" type="xs:string"/>
 </xs:sequence>
</xs:complexType>
</xs:element>
XSD Text only Elements
• A complex text-only element can contain text and attributes.
• This type contains only simple content (text and attributes), therefore we add a
simpleContent element around the content.
• When using simple content, you must define an extension OR a restriction within the
simpleContent element:
<xs:element name="somename">
<xs:complexType>
 <xs:simpleContent>
 <xs:extension base="basetype">
 ....
 ....
 </xs:extension>
 </xs:simpleContent>
</xs:complexType>
</xs:element>
• Here is an example of an XML element, "shoesize", that contains text only:
<shoesize country="france">35</shoesize>
• The following example declares a complexType for "shoesize". The content is defined
as an integer value, and the "shoesize" element also contains an attribute named
"country":
<xs:element name="shoesize">
<xs:complexType>
 <xs:simpleContent>
 <xs:extension base="xs:integer">
 <xs:attribute name="country" type="xs:string" />
 </xs:extension>
 </xs:simpleContent>
</xs:complexType>
</xs:element>
XSD Mixed Content (elements that contain other elements and text)
• A mixed complex type element can contain attributes, elements, and text.
• Consider an XML element, "ordernote", that contains both text and other elements:
<ordernote>
Dear Mr.<name>Amit Basan</name>.
Your gift order for the birthday with order id
<orderid>9999</orderid>
will be shipped on <shipdate>2012-02-13</shipdate>.
</ordernote>
• The following schema declares the "ordernote" element:
<xs:element name="ordernote">
<xs:complexType mixed="true">
 <xs:sequence>
 <xs:element name="name" type="xs:string"/>
 <xs:element name="orderid" type="xs:positiveInteger"/>
 <xs:element name="shipdate" type="xs:date"/>
 </xs:sequence>
</xs:complexType>
</xs:element> 
XSD Indicators
• XSD indicators control how elements are to be used in documents.
• There are seven indicators. The five covered here are listed below; the remaining
two are the group indicators (group and attributeGroup).
1. Order indicators:
  ◦ all
  ◦ choice
  ◦ sequence
2. Occurrence indicators:
  ◦ maxOccurs
  ◦ minOccurs
All Indicator
• The <all> indicator specifies that the child elements can appear in any order, and
that each child element must occur only once:
<xs:element name="person">
<xs:complexType>
<xs:all>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:all>
</xs:complexType>
</xs:element>
Choice Indicator
• The <choice> indicator specifies that either one child element or another can occur:
<xs:element name="person">
<xs:complexType>
<xs:choice>
<xs:element name="employee" type="employee"/>
<xs:element name="member" type="member"/>
</xs:choice>
</xs:complexType>
</xs:element>
Sequence Indicator
• The <sequence> indicator specifies that the child elements must appear in a
specific order:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
Occurrence indicators
• Occurrence indicators are used to define how often an element can occur.
• Note: For all "Order" and "Group" indicators (any, all, choice, sequence, group
name, and group reference) the default value for maxOccurs and minOccurs is 1.
maxOccurs Indicator
• The maxOccurs indicator, written as an attribute, specifies the maximum number of
times an element can occur:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="full_name" type="xs:string"/>
<xs:element name="child_name" type="xs:string" maxOccurs="10"/>
</xs:sequence>
</xs:complexType>
</xs:element>
• The example above indicates that the "child_name" element can occur a minimum of one
time and a maximum of ten times in the "person" element.
minOccurs Indicator
• The minOccurs indicator, written as an attribute, specifies the minimum number of
times an element can occur:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="full_name" type="xs:string"/>
<xs:element name="child_name" type="xs:string"
maxOccurs="10" minOccurs="0"/>
</xs:sequence>
</xs:complexType>
</xs:element>
• The example above indicates that the "child_name" element can occur a minimum of zero
times and a maximum of ten times in the "person" element.
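To allow an element to occur any number of times, maxOccurs can be set to "unbounded". A sketch based on the example above:

```xml
<xs:element name="person">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="full_name" type="xs:string"/>
      <!-- zero or more child_name elements, with no upper limit -->
      <xs:element name="child_name" type="xs:string"
                  minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

Under this declaration, a "person" with no "child_name" elements and a "person" with twenty of them are both valid.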
