Unit IV: PHP and XML


UNIT IV : PHP and XML.

An introduction to PHP, Using PHP, Variables, Program control, Built-in functions,
Connecting to Database, Using Cookies, Regular Expressions - Basic XML,
Document Type Definition, XML Schema, DOM and Presenting XML,
XML Parsers and Validation, XSL and XSLT Transformation,
News Feed (RSS and ATOM) (8 Hrs)
• PHP is a recursive acronym for "PHP: Hypertext Preprocessor"
• PHP is a widely used, open source scripting language
• PHP scripts are executed on the server
• PHP is a powerful tool for making dynamic and interactive Web pages
• PHP is a free and efficient alternative to competitors such as Microsoft's ASP
PHP File
• PHP files can contain text, HTML, CSS, JavaScript,
and PHP code
• PHP files have the extension ".php"
What PHP Can Do
• PHP can generate dynamic page content
• PHP can create, open, read, write, delete, and
close files on the server
• PHP can collect form data
• PHP can send and receive cookies
• PHP can add, delete, and modify data in your database
• PHP can be used to control user access
• A PHP script is executed on the server, and the plain HTML result is sent back to the browser.
Basic PHP Syntax
• A PHP script can be placed anywhere in the document.
• A PHP script starts with <?php and ends with ?>:
<?php
// PHP code goes here
?>
• PHP statements end with a semicolon (;).
<!DOCTYPE html>
<html>
<body>

<h1>My first PHP page</h1>

<?php
echo "Hello World!";
?>

</body>
</html>
• Comments in PHP
<?php
// This is a single-line comment

# This is also a single-line comment

/*
This is a multi-line comment block
that spans
multiple lines
*/
?>
• PHP Variables
• Variables are "containers" for storing information.
• In PHP, a variable starts with the $ sign, followed
by the name of the variable.
• Example
<?php
$txt = "Hello world!";
$x = 5;
$y = 10.5;
?>
Unlike many other programming languages, PHP has no
command for declaring a variable. A variable is created the
moment you first assign a value to it.
• Rules for PHP variables:
• A variable starts with the $ sign, followed by the name of the variable
• A variable name must start with a letter or the underscore character
• A variable name cannot start with a number
• A variable name can only contain alphanumeric characters and underscores (A-Z, a-z, 0-9, and _)
• Variable names are case-sensitive ($age and $AGE are two different variables)
• Output Variables
• echo statement is often used to output data
to the screen.
<?php
$txt = "W3Schools.com";
echo "I love $txt!";
?>
The following example produces the same
output as the example above:
<?php
$txt = "W3Schools.com";
echo "I love " . $txt . "!";
?>
• The following example outputs the sum of two variables:
<?php
$x = 5;
$y = 4;
echo $x + $y;
?>
• PHP is a Loosely Typed Language
• In the example above, notice that we did not have to
tell PHP which data type the variable is.
• PHP automatically converts the variable to the correct
data type, depending on its value.
• In other languages such as C, C++, and Java, the
programmer must declare the name and type of the
variable before using it.
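As a minimal sketch of loose typing, the snippet below reassigns one variable to values of different types and inspects the result with the built-in gettype() function. The variable names $value, $types and $sum are illustrative only.

```php
<?php
// The same variable can hold values of different types over its lifetime.
$value = 5;
$types = [gettype($value)];   // integer

$value = 10.5;
$types[] = gettype($value);   // double (PHP's internal name for float)

$value = "Hello world!";
$types[] = gettype($value);   // string

// A numeric string is converted automatically when used in arithmetic.
$sum = "5" + 4;

echo implode(", ", $types), "\n";
var_dump($sum);
```

No type was ever declared; PHP tracks the type of whatever value is currently stored in the variable.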
• In PHP, keywords (e.g. if, else, while, echo), classes, and functions (built-in and user-defined) are NOT case-sensitive.
• However, all variable names are case-sensitive. In the example below only the first statement prints the color:
<!DOCTYPE html>
<html>
<body>
<?php
$color = "red";
echo "My car is " . $color . "<br>";   // prints "red"
echo "My house is " . $COLOR . "<br>"; // $COLOR is a different, undefined variable
echo "My boat is " . $coLOR . "<br>";  // $coLOR is also undefined
?>
</body>
</html>
• PHP Data Types
• PHP supports the following data types:
• String
• Integer
• Float (floating point numbers - also called
double)
• Boolean
• Array
• Object
• NULL
Strings
• Surrounded by single or double quotes
$str = "A Simple String";
$str2 = 'Another String';
$str3 = "This is $str2"; // variables are expanded inside double quotes
$str4 = $str;  // copy by value
$str5 = &$str; // copy by reference
• Escape sequences: \n, \r, \t, \\, \$, \"
• Concatenation uses the dot operator
• $str3 = $str . $str2;
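The difference between copying by value and copying by reference, and between single and double quotes, can be sketched as follows. The variable names are illustrative only.

```php
<?php
$str  = "A Simple String";
$copy = $str;    // copy by value: an independent string
$ref  = &$str;   // copy by reference: a second name for the same value

$str = "Changed";

echo $copy, "\n"; // the copy keeps the original text
echo $ref, "\n";  // the reference follows $str

// Double quotes expand variables and escape sequences; single quotes do not.
$double = "Value: $copy\t!";
$single = 'Value: $copy\t!';
echo $double, "\n";
echo $single, "\n";
```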
• Get The Length of a String
<?php
echo strlen("Hello world!"); // outputs 12
?>
• Count The Number of Words in a String
<?php
echo str_word_count("Hello world!");
// outputs 2
?>
Reverse a String
<?php
echo strrev("Hello world!");
// outputs !dlrow olleH
?>
Search For a Specific Text Within a String
<?php
echo strpos("Hello world!", "world");
// outputs 6
?>
• Replace Text Within a String
<?php
echo str_replace("world", "Dolly", "Hello world!");
// outputs Hello Dolly!
?>
Decision Making

• if...else statement - use this statement to execute one block of code when a condition is true and another block when the condition is false
• elseif statement - used with the if...else statement to execute a block of code if one of several conditions is true
• switch statement - use this statement to select one of many blocks of code to be executed. The switch statement is used to avoid long blocks of if..elseif..else code.
The if...else Statement

Syntax
if (condition) {
code to be executed if condition is true;
} else {
code to be executed if condition is false;
}
Ex:
$t = date("H");
if ($t < "20") {
echo "Have a good day!";
} else {
echo "Have a good night!";
}
o/p (when the current hour is 20 or later): Have a good night!
The if...elseif....else Statement
Syntax:
if (condition) {
code to be executed if this condition is true;
} elseif (condition) {
code to be executed if the first condition is false and this condition is true;
} else {
code to be executed if all conditions are false;
}
Example
<html> <body>
<?php
$d=date("D");
if ($d=="Fri")
echo "Have a nice weekend!";
elseif ($d=="Sun")
echo "Have a nice Sunday!";
else
echo "Have a nice day!";
?>
</body> </html>
The switch Statement
Example
<html>
<body>
<?php
$d=date("D");
switch ($d)
{
case "Mon":
echo "Today is Monday";
break;
case "Tue":
echo "Today is Tuesday";
break;
default: echo "which day is this ?";
}
?>
</body>
</html>
Loops
• for - loops through a block of code a specified
number of times.
• while - loops through a block of code if and as
long as a specified condition is true.
• do...while - loops through a block of code
once, and then repeats the loop as long as a
specified condition is true.
• foreach - loops through a block of code for
each element in an array.
For loop
<html> <body>
<?php
$a = 0;
$b = 0;
for( $i=0; $i<5; $i++ )
{
$a += 10;
$b += 5;
}
echo ("At the end of the loop a=$a and b=$b" );
?>
</body> </html>
While loop
<html> <body>
<?php
$i = 0;
$num = 50;
while( $i < 10)
{
$num--;
$i++;
}
echo ("Loop stopped at i = $i and num = $num" );
?>
</body> </html>
do...while loop
<html> <body>
<?php
$i = 0;
$num = 0;
do {
$i++;
}while( $i < 10 );
echo ("Loop stopped at i = $i" );
?>
</body> </html>
foreach Loop
Syntax
foreach ($array as $value) {
code to be executed;
}
Ex:
<html>
<body>
<?php
$array = array( 1, 2, 3, 4, 5);
foreach( $array as $value )
{
echo "Value is $value <br />";
}
?>
</body>
</html>
Ex:
<?php
$colors = array("red", "green", "blue", "yellow");
foreach ($colors as $value) {
echo "$value <br>";
}
?>
o/p:
red
green
blue
yellow
• The mysqli_result class
• Represents the result set obtained from a
query against the database.
• Properties:
• int $current_field
• int $field_count
• array $lengths
• int $num_rows
• mysqli_result::$current_field — Get current field offset of a result pointer
• mysqli_result::data_seek — Adjusts the result pointer to an arbitrary row
in the result
• mysqli_result::fetch_all — Fetches all result rows as an associative array, a
numeric array, or both
• mysqli_result::fetch_array — Fetch a result row as an associative, a
numeric array, or both
• mysqli_result::fetch_assoc — Fetch a result row as an associative array
• mysqli_result::fetch_field_direct — Fetch meta-data for a single field
• mysqli_result::fetch_field — Returns the next field in the result set
• mysqli_result::fetch_fields — Returns an array of objects representing the
fields in a result set
• mysqli_result::fetch_object — Returns the current row of a result set as an
object
• mysqli_result::fetch_row — Get a result row as an enumerated array
• mysqli_result::$field_count — Get the number of fields in a result
• mysqli_result::field_seek — Set result pointer to a specified field offset
• mysqli_result::free — Frees the memory associated with a result
• mysqli_result::$lengths — Returns the lengths of the columns of the
current row in the result set
• mysqli_result::$num_rows — Gets the number of rows in a result
<?php
$con = mysqli_connect("localhost", "my_user", "my_password", "my_db");
// Check connection
if (mysqli_connect_errno())
{
echo "Failed to connect to MySQL: " . mysqli_connect_error();
exit(); // stop here: the queries below need a working connection
}
// Perform queries
mysqli_query($con, "SELECT * FROM Persons");
mysqli_query($con, "INSERT INTO Persons (FirstName,LastName,Age) VALUES ('Glenn','Quagmire',33)");
mysqli_close($con);
?>
• PHP Associative Arrays
• Associative arrays are arrays that use named keys that you assign to them.
• There are two ways to create an associative array:
• $age = array("Peter"=>"35", "Ben"=>"37", "Joe"=>"43");
• or, one key at a time: $age["Peter"] = "35"; $age["Ben"] = "37"; $age["Joe"] = "43";
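A short sketch of reading an associative array back: the foreach loop can expose both the key and the value of each entry. The names $age2, $name, $years and $lines are illustrative only.

```php
<?php
// Build the array in one statement...
$age = array("Peter" => "35", "Ben" => "37", "Joe" => "43");

// ...or key by key; both forms produce the same array.
$age2 = array();
$age2["Peter"] = "35";
$age2["Ben"]   = "37";
$age2["Joe"]   = "43";

// Loop over keys and values together.
$lines = [];
foreach ($age as $name => $years) {
    $lines[] = "$name is $years years old";
}
echo implode("\n", $lines), "\n";
```

This key/value form is the same shape that fetch_assoc() returns for a database row, with column names as keys.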
Select Data With MySQLi
Example (MySQLi Object-oriented)
<?php
$servername = "localhost";
$username = "username";
$password = "password";
$dbname = "myDB";

// Create connection
$conn = new mysqli($servername, $username, $password, $dbname);
// Check connection
if ($conn->connect_error) {
die("Connection failed: " . $conn->connect_error);
}
$sql = "SELECT id, firstname, lastname FROM MyGuests";
$result = $conn->query($sql);

if ($result->num_rows > 0) {
// output data of each row
while($row = $result->fetch_assoc()) {
echo "id: " . $row["id"] . " - Name: " . $row["firstname"] . " " . $row["lastname"] . "<br>";
}
} else {
echo "0 results";
}
$conn->close();
?>
Delete Data With MySQLi
(assumes $conn is an open connection, created as in the select example)
// sql to delete a record
$sql = "DELETE FROM MyGuests WHERE id=3";
if ($conn->query($sql) === TRUE) {
echo "Record deleted successfully";
} else {
echo "Error deleting record: " . $conn->error;
}
Update Data With MySQLi
// sql to update a record
$sql = "UPDATE MyGuests SET lastname='Doe' WHERE id=2";
if ($conn->query($sql) === TRUE) {
echo "Record updated successfully";
} else {
echo "Error updating record: " . $conn->error;
}
• COOKIES
• A cookie is often used to identify a user.
• A cookie is a small file that the server embeds
on the user's computer.
• Each time the same computer requests a page
with a browser, it will send the cookie too.
• With PHP, you can both create and retrieve
cookie values.
• Create Cookies With PHP
• A cookie is created with the setcookie()
function.
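• Syntax
• setcookie(name, value, expire, path, domain, secure, httponly);
• Only the name parameter is required. All other parameters are optional.
• setcookie() must be called before any output is sent, i.e. before the <html> tag.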
• PHP Create/Retrieve a Cookie
<?php
$cookie_name = "user";
$cookie_value = "John Doe";
setcookie($cookie_name, $cookie_value, time() + (86400 * 30), "/"); // 86400 seconds = 1 day, so the cookie lasts 30 days
?>
<html>
<body>
<?php
if(!isset($_COOKIE[$cookie_name])) {
echo "Cookie named '" . $cookie_name . "' is not set!";
} else {
echo "Cookie '" . $cookie_name . "' is set!<br>";
echo "Value is: " . $_COOKIE[$cookie_name];
}
?>
</body>
</html>
• Delete a Cookie
• To delete a cookie, use the setcookie() function with an expiration date in the past and the same path that was used to set the cookie:
• Example
<?php
// set the expiration date to one hour ago
setcookie("user", "", time() - 3600, "/");
?>
<html>
<body>
<?php
echo "Cookie 'user' is deleted.";
?>
</body>
</html>
• PHP SESSIONS

• A session is a way to store information (in variables) to be used across multiple pages.
• Unlike a cookie, the information is not stored on the user's computer.
• When you work with an application, you open it, make some changes, and then close it. This is much like a session. The computer knows who you are. It knows when you start the application and when you end. But on the internet there is one problem: the web server does not know who you are or what you do, because HTTP is a stateless protocol.
• Session variables solve this problem by storing user information to be used across multiple pages (e.g. username, favorite color, etc.). By default, session variables last until the user closes the browser.
• So, session variables hold information about one single user, and are available to all pages in one application.
• Start a PHP Session
• A session is started with the session_start() function.
• Session variables are set with the PHP global variable: $_SESSION.
• Now, let's create a new page called "demo_session1.php". In this page, we
start a new PHP session and set some session variables:
• Example
<?php
// Start the session
session_start();
?>
<!DOCTYPE html>
<html>
<body>
<?php
// Set session variables
$_SESSION["favcolor"] = "green";
$_SESSION["favanimal"] = "cat";
echo "Session variables are set.";
?>
</body>
</html>
• Get PHP Session Variable Values
• Next, we create another page called "demo_session2.php". From this page, we will
access the session information we set on the first page ("demo_session1.php").
• Notice that session variables are not passed individually to each new page, instead
they are retrieved from the session we open at the beginning of each page
(session_start()).
• Also notice that all session variable values are stored in the global $_SESSION
variable:
• Example
<?php
session_start();
?>
<!DOCTYPE html>
<html>
<body>
<?php
// Echo session variables that were set on previous page
echo "Favorite color is " . $_SESSION["favcolor"] . ".<br>";
echo "Favorite animal is " . $_SESSION["favanimal"] . ".";
?>
</body>
</html>
• Destroy a PHP Session
• To remove all global session variables and destroy the session, use
session_unset() and session_destroy():
• Example
<?php
session_start();
?>
<!DOCTYPE html>
<html>
<body>
<?php
// remove all session variables
session_unset();
// destroy the session
session_destroy();
?>
</body>
</html>
XML
What is XML?
• XML stands for Extensible Markup Language
• XML was designed to store and transport data
• XML was designed to be both human- and machine-readable
• XML is an important Web technology alongside HTML
• XML is widely used to exchange data between all sorts of applications
• XML tags are not predefined; you must define your own tags
• XML is a W3C Recommendation
The Difference Between XML and HTML
XML | HTML
XML was designed to transport and store data, with focus on what data is | HTML was designed to display data, with focus on how data looks
XML is about carrying information | HTML is about displaying information
XML document Syntax Rules
• All XML Elements Must Have a Closing Tag
• XML Tags are Case Sensitive
Eg: <Message>This is incorrect</message>
<message>This is correct</message>
• XML Elements Must be Properly Nested
• XML Documents Must Have a Root Element
• XML Attribute Values Must be Quoted
Eg: <note date="12/11/2007">
Well Formed vs. Valid
• An XML document with correct syntax is called "Well Formed".
• An XML document that is well formed and has also been validated against a DTD (or a schema) is "Well Formed" and "Valid".
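A minimal sketch of a well-formedness check in PHP, assuming the DOM extension is enabled. The helper name isWellFormed() is hypothetical; DOMDocument::loadXML() and the libxml_* functions are standard PHP.

```php
<?php
// Hypothetical helper: returns true when $xml parses as well-formed XML.
function isWellFormed(string $xml): bool
{
    // Collect parser errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $ok = $doc->loadXML($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $ok;
}

$good = '<message>This is correct</message>';
$bad  = '<Message>This is incorrect</message>'; // start and end tag differ in case

var_dump(isWellFormed($good));
var_dump(isWellFormed($bad));
```

This only checks syntax. Whether a well-formed document is also valid depends on a DTD or schema, which is a separate step.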
XML document
XML documents are composed of three things:
1. Elements
2. Control information
3. Entities
1. XML Elements
• An XML element is everything from (including)
the element's start tag to (including) the
element's end tag.
An element can contain:
1. other elements
2. text
3. attributes
4. or a mix of all of the above...
Elements
• Elements are the main building blocks of both XML and
HTML documents.
• Examples of HTML elements are "body" and "table".
• Examples of XML elements could be "note" and "message".
• Elements can contain text, other elements, or be empty.
• Examples of empty HTML elements are "hr", "br" and
"img".
Examples:
• <body>some text</body> // html
<message>some text</message> // xml
2. Control Information
1. Comments
2. Processing instructions
3. Document type declaration
Comments in XML
• The syntax for writing comments in XML is similar to that of HTML.
• <!-- This is a comment -->
Processing Instructions
• <?xml version="1.0" encoding="UTF-8"?>
• This line, the XML declaration, states that the file follows the rules of XML version 1.0 and is encoded in UTF-8.
• When present, it must be the first line of the XML document.
Creating an XML document
<bookstore>
<book category="CHILDREN">
<title>Harry Potter</title>
<author>J. K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="WEB">
<title>Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>
3. Entities
• Entities define shortcuts to special characters or frequently used text.
<!ENTITY entity_name "content">
• Example
• DTD example:
<!ENTITY writer "Donald Duck.">
<!ENTITY copyright "Copyright SRM.">
• XML example:
<author>&writer; &copyright;</author>
• Note: An entity reference has three parts: an ampersand (&), an entity name, and a semicolon (;).
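To see entity references being expanded, the sketch below places the two entity declarations in an internal DTD and parses the document with PHP's DOMDocument. It assumes the DOM extension is enabled. The LIBXML_NOENT flag asks the parser to substitute entities; it should not be used on untrusted input, because it also allows external entities to be substituted.

```php
<?php
// Internal DTD with the two entities from the example, then a document using them.
$xml = <<<'XML'
<!DOCTYPE author [
  <!ENTITY writer "Donald Duck.">
  <!ENTITY copyright "Copyright SRM.">
]>
<author>&writer; &copyright;</author>
XML;

$doc = new DOMDocument();
// LIBXML_NOENT: replace entity references with their declared text while parsing.
$doc->loadXML($xml, LIBXML_NOENT);

$text = $doc->documentElement->textContent;
echo $text, "\n";
```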
3. Document Type Declaration and DTD
• DTD stands for Document Type Definition.
• A DTD holds the grammar rules for a particular XML data structure.
• A DTD is a way to describe an XML language precisely.
• A DTD is used to check the vocabulary and structure of an XML document against the grammatical rules of the appropriate XML language.
• The document type declaration (<!DOCTYPE ...>) associates a DTD with an XML document.
Document Definitions
• There are different types of document
definitions that can be used with XML:
1. The original Document Type Definition (DTD)
2. The newer, XML-based XML Schema
• This note is a note to Tove, from Jani, stored as XML:
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
Internal DTD
<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Will meet this weekend</body>
</note>
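Validation against an internal DTD can be sketched in PHP with DOMDocument::validate(), assuming the DOM extension is enabled. The second document is the same note with the <heading> element removed, so it stays well formed but no longer matches the content model (to,from,heading,body).

```php
<?php
libxml_use_internal_errors(true); // keep validation messages out of the output

$valid = <<<'XML'
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Will meet this weekend</body>
</note>
XML;

// Same DTD, but one required child element is missing.
$invalid = str_replace('<heading>Reminder</heading>', '', $valid);

$doc = new DOMDocument();
$doc->loadXML($valid);
$firstIsValid = $doc->validate();   // checks the document against its DTD

$doc2 = new DOMDocument();
$loaded = $doc2->loadXML($invalid); // still well formed, so loading succeeds
$secondIsValid = $doc2->validate();

var_dump($firstIsValid, $loaded, $secondIsValid);
```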
External DTD: Sample.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE note SYSTEM "Note.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
Note.dtd
(an external DTD file contains only the declarations, without the <!DOCTYPE note [ ... ]> wrapper)
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
XML Namespaces
• XML namespaces provide a method to avoid element name conflicts.
• A namespace is identified by a unique URI (Uniform Resource Identifier).
Name Conflicts
• In XML, element names are defined by the developer. This often results in a conflict when trying to mix XML documents from different XML applications.
This XML carries HTML table information:
<table>
<tr>
<td>Apples</td>
<td>Bananas</td>
</tr>
</table>
This XML carries information about a table (a piece of
furniture):
<table>
<name>African Coffee Table</name>
<width>80</width>
<length>120</length>
</table>
If these XML fragments were added together, there would be a
name conflict. Both contain a <table> element, but the
elements have different content and meaning.
Namespaces can be declared either
a.) in the elements where they are used, or
b.) in the XML root element.
Solving the Name Conflict Using a Prefix
Namespaces declared in the elements where they are used:
<root>
<h:table xmlns:h="http://www.w3.org/TR/html4/">
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table xmlns:f="http://www.w3schools.com/furniture">
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
</root>
Namespaces declared in the root element:
<root
xmlns:h="http://www.w3.org/TR/html4/"
xmlns:f="http://www.abc.com/furniture">
<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table>
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
</root>
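Once the two kinds of table carry different namespaces, code can tell them apart. A minimal PHP sketch, assuming the DOM extension is enabled, uses getElementsByTagNameNS() with the namespace URIs from the example above:

```php
<?php
$xml = <<<'XML'
<root xmlns:h="http://www.w3.org/TR/html4/"
      xmlns:f="http://www.abc.com/furniture">
  <h:table><h:tr><h:td>Apples</h:td><h:td>Bananas</h:td></h:tr></h:table>
  <f:table><f:name>African Coffee Table</f:name><f:width>80</f:width><f:length>120</f:length></f:table>
</root>
XML;

$htmlNs      = 'http://www.w3.org/TR/html4/';
$furnitureNs = 'http://www.abc.com/furniture';

$doc = new DOMDocument();
$doc->loadXML($xml);

// Elements are matched by namespace URI plus local name, not by prefix.
$htmlTables      = $doc->getElementsByTagNameNS($htmlNs, 'table');
$furnitureTables = $doc->getElementsByTagNameNS($furnitureNs, 'table');

$name = $furnitureTables->item(0)
    ->getElementsByTagNameNS($furnitureNs, 'name')
    ->item(0)->textContent;

echo $htmlTables->length, " HTML table, ", $furnitureTables->length, " furniture table: ", $name, "\n";
```

The prefixes h and f are only local abbreviations; the URI is what identifies the namespace.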
XML Schema
• XML Schema is an XML-based alternative to
DTD.
• An XML schema describes the structure of an
XML document.
• The XML Schema language is also referred to
as XML Schema Definition (XSD).
What is an XML Schema?
The purpose of an XML Schema is to define the legal building
blocks of an XML document, just like a DTD.
An XML Schema:
• defines elements that can appear in a document
• defines attributes that can appear in a document
• defines which elements are child elements
• defines the order of child elements
• defines the number of child elements
• defines whether an element is empty or can include text
• defines data types for elements and attributes
• defines default and fixed values for elements and attributes
A Simple XML Document
<?xml version="1.0"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Will meet this weekend!</body>
</note>
A DTD File
<!ELEMENT note (to, from, heading, body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
An XML Schema
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3schools.com"
xmlns="http://www.w3schools.com"
elementFormDefault="qualified">
<xs:element name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
• xmlns:xs="http://www.w3.org/2001/XMLSchema" indicates that the elements and data types used in the schema come from the "http://www.w3.org/2001/XMLSchema" namespace.
• It also specifies that the elements and data types that come from that namespace should be prefixed with xs:
• targetNamespace="http://www.w3schools.com" indicates that the elements defined by this schema (note, to, from, heading, body) come from the "http://www.w3schools.com" namespace.
• xmlns="http://www.w3schools.com" indicates that the default namespace is "http://www.w3schools.com".
• elementFormDefault="qualified" indicates that any elements used by an XML instance document which were declared in this schema must be namespace qualified.
• The note element is a complex type because it contains other
elements.
• The other elements (to, from, heading, body) are simple types
because they do not contain other elements.
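A minimal sketch of validating a document against this schema in PHP, assuming the DOM extension is enabled. DOMDocument::schemaValidateSource() takes the schema as a string; the helper name conformsTo() is hypothetical. Because the schema has a target namespace, the instance documents declare it as their default namespace.

```php
<?php
libxml_use_internal_errors(true); // keep validation messages out of the output

$xsd = <<<'XSD'
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.w3schools.com"
           xmlns="http://www.w3schools.com"
           elementFormDefault="qualified">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
XSD;

// Hypothetical helper: true when $xml is valid against the schema text $xsd.
function conformsTo(string $xml, string $xsd): bool
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    return $doc->schemaValidateSource($xsd);
}

$good = '<note xmlns="http://www.w3schools.com"><to>Tove</to><from>Jani</from>'
      . '<heading>Reminder</heading><body>Will meet this weekend!</body></note>';
// <from> and <heading> are missing, so the xs:sequence is not satisfied.
$bad  = '<note xmlns="http://www.w3schools.com"><to>Tove</to>'
      . '<body>Will meet this weekend!</body></note>';

var_dump(conformsTo($good, $xsd));
var_dump(conformsTo($bad, $xsd));
```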
Advantages of XML Schema
XML Schemas are often preferred over DTDs.
Here are some reasons:
• XML Schemas are extensible to future additions
• XML Schemas are richer and more powerful than
DTDs
• XML Schemas are written in XML
• XML Schemas support data types
• XML Schemas support namespaces
XML parser
• An XML parser reads an XML document and makes its content available to an application.
• A DOM parser converts an XML document into an XML DOM object, which can then be manipulated.
• Two models are commonly used for parsers:
1. SAX
2. DOM
XML DOM
• A DOM (Document Object Model) defines a standard way for
accessing and manipulating documents.
• The XML DOM views an XML document as a tree-structure.
• All elements can be accessed through the DOM tree. Their
content (text and attributes) can be modified or deleted, and
new elements can be created. The elements, their text, and
their attributes are all known as nodes.
• To extract the text from the "to" element in the XML file
above ("note.xml"), the syntax is:
getElementsByTagName("to")[0].childNodes[0].nodeValue
Load an XML Document
<!DOCTYPE html>
<html><body><script>
if (window.XMLHttpRequest) {
xhttp = new XMLHttpRequest();
} else { // for IE 5/6
xhttp = new ActiveXObject("Microsoft.XMLHTTP");
}
xhttp.open("GET", "books.xml", false); // false = synchronous request
xhttp.send();
xmlDoc = xhttp.responseXML;
document.write("XML document loaded into an XML DOM Object.");
</script></body></html>
• Create an XMLHttpRequest object
• Use the open() and send() methods of the
XMLHttpRequest object to send a request to a
server
• Get the response data as XML data
XML DOM Properties
• These are some typical DOM properties:
• x.nodeName - the name of x
• x.nodeValue - the value of x
• x.parentNode - the parent node of x
• x.childNodes - the child nodes of x
• x.attributes - the attribute nodes of x
• Note: In the list above, x is a node object.
XML DOM Methods
• x.getElementsByTagName(name) - get all elements with a specified tag name
• x.appendChild(node) - insert a child node into x
• x.removeChild(node) - remove a child node from x
• Note: In the list above, x is a node object.
Example
1. The JavaScript code to get the text from the first <to> element in note.xml:
txt = xmlDoc.getElementsByTagName("to")[0].childNodes[0].nodeValue;
2. After the execution of the statement, txt will hold the value "Tove".
• Explained:
• xmlDoc - the XML DOM object created by the parser.
• getElementsByTagName("to")[0] - the first <to> element
• childNodes[0] - the first child of the <to> element (the text
node)
• nodeValue - the value of the node (the text itself)
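The same DOM properties and methods are available on the server side. The sketch below repeats the lookup with PHP's DOMDocument (assuming the DOM extension is enabled) and then modifies, appends and removes nodes. The <priority> element is a made-up addition for illustration and is not part of note.xml.

```php
<?php
$xml = <<<'XML'
<note><to>Tove</to><from>Jani</from><heading>Reminder</heading><body>Don't forget me this weekend!</body></note>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);

// Read: first <to> element, its first child (the text node), and that node's value.
$to  = $doc->getElementsByTagName('to')->item(0);
$txt = $to->childNodes->item(0)->nodeValue;

// Modify the text node in place.
$to->childNodes->item(0)->nodeValue = 'Ola';

// Append a new child element to the root (hypothetical <priority> element).
$doc->documentElement->appendChild($doc->createElement('priority', 'high'));

// Remove the <heading> element through its parent node.
$heading = $doc->getElementsByTagName('heading')->item(0);
$heading->parentNode->removeChild($heading);

echo $txt, "\n";
echo $doc->saveXML($doc->documentElement), "\n";
```

In PHP the node lists are objects, so item(0) takes the place of the [0] index used in JavaScript.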
<!DOCTYPE html>
<html>
<head>
<script src="loadxmldoc.js"></script>
</head>
<body>

<script>
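xmlDoc = loadXMLDoc("books.xml");
x = xmlDoc.getElementsByTagName("book")[1];
y = x.childNodes[3]; // childNodes also counts the whitespace text nodes between elements
document.write(y.nodeName);
</script>
</body>
</html>
O/P: author (when each child element of <book> is on its own line, so that childNodes[3] is the second element)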
SAX Parser
• SAX stands for Simple API for XML.
• SAX is an event-based way of parsing XML. It reads the file step by step, which makes it well suited to large XML files.
• A SAX parser fires an event whenever it encounters an opening tag, a closing tag, or character data, and the application handles each event as it arrives.
• SAX is recommended for parsing large XML files in Java because it does not need to load the whole file into memory; it can read a big XML file in small parts.
• Java provides built-in support for SAX parsing.
• One disadvantage of SAX is that reading an XML file with it requires more code than with a DOM parser.
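The slides describe SAX in Java. PHP's xml extension offers the same event-based style through xml_parser_create(), which is used here as a stand-in, assuming that extension is enabled. The sketch collects every <title> from a shortened copy of the bookstore document. By default this parser reports element names in upper case, so the handlers compare against 'TITLE'.

```php
<?php
$xml = '<bookstore>'
     . '<book category="CHILDREN"><title>Harry Potter</title></book>'
     . '<book category="WEB"><title>Learning XML</title></book>'
     . '</bookstore>';

$titles = [];
$insideTitle = false;

$parser = xml_parser_create();

// Start-tag and end-tag events.
xml_set_element_handler(
    $parser,
    function ($parser, string $name, array $attrs) use (&$insideTitle, &$titles) {
        if ($name === 'TITLE') {       // names are upper-cased by default
            $insideTitle = true;
            $titles[] = '';
        }
    },
    function ($parser, string $name) use (&$insideTitle) {
        if ($name === 'TITLE') {
            $insideTitle = false;
        }
    }
);

// Character-data events; text may arrive in several chunks, so append.
xml_set_character_data_handler(
    $parser,
    function ($parser, string $data) use (&$insideTitle, &$titles) {
        if ($insideTitle) {
            $titles[count($titles) - 1] .= $data;
        }
    }
);

$ok = xml_parse($parser, $xml, true); // true: this is the final chunk of data

print_r($titles);
```

No tree is built at any point; the only state kept is what the handlers store themselves, which is why this style suits large files.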
Difference between DOM and SAX
DOM | SAX
Loads the whole XML document into memory | Loads only a small part of the XML file into memory at a time
Allows fast random access once the document is loaded, because the whole tree is in memory | Gives forward-only, sequential access, so revisiting earlier content means parsing again
Reading a large XML file may take a long time, or fail to load completely, because building the DOM tree requires a lot of memory | Better suited to large XML files because it does not require much memory
Works on the Document Object Model | Is an event-based XML parser
Presenting XML
XML documents are presented using XSL style sheets.
Style sheets form the presentation layer in the
separation of content, structure, and presentation.
With style sheets, the same XML document can be
presented in different forms suited to the application
and the publishing medium.
• XSL stands for Extensible Stylesheet Language.
• XSL describes how the XML document should
be displayed.
XSL consists of three parts:
• XSLT - a language for transforming XML
documents
• XPath - a language for navigating in XML
documents
• XSL-FO - a language for formatting XML
documents
• XSLT is used to transform an XML document
into another XML document, or another type
of document that is recognized by a browser,
like HTML and XHTML.
• XSLT transforms an XML source-tree into an
XML result-tree.
Correct Style Sheet Declaration
The root element that declares the document to
be an XSL style sheet is <xsl:stylesheet> or
<xsl:transform>.
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
or
<xsl:transform version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
• To get access to the XSLT elements, attributes
and features, we must declare the XSLT
namespace at the top of the document.
• xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
points to the official W3C XSLT namespace.
• If you use this namespace, you must also
include the attribute version="1.0".
cdcatalog.xml
<?xml version="1.0" encoding="UTF-8"?>
<catalog>
<cd>
<title>Empire Burlesque</title>
<artist>Bob Dylan</artist>
<country>USA</country>
<company>Columbia</company>
<price>10.90</price>
<year>1985</year>
</cd>
.
.
</catalog>
Create an XSL Style Sheet - cdcatalog.xsl
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<body>
<h2>My CD Collection</h2>
<table border="1">
<tr bgcolor="#9acd32">
<th>Title</th>
<th>Artist</th>
</tr>
<xsl:for-each select="catalog/cd">
<tr>
<td><xsl:value-of select="title"/></td>
<td><xsl:value-of select="artist"/></td>
</tr>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
Link the XSL Style Sheet to the XML Document
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="cdcatalog.xsl"?>
<catalog>
<cd>
<title>Empire Burlesque</title>
<artist>Bob Dylan</artist>
<country>USA</country>
<company>Columbia</company>
<price>10.90</price>
<year>1985</year>
</cd>
.
.
</catalog>
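Linking the style sheet lets the browser do the transformation. The same transformation can also be run on the server; the sketch below uses PHP's XSLTProcessor class and assumes the optional xsl extension is installed. The second <cd> entry is made-up sample data, and both documents are held in strings to keep the example self-contained.

```php
<?php
$xml = <<<'XML'
<catalog>
  <cd><title>Empire Burlesque</title><artist>Bob Dylan</artist></cd>
  <cd><title>Sample Album</title><artist>Sample Artist</artist></cd>
</catalog>
XML;

$xsl = <<<'XSL'
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html><body>
      <h2>My CD Collection</h2>
      <table border="1">
        <xsl:for-each select="catalog/cd">
          <tr>
            <td><xsl:value-of select="title"/></td>
            <td><xsl:value-of select="artist"/></td>
          </tr>
        </xsl:for-each>
      </table>
    </body></html>
  </xsl:template>
</xsl:stylesheet>
XSL;

$xmlDoc = new DOMDocument();
$xmlDoc->loadXML($xml);

$xslDoc = new DOMDocument();
$xslDoc->loadXML($xsl);

$processor = new XSLTProcessor();
$processor->importStylesheet($xslDoc);       // compile the style sheet
$html = $processor->transformToXml($xmlDoc); // run it and return the result as a string

echo $html;
```

Transforming on the server means the browser receives plain HTML and does not need XSLT support of its own.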
XSLT <xsl:template> Element
• An XSL style sheet consists of one or more sets
of rules that are called templates.
• A template contains rules to apply when a
specified node is matched.
• The match attribute is used to associate a
template with an XML element.
• The match attribute can also be used to define
a template for the entire XML document.
• The value of the match attribute is an XPath
expression (e.g. match="/" selects the whole
document).
• Eg: <xsl:template match="/">
Example:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html> <body>
<h2>My CD Collection</h2>
<table border="1">
<tr bgcolor="#9acd32">
<th>Title</th>
<th>Artist</th>
</tr>
<tr>
<td>…</td>
<td>..</td>
</tr>
</table> </body> </html>
</xsl:template></xsl:stylesheet>
• Because an XSL style sheet is an XML document, it
begins with the XML declaration: <?xml
version="1.0" encoding="UTF-8"?>.
• The next element, <xsl:stylesheet>, defines
that this document is an XSLT style sheet
document.
• The <xsl:template> element defines a
template. The match="/" attribute associates
the template with the root of the XML source
document.
• The content inside the <xsl:template> element
defines some HTML to write to the output.
<xsl:value-of>
• The <xsl:value-of> element can be used to
extract the value of an XML element and add
it to the output stream of the transformation.
Example
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html> <body>
<h2>My CD Collection</h2>
<table border="1">
<tr bgcolor="#9acd32">
<th>Title</th>
<th>Artist</th>
</tr>
<tr>
<td><xsl:value-of select="catalog/cd/title"/></td>
<td><xsl:value-of select="catalog/cd/artist"/></td>
</tr>
</table>
</body> </html>
</xsl:template>
</xsl:stylesheet>
<xsl:for-each>
• The XSL <xsl:for-each> element can be used to
select every XML element of a specified
node-set.
Example:
<xsl:template match="/">
<html> <body>
<h2>My CD Collection</h2>
<table border="1">
<tr bgcolor="#9acd32">
<th>Title</th>
<th>Artist</th>
</tr>
<xsl:for-each select="catalog/cd">
<tr>
<td><xsl:value-of select="title"/></td>
<td><xsl:value-of select="artist"/></td>
</tr>
</xsl:for-each>
</table>
</body> </html>
</xsl:template>
<xsl:sort>
• To sort the output, simply add an <xsl:sort>
element inside the <xsl:for-each> element in
the XSL file.
Example:
<xsl:template match="/">
<html>
<body>
<h2>My CD Collection</h2>
<table border="1">
<tr bgcolor="#9acd32">
<th>Title</th>
<th>Artist</th>
</tr>
<xsl:for-each select="catalog/cd">
<xsl:sort select="artist"/>
<tr>
<td><xsl:value-of select="title"/></td>
<td><xsl:value-of select="artist"/></td>
</tr>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template>
<xsl:if>
• To put a conditional if test against the content
of the XML file, add an <xsl:if> element to the
XSL document.
Syntax
<xsl:if test="expression">
...some output if the expression is true...
</xsl:if>
Example:
<xsl:for-each select="catalog/cd">
<xsl:if test="price &gt; 10">
<tr>
<td><xsl:value-of select="title"/></td>
<td><xsl:value-of select="artist"/></td>
<td><xsl:value-of select="price"/></td>
</tr>
</xsl:if>
</xsl:for-each>
The code above outputs the title, artist, and price of each CD
whose price is higher than 10.
<xsl:choose>
• The <xsl:choose> element is used in conjunction
with <xsl:when> and <xsl:otherwise> to express
multiple conditional tests.
<xsl:choose>
<xsl:when test="expression">
... some output ...
</xsl:when>
<xsl:otherwise>
... some output ....
</xsl:otherwise>
</xsl:choose>
<xsl:for-each select="catalog/cd">
<tr>
<td><xsl:value-of select="title"/></td>
<xsl:choose>
<xsl:when test="price &gt; 10">
<td bgcolor="#ff00ff">
<xsl:value-of select="artist"/></td>
</xsl:when>
<xsl:otherwise>
<td><xsl:value-of select="artist"/></td>
</xsl:otherwise>
</xsl:choose>
</tr>
</xsl:for-each>
Adds a pink background color to the "Artist" column when the
price of the CD is higher than 10.
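The branching performed by <xsl:when> and <xsl:otherwise> can be mirrored in ordinary code. This is a sketch in Python with xml.etree.ElementTree, not an XSLT run; the catalog data, the function name artist_cells, and the limit parameter are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical sample data; prices are chosen to hit both branches.
CATALOG = """<catalog>
  <cd><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>
  <cd><title>Hide your heart</title><artist>Bonnie Tyler</artist><price>9.90</price></cd>
</catalog>"""


def artist_cells(xml_text, limit=10.0):
    """Return one HTML <td> string per cd, highlighted when price > limit."""
    cells = []
    for cd in ET.fromstring(xml_text).findall("cd"):    # xsl:for-each select="catalog/cd"
        artist = escape(cd.findtext("artist"))
        price = float(cd.findtext("price"))
        if price > limit:                                # xsl:when test="price &gt; 10"
            cells.append('<td bgcolor="#ff00ff">' + artist + "</td>")
        else:                                            # xsl:otherwise
            cells.append("<td>" + artist + "</td>")
    return cells


cells = artist_cells(CATALOG)
```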
<xsl:apply-templates>
• The <xsl:apply-templates> element applies a
template to the current element or to the
current element's child nodes.
• If we add a select attribute to the <xsl:apply-
templates> element it will process only the
child element that matches the value of the
attribute.
<xsl:template match="/">
<html>
<body>
<h2>My CD Collection</h2>
<xsl:apply-templates/>
</body>
</html>
</xsl:template>

<xsl:template match="cd">
<p>
<xsl:apply-templates select="title"/>
<xsl:apply-templates select="artist"/>
</p>
</xsl:template>

<xsl:template match="title">
Title: <span style="color:Red">
<xsl:value-of select="."/></span>
<br />
</xsl:template>

<xsl:template match="artist">
Artist: <span style="color:Green">
<xsl:value-of select="."/></span>
<br />
</xsl:template>
<xsl:attribute>
• The <xsl:attribute> element is used to add
attributes to elements.
Syntax
<xsl:attribute name="attributename"
namespace="uri">

<!-- Content:template -->

</xsl:attribute>
Example 1
Add a source attribute to the picture element.
<picture>
<xsl:attribute name="source"/>
</picture>
Example 2
Create an attribute-set that can be applied to any output
element.
<xsl:attribute-set name="font">
<xsl:attribute name="fname">Arial</xsl:attribute>
<xsl:attribute name="size">14px</xsl:attribute>
<xsl:attribute name="color">red</xsl:attribute>
</xsl:attribute-set>
<xsl:copy>
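• The <xsl:copy> element creates a copy of the
current node; its child nodes and attributes are
not copied automatically.
• The related <xsl:copy-of> element copies a selected
node together with its subtree and can be used to
insert multiple copies of the same node into
different places in the output.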
Example 1
Copy the message node to the output document.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>

<xsl:template match="message">
<xsl:copy>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
<xsl:comment>
• The <xsl:comment> element adds a comment to the
output document.
Syntax
<xsl:comment>

<!-- Content:template -->

</xsl:comment>
XML - eXtensible Markup Language

6/27/2019 XML 120


XML is not…
• A replacement for HTML
(but HTML can be generated from
XML)
• A presentation format
(but XML can be converted into
one)
• A programming language
(but it can be used with almost any
language)
• A network transfer protocol
(but XML may be transferred over a
network)
But then – what is it?
XML is a meta markup language for text
documents / textual data.

XML allows defining languages
("applications") to represent text documents
/ textual data.
XML by Example
<article>
<author>Gerhard Weikum</author>
<title>The Web in 10
Years</title>
</article>
• Easy to understand for human users
• Very expressive (semantics along with
the data)
• Well structured, easy to read and write
from programs
This looks nice, but…


XML by Example
… this is XML, too:
<t108>
<x87>Gerhard Weikum</x87>
<g10>The Web in 10 Years</g10>
</t108>

• Hard to understand for human users


• Not expressive (no semantics along with
the data)
• Well structured, easy to read and write
from programs



XML by Example
… and what about this XML document:
<data>
ch37fhgks73j5mv9d63h5mgfkds8d984
lgnsmcns983
</data>
• Impossible to understand for human users
• Not expressive (no semantics along with
the data)
• Unstructured, read and write only with
special programs
The actual benefit of using XML highly depends on the design of
the application.


Possible Advantages of Using XML
• Truly Portable Data
• Easily readable by human users
• Very expressive (semantics near data)
• Very flexible and customizable (no finite tag set)
• Easy to use from programs (libs available)
• Easy to convert into other representations
(XML transformation languages)
• Many additional standards and tools
• Widely used and supported
App. Scenario 1: Content Mgt.
[Figure: three layers from top to bottom: clients; converters (XML2HTML, XML2WML, XML2PDF); database with XML documents.]
App. Scenario 2: Data Exchange
[Figure: a buyer and a supplier exchange an order as XML (BMECat, ebXML, RosettaNet, BizTalk, …); on each side an XML adapter sits in front of a legacy system (e.g., SAP R/2, Cobol).]


App. Scenario 3: XML for Metadata
<rdf:RDF>
<rdf:Description rdf:about="http://www-dbs/Sch03.pdf">
<dc:title>A Framework for…</dc:title>
<dc:creator>Ralf Schenkel</dc:creator>
<dc:description>While there
are...</dc:description>
<dc:publisher>Saarland
University</dc:publisher>
<dc:subject>XML Indexing</dc:subject>
<dc:rights>Copyright ...</dc:rights>
<dc:type>Electronic Document</dc:type>
<dc:format>text/pdf</dc:format>
<dc:language>en</dc:language>
</rdf:Description>
</rdf:RDF>



App. Scenario 4: Document Markup
<article>
<section id="1" title="Intro">
This article is about <index>XML</index>.
</section>
<section id="2" title="Main Results">
<name>Weikum</name> <cite idref="Weik01"/> shows the
following theorem (see Section <ref idref="1"/>)
<theorem id="theo:1" source="Weik01">
For any XML document x, ...
</theorem>
</section>
<literature>
<cite id="Weik01"><author>Weikum</author></cite>
</literature>
</article>



App. Scenario 4: Document Markup
• Document Markup adds structural and
semantic information to documents, e.g.
– Sections, Subsections, Theorems, …
– Cross References
– Literature Citations
– Index Entries
– Named Entities
• This allows queries like
– Which articles cite Weikum's XML paper from
2001?
Part 2 – Basic XML Concepts

2.1 XML Standards by the W3C


2.2 XML Documents
2.3 Namespaces



2.1 XML Standards – an Overview
• XML Core Working Group:
– XML 1.0 (Feb 1998), 1.1 (candidate for
recommendation)
– XML Namespaces (Jan 1999)
– XML Inclusion (candidate for recommendation)
• XSLT Working Group:
– XSL Transformations 1.0 (Nov 1999), 2.0 planned
– XPath 1.0 (Nov 1999), 2.0 planned
– eXtensible Stylesheet Language XSL(-FO) 1.0 (Oct
2001)
• XML Linking Working Group:
– XLink 1.0 (Jun 2001)
– XPointer 1.0 (March 2003, 3 substandards)
• XQuery 1.0 (Nov 2002) plus many substandards
• XML Schema 1.0 (May 2001)
2.2 XML Documents
What's in an XML document?
• Elements
• Attributes
• plus some other details



A Simple XML Document
<article>
<author>Gerhard Weikum</author>
<title>The Web in Ten Years</title>
<text>
<abstract>In order to
evolve...</abstract>
<section number="1"
title="Introduction">
The <index>Web</index> provides the
universal...
</section>
</text>
</article>



A Simple XML Document
The document above illustrates the following terms:
• Freely definable tags: <article>, <author>, <title>, …
• Start tag: <article>; end tag: </article>
• Element: a start tag, the matching end tag, and everything between them
• Content of the element: whatever lies between the two tags (subelements, …)
• Attributes with name and value: number="1" and title="Introduction" in the <section> start tag


Elements in XML Documents
• (Freely definable) tags: article, title, author
– with start tag: <article> etc.
– and end tag: </article> etc.
• Elements: <article> ... </article>
• Elements have a name (article) and a content (...)
• Elements may be nested.
• Elements may be empty: <this_is_empty/>
• Element content is typically parsed character data (PCDATA),
i.e., strings with special characters, and/or nested elements
(mixed content if both).
• Each XML document has exactly one root element and forms
a tree.
• Elements with a common parent are ordered.
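The tree view of a document can be checked directly with a parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree; the <empty_note/> element is a hypothetical addition that is not part of the slide's example and is there only to show an empty element.

```python
import xml.etree.ElementTree as ET

DOC = """<article>
  <author>Gerhard Weikum</author>
  <title>The Web in Ten Years</title>
  <text><abstract>In order to evolve...</abstract><empty_note/></text>
</article>"""

root = ET.fromstring(DOC)                      # the single root element
child_names = [child.tag for child in root]    # children come back in document order
abstract = root.find("text/abstract")          # nested element, two levels down
empty = root.find("text/empty_note")           # empty element: no text, no children
```

Iterating over an element yields its children in the order in which they appear in the document, which is the "ordered tree" property.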
Elements vs. Attributes
Elements may have attributes (in the start tag) that have a name and
a value, e.g. <section number="1">.
What is the difference between elements and attributes?
• Only one attribute with a given name per element (but an arbitrary
number of subelements)
• Attributes have no structure, simply strings (while elements can
have subelements)
As a rule of thumb:
• Content into elements
• Metadata into attributes
Example:
<person born="1912-06-23" died="1954-06-07">
Alan Turing</person> proved that…
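The difference shows up in how a parser exposes the two. In this sketch (Python's xml.etree.ElementTree; the section content is invented for the example) attributes arrive as a flat mapping from name to string, while subelements can repeat and keep their order.

```python
import xml.etree.ElementTree as ET

section = ET.fromstring(
    '<section number="1" title="Introduction">'
    "The <index>Web</index> and <index>XML</index> are related."
    "</section>"
)

attributes = dict(section.attrib)                             # one string value per name
index_terms = [e.text for e in section.findall("index")]      # subelements may repeat
```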



XML Documents as Ordered Trees
article
  author
    Gerhard Weikum
  title
    The Web in 10 years
  text
    abstract
      In order …
    section (number="1", title="…")
      The
      index
        Web
      provides …
More on XML Syntax
• Some special characters must be escaped
using entities:
< → &lt;
& → &amp;
(will be converted back when reading the XML
doc)
• Some other characters may be escaped, too:
> → &gt;
" → &quot;
' → &apos;
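Escaping and the conversion back can be tried out with Python's standard library. This is a small sketch: escape from xml.sax.saxutils replaces &, < and > by default, and further characters can be passed in explicitly; the sample strings are made up.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "if a < b & b > c"
escaped = escape(raw)                                  # &, < and > become entities

# The parser converts the entities back when reading the document.
element = ET.fromstring("<code>" + escaped + "</code>")
round_trip = element.text

# Quotes are only escaped on request, e.g. for use inside attribute values.
quoted = escape('say "hi"', {'"': "&quot;"})
```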
Well-Formed XML Documents
A well-formed document must adhere to, among
others, the following rules:
• Every start tag has a matching end tag.
• Elements may nest, but must not overlap.
• There must be exactly one root element.
• Attribute values must be quoted.
• An element may not have two attributes with
the same name.
• Comments and processing instructions may
not appear inside tags.
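Several of these rules can be observed by feeding small documents to a parser. The sketch below assumes Python's xml.etree.ElementTree, which rejects input that is not well-formed with a ParseError; the helper name is_well_formed and the sample documents are invented for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


CASES = {
    "ok": "<a><b>text</b></a>",
    "missing end tag": "<a><b>text</a>",
    "overlapping elements": "<a><b><c></b></c></a>",
    "two root elements": "<a/><b/>",
    "unquoted attribute value": "<a x=1/>",
    "duplicate attribute": '<a x="1" x="2"/>',
}

results = {name: is_well_formed(doc) for name, doc in CASES.items()}
```

Only the first document is accepted; each of the others violates one rule from the list.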
Only well-formed documents can be processed by XML parsers.
2.3 Namespaces
<library>
<description>Library of the CS
Department</description>
<book bid="HandMS2000">
<title>Principles of Data
Mining</title>
<description>
Short introduction to <em>data
mining</em>, useful
for the IRDM course
</description>
</book>
</library>
• Semantics of the description element is
ambiguous
• Content may be defined differently
• Renaming may be impossible (standards!)
→ Disambiguation of separate XML
applications using unique prefixes
Namespace Syntax
<dbs:book xmlns:dbs="http://www-dbs/dbs">
• xmlns: signals that a namespace definition happens
• dbs: prefix as abbreviation of the URI
• http://www-dbs/dbs: unique URI to identify the namespace


Namespace Example
<dbs:book xmlns:dbs="http://www-dbs/dbs">
<dbs:description> ...
</dbs:description>
<dbs:text>
<dbs:formula>
<mathml:math
xmlns:mathml="http://www.w3.org/1998/Math/MathML">
...
</mathml:math>
</dbs:formula>
</dbs:text>
</dbs:book>
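That the prefix is only an abbreviation for the URI can be seen when the document is parsed. In this sketch (Python's xml.etree.ElementTree; the element content and the lookup prefixes d and m are chosen freely for the example) the parser stores each name as {URI}localname, so a lookup works with any prefix that is mapped to the same URI.

```python
import xml.etree.ElementTree as ET

DOC = """<dbs:book xmlns:dbs="http://www-dbs/dbs">
  <dbs:description>Library book</dbs:description>
  <dbs:text>
    <dbs:formula>
      <mathml:math xmlns:mathml="http://www.w3.org/1998/Math/MathML"/>
    </dbs:formula>
  </dbs:text>
</dbs:book>"""

root = ET.fromstring(DOC)

# Prefixes used for searching; they need not match those in the document.
NS = {"d": "http://www-dbs/dbs", "m": "http://www.w3.org/1998/Math/MathML"}
description = root.find("d:description", NS)
math = root.find("d:text/d:formula/m:math", NS)
```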



Default Namespace
• Default namespace may be set for an
element and its content (but not its
attributes):
<book xmlns="http://www-dbs/dbs">
<description>...</description>
</book>

• Can be overridden in the elements by
specifying the namespace there (using prefix
or default namespace)
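A short check of both points, again as a sketch with Python's xml.etree.ElementTree. The bid attribute value is borrowed from the library example, while the note element and the http://example.org/other URI are hypothetical.

```python
import xml.etree.ElementTree as ET

DOC = """<book xmlns="http://www-dbs/dbs" bid="HandMS2000">
  <description>Short introduction</description>
  <other:note xmlns:other="http://example.org/other">overridden</other:note>
</book>"""

root = ET.fromstring(DOC)
child_tags = [child.tag for child in root]   # namespace of each child element
attribute_names = list(root.attrib)          # attributes stay outside the default namespace
```

The root and its description child pick up the default namespace, the prefixed child overrides it, and the unprefixed attribute bid belongs to no namespace.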



Part 3 – Defining XML Data
Formats
3.1 Document Type Definitions
3.2 XML Schema



3.1 Document Type Definitions
Sometimes XML is too flexible:
• Most programs can only process a subset of all
possible XML applications
• For exchanging data, the format (i.e., elements,
attributes and their semantics) must be fixed
Document Type Definitions (DTD) for
establishing the vocabulary for one XML
application (in some sense comparable to
schemas in databases)
A document is valid with respect to a DTD if it
conforms to the rules specified in that DTD.
DTD Example: Elements
<!ELEMENT article (title,author+,text)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT text (abstract,section*,literature?)>
<!ELEMENT abstract (#PCDATA)>
<!ELEMENT section (#PCDATA|index)*>
<!ELEMENT literature (#PCDATA)>
<!ELEMENT index (#PCDATA)>
• Content of the title element is parsed
character data
• Content of the text element may contain zero
or more section elements in this position
• Content of the article element is a
title element, followed by one or
more author elements,
followed by a text element
Element Declarations in DTDs
One element declaration for each element type:
<!ELEMENT element_name content_specification>
where content_specification can be
• (#PCDATA) parsed character data
• (child) one child element
• (c1,…,cn) a sequence of child elements c1…cn
• (c1|…|cn) one of the elements c1…cn
For each component c, possible counts can be
specified:
– c exactly one such element
– c+ one or more
– c* zero or more
– c? zero or one
Plus arbitrary combinations using parentheses:
<!ELEMENT f ((a|b)*,c+,(d|e))*>
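A content specification reads like a regular expression over the names of the child elements. As an illustration only (a validating parser does this internally; the translation by hand and the function name children_allowed are assumptions of this sketch), the declaration for f can be rewritten as a Python regular expression, with the comma becoming plain concatenation:

```python
import re

# ((a|b)*,c+,(d|e))* over single-letter child names a..e
CONTENT_MODEL_F = re.compile(r"((a|b)*c+(d|e))*")


def children_allowed(child_names):
    """Check a list of child element names against the content model of f."""
    return CONTENT_MODEL_F.fullmatch("".join(child_names)) is not None


empty_ok = children_allowed([])                          # the outer * allows no children
simple_ok = children_allowed(["a", "b", "c", "d"])
repeated_ok = children_allowed(["c", "e", "c", "c", "d"])
incomplete = children_allowed(["a", "b"])                # c+ and (d|e) are missing
```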
More on Element Declarations
• Elements with mixed content:
<!ELEMENT text (#PCDATA|index|cite|glossary)*>

• Elements with empty content:


<!ELEMENT image EMPTY>

• Elements with arbitrary content (not suitable
for production-level DTDs):
<!ELEMENT thesis ANY>



Attribute Declarations in DTDs
Attributes are declared per element:
<!ATTLIST section number CDATA #REQUIRED
title CDATA #REQUIRED>
declares two required attributes for element
section. Each attribute declaration names the
element (section), the attribute (number, title),
the attribute type (CDATA), and the attribute
default (#REQUIRED).
Possible attribute defaults:
• #REQUIRED is required in each element
instance
• #IMPLIED is optional
• #FIXED default always has this default value
• default has this default value if the
attribute is absent
Attribute Types in DTDs
• CDATA string data
• (A1|…|An) enumeration of all
possible values of the
attribute (each is an XML
name token)
• ID unique XML name to
identify the element
• IDREF refers to ID attribute of
some other element
("intra-document link")
• IDREFS list of IDREF, separated by
white space
Attribute Examples
<!ATTLIST publication type
(journal|inproceedings) #REQUIRED
pubid ID #REQUIRED>
<!ATTLIST cite cid IDREF #REQUIRED>
<!ATTLIST citation ref IDREF #IMPLIED
cid ID #REQUIRED>

<publications>
<publication type="journal" pubid="Weikum01">
<author>Gerhard Weikum</author>
<text>In the Web of 2010, XML <cite
cid="c12"/>...</text>
<citation cid="c12" ref="XML98"/>
<citation cid="c15">...</citation>
</publication>
<publication type="inproceedings"
pubid="XML98">
<text>XML, the Extensible Markup Language,
...</text>
</publication>
</publications>
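A validating parser checks that every IDREF value matches some ID value in the same document. Python's standard xml.etree.ElementTree does not validate against a DTD, so the sketch below imitates only that one check by hand; the tables ID_ATTRS and IDREF_ATTRS restate the attribute declarations, and the function name dangling_idrefs and the identifier c12 are chosen for this example.

```python
import xml.etree.ElementTree as ET

DOC = """<publications>
  <publication type="journal" pubid="Weikum01">
    <text>In the Web of 2010, XML <cite cid="c12"/>...</text>
    <citation cid="c12" ref="XML98"/>
  </publication>
  <publication type="inproceedings" pubid="XML98"/>
</publications>"""

# (element name, attribute name) pairs taken from the ATTLIST declarations.
ID_ATTRS = {("publication", "pubid"), ("citation", "cid")}
IDREF_ATTRS = {("cite", "cid"), ("citation", "ref")}


def dangling_idrefs(xml_text):
    """Return the sorted IDREF values that do not match any ID in the document."""
    ids, refs = set(), []
    for elem in ET.fromstring(xml_text).iter():
        for name, value in elem.attrib.items():
            if (elem.tag, name) in ID_ATTRS:
                ids.add(value)
            elif (elem.tag, name) in IDREF_ATTRS:
                refs.append(value)
    return sorted(ref for ref in refs if ref not in ids)


unresolved = dangling_idrefs(DOC)
```

Note that the check does not care which kind of element an IDREF points to; any ID in the document satisfies it.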
Linking DTD and XML Docs
• Document Type Declaration in the
XML document:
<!DOCTYPE article SYSTEM "http://www-dbs/article.dtd">
• DOCTYPE and SYSTEM are keywords
• article is the root element
• "http://www-dbs/article.dtd" is the URI for the DTD


Linking DTD and XML Docs
• Internal DTD:
<?xml version="1.0"?>
<!DOCTYPE article [
<!ELEMENT article
(title,author+,text)>
...
<!ELEMENT index (#PCDATA)>
]>
<article>
...
</article>

• Both ways can be mixed, internal DTD
overrides external entity information:
<!DOCTYPE article SYSTEM "article.dtd"
Flaws of DTDs
• No support for basic data types like integers,
doubles, dates, times, …
• No structured, self-definable data types
• No type derivation
• id/idref links are quite loose (target is not
specified)

→ XML Schema
3.2 XML Schema Basics
• XML Schema is an XML application
• Provides simple types (string, integer, dateTime,
duration, language, …)
• Allows defining possible values for elements
• Allows defining types derived from existing types
• Allows defining complex types
• Allows posing constraints on the occurrence of
elements
• Allows forcing uniqueness and foreign keys
Simplified XML Schema Example
<xs:schema>
<xs:element name="article">
<xs:complexType>
<xs:sequence>
<xs:element name="author"
type="xs:string"/>
<xs:element name="title"
type="xs:string"/>
<xs:element name="text">
<xs:complexType>
<xs:sequence>
<xs:element name="abstract"
type="xs:string"/>
<xs:element name="section"
type="xs:string"
minOccurs="0"
maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
