Integrative Programming And Technologies
(ITec4121)
Chapter Three
XML and XML Related Technologies
Introduction
XML (Extensible Markup Language) is a markup language designed to
store and transport data.
It was created to provide an easy way to use and store self-describing data.
(Self-describing data is data that describes both its content and
its structure.)
It is not a replacement for HTML.
It is designed to be self-descriptive and to carry data, not to
display data.
XML tags are not predefined. You must define your own tags.
XML is platform independent and language independent.
What makes XML truly powerful is its international acceptance.
Because it is platform independent, XML interfaces exist for databases,
programming languages, office applications, mobile phones, and more.
Features of XML
It separates data from HTML
It simplifies data sharing
It simplifies data transport
It simplifies Platform change
It increases data availability
It can be used to create new internet languages
Examples:
WSDL for describing available web services
WAP and WML as markup languages for handheld devices
RSS languages for news feeds
RDF and OWL for describing resources and ontology
SMIL for describing multimedia for the web
XML Document
Example 1:
<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to>John</to>
<from>Ahmed</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
The first line is the XML declaration. It defines the XML version (1.0)
and the encoding used (ISO-8859-1 = Latin-1/West European character
set).
The next line describes the root element of the document (like saying:
"this document is a note"):
<note>
The next 4 lines describe 4 child elements of the root (to, from,
heading, and body).
<to>John</to>
<from>Ahmed</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
And finally the last line defines the end of the root element.
</note>
Note: XML documents must contain a root element. This element is "the
parent" of all other elements
The elements in an XML document form a document tree.
All elements can have sub elements (child elements).
Example 2:
<root>
<child>
<subchild>.....</subchild>
</child>
</root>
The terms parent, child, and sibling are used to describe the
relationships between elements.
Parent elements have children.
Children on the same level are called siblings (brothers or sisters).
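The tree relationships described above can be explored with a short Python sketch. It uses the standard-library xml.etree.ElementTree module on the note document from Example 1; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

NOTE_XML = """<note>
  <to>John</to>
  <from>Ahmed</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""

root = ET.fromstring(NOTE_XML)             # <note> is the root, the parent of all other elements
children = [child.tag for child in root]   # direct children of the root; they are siblings

print("root:", root.tag)
print("children:", children)
for child in root:
    print(child.tag, "->", child.text)
```

Iterating over an element visits only its direct children, which matches the parent/child terminology used here.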
Example:
<?xml version="1.0"?>
<University>
<student>
<firstname>Abdi</firstname>
<lastname>Kemal</lastname>
<contact>0999044993</contact>
<email>[email protected]</email>
<address>
<city>Ambo</city>
<state>Oromia</state>
<pin>201007</pin>
</address>
</student>
</University>
XML Related Technologies
XML Attributes
XML elements can have attributes, which add information about the
element.
XML attributes enhance the properties of the elements.
Attribute values must always be quoted. Either single or double quotes can be used.
Example:
<book publisher="Tata McGraw Hill"></book>
Or
<book publisher='Tata McGraw Hill'></book>
Metadata should be stored as attributes and data should be stored as elements.
<book category="computer">
<author> A &amp; B </author>
</book>
Data can be stored in attributes or in child elements.
Difference between attribute and sub-element:
Attributes are part of markup, while sub elements are part of the basic
document contents.
Example
1st way:
<book publisher="Tata McGraw Hill"> </book>
2nd way:
<book>
<publisher> Tata McGraw Hill </publisher>
</book>
In the first way publisher is used as an attribute and in the second way
publisher is an element.
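From a program's point of view, the two ways are read differently. The following Python sketch (standard-library xml.etree.ElementTree, illustrative variable names) reads the publisher once as an attribute and once as a child element.

```python
import xml.etree.ElementTree as ET

as_attribute = ET.fromstring('<book publisher="Tata McGraw Hill"></book>')
as_element = ET.fromstring('<book><publisher>Tata McGraw Hill</publisher></book>')

# 1st way: the value is part of the markup and is read with get().
publisher_from_attribute = as_attribute.get("publisher")
# 2nd way: the value is document content and is read from the child element.
publisher_from_element = as_element.findtext("publisher")

print(publisher_from_attribute)
print(publisher_from_element)
```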
All elements can have text content and attributes (just like in HTML).
XML Comments
XML comments are used to make code more understandable to other
developers.
Comments add notes or lines that explain the purpose of the XML
code.
Syntax:
<!-- Write your comment-->
Note: You cannot nest one XML comment inside another.
Example:
<?xml version="1.0" encoding="UTF-8" ?>
<!--Students marks are uploaded by months-->
<students>
<student>
<name>Daba</name>
<marks>70</marks>
</student>
<student>
<name>Almaz</name>
<marks>60</marks>
</student>
</students>
Rules for adding XML comments:
Don't use a comment before an XML declaration.
You can use a comment anywhere in an XML document except
inside a tag or an attribute value.
Don't nest a comment inside another comment.
XML Validation
A well-formed XML document is an XML document with correct
syntax.
It is important to understand what a valid XML document is before
learning about XML validation.
An XML file can be validated in two ways:
1. against a DTD (Document Type Definition)
2. against an XSD (XML Schema Definition)
DTD and XSD are used to define XML structure.
Valid XML document:
It must be well formed (satisfy all the basic syntax conditions).
It must conform to a predefined DTD or XML schema.
Rules for well formed XML:
If an XML declaration is present, it must be the first thing in the document.
It must have exactly one root element.
Every start tag must have a matching end tag.
XML tags are case sensitive.
All elements must be closed.
All elements must be properly nested.
All attribute values must be quoted.
Entity references (such as &lt; and &amp;) must be used for special characters.
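The rules above can be checked mechanically by handing a document to a parser. The sketch below is one possible way to do this in Python with the standard-library xml.etree.ElementTree module; is_well_formed and the sample strings are hypothetical names made up for this illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses as XML, False if the parser reports a syntax error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

samples = {
    "correct": "<note><to>John</to></note>",
    "case mismatch": "<note><to>John</To></note>",      # tags are case sensitive
    "bad nesting": "<note><b><i>text</b></i></note>",   # elements must nest properly
    "unquoted attribute": "<note id=1></note>",         # attribute values must be quoted
    "two roots": "<a></a><b></b>",                      # only one root element is allowed
    "raw ampersand": "<author>A & B</author>",          # & must be written as &amp;
}

for label, text in samples.items():
    print(label, "->", is_well_formed(text))
```

This only tests well-formedness. Checking validity against a DTD or schema needs a validating parser, which the Python standard library does not provide.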
DTD (Document Type Definition) defines the legal building
blocks of an XML document. It is used to define document
structure with a list of legal elements and attributes.
XSD (XML Schema Definition) is itself written in XML
and uses namespaces to allow reuse of existing definitions.
It supports a large number of built-in data types and the definition of
derived data types.
Both DTD and XML Schema are used to check that a well-formed
XML document is also valid. We should avoid errors in XML
documents because a parser stops processing when it meets one.
Checking Validation using DTD
Before working with an XML DTD, you must understand validation. An
XML document is called "well-formed" if it contains the correct
syntax.
A well-formed and valid XML document is one which has been
validated against a DTD.
Example: well-formed and valid XML document.
employee.xml
<?xml version="1.0"?>
<!DOCTYPE employee SYSTEM "employee.dtd">
<employee>
<firstname>Abebe</firstname>
<lastname>Zewdie</lastname>
<email>[email protected]</email>
</employee>
In the above example, the DOCTYPE declaration refers to an
external DTD file.
The content of the file is shown below.
employee.dtd
<!ELEMENT employee (firstname,lastname,email)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT email (#PCDATA)>
Description of DTD:
<!DOCTYPE employee : defines that the root element of the
document is employee.
<!ELEMENT employee: defines that the employee element contains 3
elements: firstname, lastname and email, in that order.
<!ELEMENT firstname: defines that the firstname element is of type
#PCDATA (parsed character data).
<!ELEMENT lastname: defines that the lastname element is of type
#PCDATA (parsed character data).
<!ELEMENT email: defines that the email element is of type #PCDATA
(parsed character data).
XML DTD with entity declaration:
A doctype declaration can also define special strings that can be
used in the XML file.
An entity reference has three parts:
An ampersand (&)
An entity name
A semicolon (;)
Syntax to declare entity:
<!ENTITY entity-name "entity-value">
Example:
author.xml
<?xml version="1.0" standalone="yes" ?>
<!DOCTYPE author [
<!ELEMENT author (#PCDATA)>
<!ENTITY jm "John Michael">
]>
<author>&jm;</author>
In the above example, jm is an entity that is used inside the author
element. When the document is parsed, the reference &jm; is replaced
by the value of the jm entity, which is "John Michael".
Note: A single external DTD can be used by many XML files.
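Entity expansion can be observed by parsing author.xml. The sketch below assumes Python's standard-library xml.etree.ElementTree, whose underlying parser expands entities declared in the internal DTD subset; it does not validate the document against the ELEMENT declaration.

```python
import xml.etree.ElementTree as ET

AUTHOR_XML = """<?xml version="1.0" standalone="yes" ?>
<!DOCTYPE author [
<!ELEMENT author (#PCDATA)>
<!ENTITY jm "John Michael">
]>
<author>&jm;</author>"""

author = ET.fromstring(AUTHOR_XML)
# The parser has already replaced the reference &jm; with the entity value.
print(author.text)
```

Note that the reference must be written without a space between the ampersand and the entity name.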
XML CSS with DTD
CSS (Cascading Style Sheets) can be used to add style and
display information to an XML document. It can format the
whole XML document.
To link XML files with CSS, you should use the following syntax:
<?xml-stylesheet type="text/css" href="cssemployee.css"?>
XML CSS Example:
cssemployee.css
employee
{
background-color: pink;
}
firstname,lastname,email
{
font-size:25px;
display:block;
color: blue;
margin-left: 50px;
}
Create the DTD file:
employee.dtd
<!ELEMENT employee (firstname,lastname,email)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT email (#PCDATA)>
Example of XML file using CSS and DTD:
employee.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="cssemployee.css"?>
<!DOCTYPE employee SYSTEM "employee.dtd">
<employee>
<firstname>Abebe</firstname>
<lastname>Zewdie</lastname>
<email>[email protected]</email>
</employee>
Note: CSS is not generally used to format XML files. The W3C
recommends XSLT instead of CSS.
XML Schema
An XML schema language is used for expressing constraints
on XML documents.
Examples of schema languages are RELAX NG and XSD (XML
Schema Definition).
An XML schema is used to define the structure of an XML
document.
It is like a DTD but provides more control over XML structure.
Checking Validation with XSD
A well-formed and valid XML document is one which has been validated
against Schema.
Example: Create a schema file:
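The schema file itself is not reproduced in this text. The following Python sketch embeds one plausible employee.xsd, reconstructed from the element-by-element description given later in this chapter; it is an assumption, not the original file. The targetNamespace is taken from the xmlns used in employee.xml, and elementFormDefault="qualified" is a guess added so that the child elements fall in that namespace. The Python standard library cannot validate against an XSD, so the code only confirms that the schema is well-formed XML and lists the elements it declares.

```python
import xml.etree.ElementTree as ET

# Hypothetical employee.xsd, reconstructed from the description in this chapter.
# elementFormDefault="qualified" is an assumption, not taken from the source.
EMPLOYEE_XSD = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.javatpoint.com"
           xmlns="http://www.javatpoint.com"
           elementFormDefault="qualified">
  <xs:element name="employee">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="firstname" type="xs:string"/>
        <xs:element name="lastname" type="xs:string"/>
        <xs:element name="email" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

XS = "{http://www.w3.org/2001/XMLSchema}"

schema = ET.fromstring(EMPLOYEE_XSD)   # a schema is itself an XML document
declared = [el.get("name") for el in schema.iter(XS + "element")]
print(declared)
```

Actual validation of employee.xml against this schema would require a third-party validating library.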
The XML file below refers to the XML schema (XSD) file.
employee.xml
<?xml version="1.0"?>
<employee
xmlns="http://www.javatpoint.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.javatpoint.com employee.xsd">
<firstname>Abebe</firstname>
<lastname>Zewdie</lastname>
<email>[email protected]</email>
</employee>
Description of XML Schema:
<xs:element name="employee"> : defines the element name
employee.
<xs:complexType> : defines that the element 'employee' is complex
type.
<xs:sequence> : defines that the complex type is a sequence of
elements.
<xs:element name="firstname" type="xs:string"/> : defines that the
element 'firstname' is of string/text type.
<xs:element name="lastname" type="xs:string"/> : defines that the
element 'lastname' is of string/text type.
<xs:element name="email" type="xs:string"/> : defines that the
element 'email' is of string/text type.
DTD vs. XSD
CDATA and PCDATA
CDATA (Unparsed Character data):
CDATA contains the text which is not parsed further in an XML
document. Tags inside the CDATA text are not treated as markup and
entities will not be expanded.
Example of CDATA:
<?xml version="1.0"?>
<!DOCTYPE employee SYSTEM "employee.dtd">
<employee>
<![CDATA[
<firstname>Abebe</firstname>
<lastname>Zewdie</lastname>
<email>[email protected]</email>
]]>
</employee>
In the above CDATA example, CDATA is used just after the element
employee to make the data/text unparsed, so it will give the value of
employee:
<firstname>Abebe</firstname><lastname>Zewdie</lastname><email>[email protected]</email>
PCDATA (Parsed Character Data):
PCDATA is the text that will be parsed by a parser. Tags inside the
PCDATA will be treated as markup and entities will be expanded.
Example:
<?xml version="1.0"?>
<!DOCTYPE employee SYSTEM "employee.dtd">
<employee>
<firstname>Abebe</firstname>
<lastname>Zewdie</lastname>
<email>[email protected]</email>
</employee>
In the above example, the employee element contains 3 more
elements, 'firstname', 'lastname', and 'email', so the parser parses further to
get the data/text of firstname, lastname and email and gives the
value of employee as:
Abebe Zewdie
[email protected]
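The difference between CDATA and PCDATA shows up directly when both forms are parsed. This is a minimal Python sketch using the standard-library xml.etree.ElementTree module; the documents are shortened to two fields, and the DOCTYPE line is left out because this parser does not validate.

```python
import xml.etree.ElementTree as ET

cdata_doc = ET.fromstring(
    "<employee><![CDATA[<firstname>Abebe</firstname><lastname>Zewdie</lastname>]]></employee>"
)
pcdata_doc = ET.fromstring(
    "<employee><firstname>Abebe</firstname><lastname>Zewdie</lastname></employee>"
)

# CDATA: the tags are plain text, so employee has no child elements.
print(len(cdata_doc), repr(cdata_doc.text))
# PCDATA: the tags are markup, so employee has two child elements.
print(len(pcdata_doc), [child.text for child in pcdata_doc])
```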
XML Parsers
An XML parser is a software library or package that provides
interfaces for client applications to work with an XML document.
It is designed to read the XML and create a way for programs to use
XML.
It checks that the document is well formed, and a validating parser
also checks that it is valid.
Types of XML Parsers
Two types of XML Parsers:
1. SAX
2. XML DOM
SAX (Simple API for XML)
A SAX parser implements the SAX API. This API is event based and less
intuitive than a tree-based API.
Features of SAX Parser:
It does not create any internal structure.
Clients do not decide which methods to call; they override the methods
of the API and place their own code inside them.
It is an event-based parser; it works like an event handler in Java.
It is simple and memory efficient.
It is very fast and works for huge documents.
Because it is event based, its API is less intuitive.
Clients never see the whole document at once because the data is delivered in pieces.
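The event-based style can be illustrated with Python's standard-library xml.sax module. In this sketch, MarksHandler is a hypothetical handler class written for this illustration; it overrides the callback methods and collects the marks from a shortened version of the students document used earlier.

```python
import xml.sax

class MarksHandler(xml.sax.ContentHandler):
    """Collects the text of every <marks> element as the parser streams through."""

    def __init__(self):
        super().__init__()
        self.in_marks = False
        self.buffer = []
        self.marks = []

    def startElement(self, name, attrs):   # called when an opening tag is read
        if name == "marks":
            self.in_marks = True
            self.buffer = []

    def characters(self, content):         # may be called several times per text node
        if self.in_marks:
            self.buffer.append(content)

    def endElement(self, name):            # called when a closing tag is read
        if name == "marks":
            self.marks.append(int("".join(self.buffer)))
            self.in_marks = False

STUDENTS_XML = b"""<students>
  <student><name>Daba</name><marks>70</marks></student>
  <student><name>Almaz</name><marks>60</marks></student>
</students>"""

handler = MarksHandler()
xml.sax.parseString(STUDENTS_XML, handler)
print(handler.marks)
```

The handler keeps only the state it needs, so no tree of the document is ever built in memory.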
XML DOM
A DOM document is an object which contains all the information of an XML
document.
The DOM Parser implements a DOM API which is very simple to use.
DOM defines a standard way to access and manipulate XML documents.
The Document Object Model (DOM) is a programming API for HTML and
XML documents.
It defines the logical structure of documents and the way a document is
accessed and manipulated.
The Document Object Model can be used with any programming language.
The XML DOM makes a tree-structure view for an XML document.
Features of DOM Parser:
It creates an internal structure in memory, a DOM document object,
and client applications get information about the XML document by calling methods on this object.
It has a tree-based structure.
It supports both read and write operations, and the API is very simple to use.
It is preferred when random access to widely separated parts of a document is
required.
It is memory inefficient (it consumes more memory because the whole XML
document needs to be loaded into memory).
It is comparatively slower than other parsers.
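The read and write operations listed above can be sketched with Python's standard-library xml.dom.minidom module. The note document and the new values ("Almaz", a heading element) are illustrative choices only.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<note><to>John</to><from>Ahmed</from><body>Hello XML DOM</body></note>"
)
root = doc.documentElement

# Read: any part of the in-memory tree can be reached directly.
body_text = root.getElementsByTagName("body")[0].firstChild.data

# Write: change existing content and create a new element.
root.getElementsByTagName("to")[0].firstChild.data = "Almaz"
heading = doc.createElement("heading")
heading.appendChild(doc.createTextNode("Reminder"))
root.appendChild(heading)

print(body_text)
print(root.toxml())
```

Because the whole document is held in memory, changes like these are simple, at the cost of the memory use noted above.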
With the DOM, we can modify or delete the content of elements and also create new elements.
The elements and their content (text and attributes) are all known as nodes.
For example, consider this table, taken from an HTML document:
<TABLE>
<ROWS>
<TR>
<TD>A</TD>
<TD>B</TD>
</TR>
<TR>
<TD>C</TD>
<TD>D</TD>
</TR>
</ROWS>
</TABLE>
The Document Object Model represents this table like this:
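One way to see the node tree is to print it. The Python sketch below uses the standard-library xml.dom.minidom module; render is a hypothetical helper written for this illustration, and the table is written on one line so that no whitespace-only text nodes appear in the tree.

```python
from xml.dom import minidom

TABLE_XML = (
    "<TABLE><ROWS>"
    "<TR><TD>A</TD><TD>B</TD></TR>"
    "<TR><TD>C</TD><TD>D</TD></TR>"
    "</ROWS></TABLE>"
)

def render(node, depth=0):
    """Return one indented line per element or text node, in document order."""
    if node.nodeType == node.TEXT_NODE:
        lines = ["  " * depth + repr(node.data)]
    else:
        lines = ["  " * depth + node.nodeName]
    for child in node.childNodes:
        lines.extend(render(child, depth + 1))
    return lines

tree_lines = render(minidom.parseString(TABLE_XML).documentElement)
print("\n".join(tree_lines))
```

Each element becomes a node, and the letters A to D appear as text nodes that are children of their TD elements.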
Example 1: Load XML File
This example parses an XML document ("note.xml") into an XML
DOM object and extracts information from it with JavaScript.
Here is the XML file that contains the message.
note.xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to>[email protected]</to>
<from>[email protected]</from>
<body>Hello XML DOM</body>
</note>
The HTML file that extracts the data of XML document using DOM:
xmldom.html
<!DOCTYPE html>
<html>
<body>
<h1>Important Note</h1>
<div>
<b>To:</b> <span id="to"></span><br>
<b>From:</b> <span id="from"></span><br>
<b>Message:</b> <span id="message"></span>
</div>
<script>
var xmlhttp;
if (window.XMLHttpRequest)
{// code for IE7+, Firefox, Chrome, Opera, Safari
xmlhttp = new XMLHttpRequest();
}
else
{// code for IE6, IE5
xmlhttp = new ActiveXObject("Microsoft.XMLHTTP");
}
// Synchronous request (third argument false): send() returns after note.xml has loaded.
xmlhttp.open("GET", "note.xml", false);
xmlhttp.send();
var xmlDoc = xmlhttp.responseXML;
// Copy the text of each XML element into the matching <span>.
document.getElementById("to").innerHTML =
xmlDoc.getElementsByTagName("to")[0].childNodes[0].nodeValue;
document.getElementById("from").innerHTML =
xmlDoc.getElementsByTagName("from")[0].childNodes[0].nodeValue;
document.getElementById("message").innerHTML =
xmlDoc.getElementsByTagName("body")[0].childNodes[0].nodeValue;
</script>
</body>
</html>
XML Database
An XML database is a data persistence software system used for storing huge
amounts of information in XML format.
It provides a secure place to store XML documents.
You can query your stored data using XQuery, and export and
serialize it into the desired format.
XML databases are usually associated with document-oriented
databases.
Types of XML databases
1. XML-enabled database
2. Native XML database (NXD)
XML-enabled Database:
It works just like a relational database.
It is like an extension provided for the conversion of XML documents.
It stores data in tables in the form of rows and columns.
Native XML Database:
It stores large amounts of data.
Instead of a table format, it is based on a container format.
You can query data with XPath expressions.
It is preferred over an XML-enabled database because it is highly capable
of storing, maintaining and querying XML documents.
Example of XML database
<?xml version="1.0"?>
<contact-info>
<contact1>
<name>Abebe Zewdie</name>
<company>Ambo University</company>
<phone>(0120) 4256464</phone>
</contact1>
<contact2>
<name>John Michael </name>
<company>Ambo University</company>
<phone>09990449935</phone>
</contact2>
</contact-info>
In the above example, the contact-info element acts as a table that holds two contacts
(contact1 and contact2). Each one contains 3 elements: name, company and phone.
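Path-style queries over such a document can be sketched in Python. The standard-library xml.etree.ElementTree module supports only a limited subset of XPath and is not a database, so this is just an illustration of the query idea on the contact-info document.

```python
import xml.etree.ElementTree as ET

CONTACTS_XML = """<contact-info>
  <contact1>
    <name>Abebe Zewdie</name>
    <company>Ambo University</company>
    <phone>(0120) 4256464</phone>
  </contact1>
  <contact2>
    <name>John Michael</name>
    <company>Ambo University</company>
    <phone>09990449935</phone>
  </contact2>
</contact-info>"""

root = ET.fromstring(CONTACTS_XML)

# "./*/name" selects the name element of every child of the root.
names = [el.text for el in root.findall("./*/name")]
# A path can also address one specific contact.
phone = root.findtext("./contact2/phone")

print(names)
print(phone)
```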
XML Namespaces
An XML namespace is used to avoid element name conflicts in an XML document.
It is declared using a reserved XML attribute whose name must start
with "xmlns".
Syntax:
<element xmlns:name = "URL">
Where:
name is a namespace prefix.
URL is a namespace identifier
Example: XML file
<?xml version="1.0" encoding="UTF-8"?>
<cont:contact xmlns:cont="http://sssit.org/contact-us">
<cont:name>Vimal Jaiswal</cont:name>
<cont:company>SSSIT.org</cont:company>
<cont:phone>(0120) 425-6464</cont:phone>
</cont:contact>
In the above example:
Namespace Prefix: cont
Namespace Identifier(URL): http://sssit.org/contact-us
Generally this conflict occurs when we try to mix XML documents
from different XML application.
Take an example with two tables:
Table1:
<table>
<tr>
<td>Aries</td>
<td>Bingo</td>
</tr>
</table>
Table2: This table carries information about a computer table.
<table>
<name>Computer table</name>
<width>80</width>
<length>120</length>
</table>
If you add these two XML fragments together, there would be a
name conflict because both have a <table> element, although the two
elements have different content and meaning.
You can get rid of name conflict:
1. By Using a Prefix
2. By Using xmlns Attribute
By Using a Prefix:
You can easily avoid the name conflict by using a name prefix.
<h:table>
<h:tr>
<h:td>Aries</h:td>
<h:td>Bingo</h:td>
</h:tr>
</h:table>
<f:table>
<f:name>Computer table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
Note: In this example, there is no conflict because the two table elements
have different names (h:table and f:table).
By Using xmlns Attribute:
You can use xmlns attribute to define namespace with the following syntax:
<element xmlns:name = "URL">
Example:
<root>
<h:table xmlns:h="http://www.abc.com/TR/html4/">
<h:tr>
<h:td>Aries</h:td>
<h:td>Bingo</h:td>
</h:tr>
</h:table>
<f:table xmlns:f="http://www.xyz.com/furniture">
<f:name>Computer table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
</root>
In the above example, each <table> element defines a namespace. When
a namespace is defined for an element, the child elements with
the same prefix are associated with the same namespace.
Namespaces can also be declared in the root element:
<root xmlns:h="http://www.abc.com/TR/html4/"
xmlns:f="http://www.xyz.com/furniture">
<h:table>
<h:tr>
<h:td>Aries</h:td>
<h:td>Bingo</h:td>
</h:tr>
</h:table>
<f:table>
<f:name>Computer table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
</root>
Note: The namespace URI used in the above example does not need to
point to a real resource. It is not used by the parser to look up information. It
is only used to give the namespace a unique name.
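How a parser keeps the two table elements apart can be seen in a short Python sketch with the standard-library xml.etree.ElementTree module, which writes each element name as {namespace-URI}local-name. The prefix "furn" in the lookup is deliberately different from "f" in the document and is only an illustrative choice.

```python
import xml.etree.ElementTree as ET

DOC = """<root xmlns:h="http://www.abc.com/TR/html4/"
      xmlns:f="http://www.xyz.com/furniture">
  <h:table><h:tr><h:td>Aries</h:td><h:td>Bingo</h:td></h:tr></h:table>
  <f:table><f:name>Computer table</f:name><f:width>80</f:width><f:length>120</f:length></f:table>
</root>"""

root = ET.fromstring(DOC)

# Both children are called "table", but they carry different namespace URIs.
tags = [child.tag for child in root]

# The prefix used in a query is local to the query; only the URI has to match.
ns = {"furn": "http://www.xyz.com/furniture"}
furniture_name = root.findtext("furn:table/furn:name", namespaces=ns)

print(tags)
print(furniture_name)
```

The parser never fetches the URIs; it only compares them as strings.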
The Default Namespace
A default namespace saves you from using prefixes in all the child elements.
You can also use multiple namespaces within the same document; just
define a namespace on a child node.
Example:
<tutorials xmlns="http://www.javatpoint.com/java-tutorial">
<tutorial>
<title>Java-tutorial</title>
<author>Sonoo Jaiswal</author>
</tutorial>
...
</tutorials>
Note: If you define a namespace without a prefix, all descendant
elements are considered to belong to that namespace.
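The effect of a default namespace can be checked with a short Python sketch using the standard-library xml.etree.ElementTree module. The document is the tutorials example above with the elided part left out.

```python
import xml.etree.ElementTree as ET

DOC = """<tutorials xmlns="http://www.javatpoint.com/java-tutorial">
  <tutorial>
    <title>Java-tutorial</title>
    <author>Sonoo Jaiswal</author>
  </tutorial>
</tutorials>"""

root = ET.fromstring(DOC)

# Every element, including the unprefixed descendants, carries the default namespace.
all_tags = [el.tag for el in root.iter()]
for tag in all_tags:
    print(tag)
```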