XML
Dr. Alekha Kumar Mishra
Introduction
● XML stands for Extensible Markup Language.
● It is a text-based markup language derived
from Standard Generalized Markup Language
(SGML).
● XML tags identify the data and are used to
store and organize the data
● In contrast, HTML tags are used to display the
data
Use of XML
● To simplify the creation of HTML documents for
large web sites.
● Can be used to exchange the information between
organizations and systems.
● Can be used to store and arrange data in a way that can be
customized to your data handling needs.
● Can easily be merged with style sheets to create
almost any desired output.
● Virtually any type of data can be expressed as an
XML document.
XML syntax
<?xml version="1.0"?>
<contact-info>
<name>Deepika Sharma</name>
<company>MakeMyTrip</company>
<phone>(011) 124-4887</phone>
</contact-info>
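As a minimal sketch of how a program might read this document, the snippet below parses it with Python's standard xml.etree.ElementTree module. The variable names (DOC, fields) are illustrative and not part of the slides.

```python
import xml.etree.ElementTree as ET

# The contact-info document from the slide, embedded as a string.
DOC = """<?xml version="1.0"?>
<contact-info>
<name>Deepika Sharma</name>
<company>MakeMyTrip</company>
<phone>(011) 124-4887</phone>
</contact-info>"""

root = ET.fromstring(DOC)  # root is the <contact-info> element

# Map each child element's tag to its text content.
fields = {child.tag: child.text for child in root}

print(root.tag)
for tag, text in fields.items():
    print(f"{tag}: {text}")
```

The tags only label the data; how it is displayed is left entirely to the consuming program.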
XML Declaration
● The XML document can optionally have an XML
declaration.
● <?xml version="1.0" encoding="UTF-8"?>
● The XML declaration is case sensitive and must
begin with "<?xml", where "xml" is written in lower case.
● If the document contains an XML declaration, it must be
the very first thing in the document; nothing, not even white space, may precede it.
● An HTTP header (for example, the charset parameter of Content-Type) can
override the encoding value that you put in the XML declaration.
Tags and Elements
● An XML file is structured as a set of XML elements, also loosely called XML
nodes or XML tags.
– Example: <element>
● Each XML element must be closed, either with a matching end tag or,
for an empty element, with a self-closing tag
<element> ..... </element>, or
<element/>
● Elements can be nested but should not overlap each other
<?xml version="1.0"?>
<contact-info>
<company>TutorialsPoint</company>
</contact-info>
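The closing and nesting rules can be checked mechanically. The sketch below uses Python's standard xml.etree.ElementTree parser, which rejects any input that is not well formed. The helper name is_well_formed and the sample strings are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


nested = "<a><b>text</b></a>"        # properly nested
overlapping = "<a><b>text</a></b>"   # elements overlap
empty_element = "<element/>"         # self-closing empty element
# Second <contact-info> is a start tag, so nothing is ever closed.
unclosed = "<contact-info><company>TutorialsPoint</company><contact-info>"

for sample in (nested, overlapping, empty_element, unclosed):
    print(is_well_formed(sample), sample)
```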
Root element
● An XML document can have only one root element
● Correct
<root>
<x>...</x>
<y>...</y>
</root>
● Incorrect
<x>...</x>
<y>...</y>
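A quick way to see the single-root rule in action is to hand both variants to a parser. This is a small sketch using Python's standard xml.etree.ElementTree; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

one_root = "<root><x>...</x><y>...</y></root>"
two_roots = "<x>...</x><y>...</y>"

# The correct document parses, and x and y are children of root.
root = ET.fromstring(one_root)
child_tags = [child.tag for child in root]

# The incorrect document has two top-level elements, so the parser
# stops with an error when it reaches the second one.
try:
    ET.fromstring(two_roots)
    second_root_rejected = False
except ET.ParseError as err:
    second_root_rejected = True
    print("rejected:", err)
```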
References
● References usually add or include additional text or
markup in an XML document.
● References always begin with the symbol "&", which is
a reserved character and end with the symbol ";".
● XML has two types of references:
– Entity References: An entity reference contains a name
between the start and the end delimiters.
● Example: &amp; where amp is the name. The name refers to a
predefined string of text and/or markup.
– Character References: These contain a hash mark ("#") followed by
a number, such as &#65; (the letter A).
● The number always refers to the Unicode code point of a character.
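The following sketch shows both kinds of reference being expanded by a parser, and the reverse step of escaping reserved characters when writing XML. It uses only Python's standard library; the sample text is invented for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# &amp; and &lt; are predefined entity references;
# &#65; (decimal) and &#x42; (hexadecimal) are character references.
doc = "<note>Tom &amp; Jerry &lt;3 &#65;&#x42;</note>"
text = ET.fromstring(doc).text
print(text)

# Going the other way: escape reserved characters before
# placing raw text inside an element.
escaped = escape("a < b & c")
print(escaped)
```

The parser hands the application the expanded characters, so the references themselves never appear in the parsed text.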
XML Document
● An XML document can contain a wide variety of data
● Two major sections
– Document Prolog Section (XML declaration and DTD)
– Document Elements section
● <?xml version="1.0"?>
<contact-info>
<name>Deepika Sharma</name>
<company>MakeMyTrip</company>
<phone>(011) 124-4887</phone>
</contact-info>
XML Declaration
● The XML declaration contains details that prepare an XML
processor to parse the XML document
● When it is used, it must appear on the first line of the XML document
● Syntax <?xml version="ver_number"
encoding="encoding_standard"
standalone="standalone_status"
?>
● Examples (version is required whenever the declaration is present; encoding and standalone are optional):
– <?xml version="1.0" ?>
– <?xml version="1.0" encoding="UTF-8" standalone="no" ?>
– <?xml version='1.0' encoding='iso-8859-1' standalone='no' ?>
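To illustrate how the declaration prepares the parser, the sketch below feeds Python's standard xml.etree.ElementTree a byte string whose encoding is only known from the declaration, and then a document whose declaration is not at the very start. The sample data is invented for illustration.

```python
import xml.etree.ElementTree as ET

# "é" is stored as the single byte 0xE9 in ISO-8859-1; the parser
# learns how to decode it from the encoding attribute.
raw = b"<?xml version='1.0' encoding='iso-8859-1'?><name>Ren\xe9</name>"
name = ET.fromstring(raw).text
print(name)

# A declaration that is preceded by anything, even a newline,
# is rejected by the parser.
try:
    ET.fromstring("\n<?xml version='1.0'?><name>x</name>")
    misplaced_rejected = False
except ET.ParseError as err:
    misplaced_rejected = True
    print("rejected:", err)
```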
XML Element
● An element name can contain
any alphanumeric characters, but must not begin with a digit.
● The only punctuation marks allowed in names are the
hyphen (-), underscore (_) and period (.); a name must not begin with
a hyphen or period (the colon is reserved for namespaces).
● Names are case sensitive
● The names in the start and end tags of an
element must be identical
● An element, which is a
container, can contain text or
elements
<?xml version="1.0"?>
<contact-info>
<address category="residence">
<name>Tanmay Patil</name>
<company>TutorialsPoint</company>
<phone>(011) 123-4567</phone>
</address>
</contact-info>
Attributes
● An attribute specifies a single property for the element,
using a name/value pair.
● An XML-element can have one or more attributes.
– <a b="x" c="y" d="z"> .... </a>
● Attribute names in XML (unlike HTML) are case sensitive.
● The same attribute cannot appear twice in one start tag.
– <a b="x" c="y" b="z">....</a> is incorrect
● Attribute values must always appear in quotation marks.
– <a b=x>....</a> is incorrect
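The attribute rules above can be tried out against a parser. This is a minimal sketch with Python's standard xml.etree.ElementTree; the helper name parses is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET


def parses(text):
    """Return True if text is accepted by the XML parser."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Several distinct attributes on one element are fine.
attrs = ET.fromstring('<a b="x" c="y" d="z">....</a>').attrib

# Names are case sensitive, so B and b are two different attributes.
case_attrs = ET.fromstring('<a B="1" b="2"/>').attrib

duplicate_ok = parses('<a b="x" c="y" b="z">....</a>')  # b appears twice
unquoted_ok = parses('<a b=x>....</a>')                  # value not quoted

print(attrs, case_attrs, duplicate_ok, unquoted_ok)
```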
XML Attributes example
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE garden [
<!ELEMENT garden (plants)*>
<!ELEMENT plants (#PCDATA)>
<!ATTLIST plants category CDATA #REQUIRED>
]>
<garden>
<plants category="flowers" />
<plants category="shrubs"> </plants>
</garden>
XML Comments
● An XML comment has the following syntax:
<!-- Your comment -->
XML CDATA
● The term CDATA means Character Data
● CDATA sections are blocks of text that the parser does not
interpret, even where the text would otherwise be recognized as markup.
● PCDATA is the text that will be parsed by a parser. Tags
inside the PCDATA will be treated as markup and entities
will be expanded.
● Syntax:
<![CDATA[
characters with markup
]]>
XML CDATA
● Example:
<script>
<![CDATA[
<message> Welcome to TutorialsPoint
</message>
]]>
</script>
● Everything between <![CDATA[ and ]]>, including the
<message> and </message> tags, is treated as character data and not as markup.
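The effect of a CDATA section can be observed by parsing the example. In this sketch, which uses Python's standard xml.etree.ElementTree, the script element ends up with plain text and no child elements. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

doc = """<script>
<![CDATA[
<message> Welcome to TutorialsPoint </message>
]]>
</script>"""

script = ET.fromstring(doc)

# The <message> tags were not parsed as markup, so <script> has
# no child elements; the tags are simply part of its text.
children = list(script)
print(children)
print(script.text.strip())
```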
XML Validation
● Validation is the process of checking an XML
document against a set of declared rules.
● An XML document is said to be valid
– if its contents match the elements and attributes declared in the
associated document type definition (DTD), and
– if the document complies with the constraints
expressed in it.
● An XML parser can check a document at two levels:
– Well-formed XML document
– Valid XML document
Well-formed XML document
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE address
[
<!ELEMENT address (name,company,phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
]>
<address>
<name>Deepika Sharma</name>
<company>MakeMyTrip</company>
<phone>(011) 124-4887</phone>
</address>
XML DTD
● The XML Document Type Definition,
commonly known as DTD, is a way to describe
an XML language precisely. It is attached to a document by a
document type declaration (<!DOCTYPE ...>).
● DTDs check the vocabulary and the validity of the
structure of XML documents against the
grammatical rules of the appropriate XML language
● A DTD can be
– Internal DTD
– External DTD
Internal DTD
● A DTD is referred to as an internal DTD if the elements are
declared within the XML file itself
● Syntax:
– <!DOCTYPE root-element [element-declarations]>
● Example:
<!DOCTYPE address [
<!ELEMENT address (name,company,phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
]>
External DTD
● In an external DTD, elements are declared outside the
XML file.
● They are accessed by specifying a system
identifier, which may be either a path to a .dtd file or
a valid URL.
● When the document depends on an external DTD, the standalone attribute in
the XML declaration should be set to "no" (which is also the default)
● Syntax:
– <!DOCTYPE root-element SYSTEM "file-name">
– <!DOCTYPE name PUBLIC "-//Beginning XML//DTD Address Example//EN" "file-name">
External DTD example
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<!DOCTYPE address SYSTEM "address.dtd">
<address>
<name>Deepika Sharma</name>
<company>MakeMyTrip</company>
<phone>(011) 124-4887</phone>
</address>
The content of the DTD file address.dtd:
<!ELEMENT address (name,company,phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
Extracting information from XML
DOM object with JavaScript
<!DOCTYPE html>
<html>
<body>
<h1>TutorialsPoint DOM example </h1>
<div>
<b>Name:</b> <span id="name"></span><br>
<b>Company:</b> <span id="company"></span><br>
<b>Phone:</b> <span id="phone"></span>
</div>
<script>
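// Create the request object.
var xmlhttp;
if (window.XMLHttpRequest) { // code for IE7+, Firefox, Chrome, Opera, Safari
  xmlhttp = new XMLHttpRequest();
} else { // code for IE6, IE5
  xmlhttp = new ActiveXObject("Microsoft.XMLHTTP");
}
// The third argument (false) makes the request synchronous. This keeps the
// demo short, but synchronous requests on the main thread are deprecated.
xmlhttp.open("GET", "address.xml", false);
xmlhttp.send();
// responseXML holds the response parsed as an XML document.
var xmlDoc = xmlhttp.responseXML;
// Copy the text of each XML element into the matching <span>.
document.getElementById("name").innerHTML =
  xmlDoc.getElementsByTagName("name")[0].childNodes[0].nodeValue;
document.getElementById("company").innerHTML =
  xmlDoc.getElementsByTagName("company")[0].childNodes[0].nodeValue;
document.getElementById("phone").innerHTML =
  xmlDoc.getElementsByTagName("phone")[0].childNodes[0].nodeValue;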
</script>
</body>
</html>
XMLHttpRequest Object
● The XMLHttpRequest object is used to
exchange data with a server behind the
scenes.
● It is possible to update parts of a web page,
without reloading the whole page.
– variable=new XMLHttpRequest();
● For older browsers (IE5 and IE6), an ActiveXObject is used instead.
Send a Request To a Server
● The open(method,url,async) method specifies
the type of request, the URL, and whether the
request should be handled asynchronously or
not.
● send(string) sends the request off to the
server. The string argument is only used for POST requests
Server response and accessing elements
● The responseXML property of the XMLHttpRequest
object gets the response data as XML data
● Accessing elements from responseXML
– xmlDoc.getElementsByTagName("name")[0].childNodes[0].nodeValue;
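The same DOM calls exist outside the browser. As a sketch, Python's standard xml.dom.minidom module exposes getElementsByTagName, childNodes and nodeValue with the same meaning, so the access pattern can be tried without a server. The document string stands in for the response, and the helper first_text is a name invented for this example.

```python
from xml.dom import minidom

# Stand-in for the XML that the server would return for address.xml.
ADDRESS_XML = """<?xml version="1.0" encoding="UTF-8"?>
<address>
<name>Deepika Sharma</name>
<company>MakeMyTrip</company>
<phone>(011) 124-4887</phone>
</address>"""

xml_doc = minidom.parseString(ADDRESS_XML)


def first_text(doc, tag):
    """Text of the first element named tag (its first child is a text node)."""
    return doc.getElementsByTagName(tag)[0].childNodes[0].nodeValue


name = first_text(xml_doc, "name")
company = first_text(xml_doc, "company")
phone = first_text(xml_doc, "phone")
print(name, company, phone, sep=" | ")
```

Note that childNodes[0] is the text node inside the element, which is why nodeValue, rather than the element itself, carries the string.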
XML Schema
● XML Schema is commonly known as XML Schema
Definition (XSD)
● It describes and validates the structure and the content of
XML data.
● The schema element supports namespaces.
● It is similar to a database schema that describes the data
in a database.
● Syntax: schema declaration
– <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
XML Schema Example
● Simple Schema Definition type
<xs:element name="phone_number" type="xs:int" />
● Complex Schema Definition type
<xs:element name="Address">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string" />
<xs:element name="company" type="xs:string" />
<xs:element name="phone" type="xs:int" />
</xs:sequence>
</xs:complexType>
</xs:element>
XML Parsers
● An XML parser is a software library or a package
that provides an interface for client applications to
work with XML documents.
● It checks that the XML document is well formed
and may also validate the XML
document.
● Modern browsers have built-in XML
parsers.
Some XML Parsers
● MSXML (Microsoft Core XML Services) : This is a standard
set of XML tools from Microsoft that includes a parser.
– System.Xml.XmlDocument : This class is part of .NET library, which
contains a number of different classes related to working with XML.
● Java built-in parser : The library is designed such that you
can replace the built-in parser with an external implementation
such as Xerces from Apache or Saxon.
● Saxon : Saxon offers tools for parsing, transforming, and
querying XML.
● Xerces : Xerces is implemented in Java and is developed by
the open-source Apache Software Foundation
XSL
● XSL is a language for expressing style sheets.
● Like a CSS style sheet, an XSL style sheet is a file that describes how to
display an XML document of a given type.
● XSL shares the functionality and is compatible with CSS2.
● It adds a transformation language for XML documents: XSLT.
– Now used as a general purpose XML processing language.
– XSLT is widely used for purposes other than XSL, like generating
HTML web pages from XML data.
● It also adds advanced styling features, expressed by an XML
document type which defines a set of elements called
Formatting Objects, and attributes
How it works
● Styling requires two things: a source XML document,
containing the information that the style sheet
will display, and the style sheet itself, which
describes how to display a document of a
given type.
Example
The XML file
<scene>
<FX>General Road Building noises.</FX>
<speech speaker="Prosser">
Come off it Mr Dent, you can't win
you know. There's no point in lying
down in the path of progress. ...
</speech>
<speech speaker="Arthur">
I've gone off the idea of progress.
It's overrated
</speech>
</scene>
The XSL style sheet
<xsl:template match="FX">
<fo:block font-weight="bold">
<xsl:apply-templates/>
</fo:block>
</xsl:template>
<xsl:template match="speech[@speaker='Arthur']">
<fo:block background-color="blue">
<xsl:value-of select="@speaker"/>:
<xsl:apply-templates/>
</fo:block>
</xsl:template>
...
XSLT
● With XSLT (eXtensible Stylesheet Language Transformations) we can
transform an XML document into HTML.
● XSLT is the recommended style sheet language for XML.
● XSLT is far more sophisticated than CSS.
● With XSLT we can add/remove elements and attributes to or from the
output file.
● Other operations on elements include
– Rearranging
– Sorting
– Testing and making decisions about which elements to hide and display
– and a lot more.
● XSLT uses XPath to find information in an XML document.
XSLT Example
<?xml version="1.0" encoding="UTF-8"?>
<breakfast_menu>
<food>
<name>Belgian Waffles</name>
<price>$5.95</price>
<description>Two of our famous Belgian Waffles with plenty of real maple syrup</description>
<calories>650</calories>
</food>
<food>
<name>Strawberry Belgian Waffles</name>
<price>$7.95</price>
<description>Light Belgian waffles covered with strawberries and whipped cream</description>
<calories>900</calories>
</food>
<food>
<name>Berry-Berry Belgian Waffles</name>
<price>$8.95</price>
<description>Light Belgian waffles covered with an assortment of fresh berries and whipped cream</description>
<calories>900</calories>
</food>
<food>
<name>French Toast</name>
<price>$4.50</price>
<description>Thick slices made from our homemade sourdough bread</description>
<calories>600</calories>
</food>
<food>
<name>Homestyle Breakfast</name>
<price>$6.95</price>
<description>Two eggs, bacon or sausage, toast, and our ever-popular hash browns</description>
<calories>950</calories>
</food>
</breakfast_menu>
XSLT Stylesheet Example
<?xml version="1.0" encoding="UTF-8"?>
<html xsl:version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<body style="font-family:Arial;font-size:12pt;background-color:#EEEEEE">
<xsl:for-each select="breakfast_menu/food">
<div style="background-color:teal;color:white;padding:4px">
<span style="font-weight:bold"><xsl:value-of select="name"/> - </span>
<xsl:value-of select="price"/>
</div>
<div style="margin-left:20px;margin-bottom:1em;font-size:10pt">
<p>
<xsl:value-of select="description"/>
<span style="font-style:italic"> (<xsl:value-of select="calories"/> calories per serving) </span>
</p>
</div>
</xsl:for-each>
</body>
</html>
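Python's standard library has no XSLT processor, so the sketch below is not XSLT. It only mirrors, with xml.etree.ElementTree, what the stylesheet's xsl:for-each and xsl:value-of instructions do: visit every food element and pull out the text of selected children. The menu is shortened to two entries from the example, and the output is plain text rather than HTML.

```python
import xml.etree.ElementTree as ET

# Two entries taken from the breakfast menu example, descriptions omitted.
MENU = """<breakfast_menu>
<food>
<name>Belgian Waffles</name>
<price>$5.95</price>
<calories>650</calories>
</food>
<food>
<name>French Toast</name>
<price>$4.50</price>
<calories>600</calories>
</food>
</breakfast_menu>"""

menu = ET.fromstring(MENU)  # menu is the <breakfast_menu> element

lines = []
# Rough counterpart of <xsl:for-each select="breakfast_menu/food">.
for food in menu.findall("food"):
    # findtext plays the role of <xsl:value-of select="..."/>.
    name = food.findtext("name")
    price = food.findtext("price")
    calories = food.findtext("calories")
    lines.append(f"{name} - {price} ({calories} calories per serving)")

print("\n".join(lines))
```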
XPath
● XPath is a major element in the XSLT
standard.
● XPath uses path expressions to select nodes
or node-sets in an XML document.
● XPath expressions can be used in JavaScript,
Java, XML Schema, PHP, Python, C and C++,
and lots of other languages.
XPath expressions and results
XPath Expression : Result
– /bookstore/book[1] : Selects the first book element that is the child of the bookstore element
– /bookstore/book[last()] : Selects the last book element that is the child of the bookstore element
– /bookstore/book[last()-1] : Selects the last but one book element that is the child of the bookstore element
– /bookstore/book[position()<3] : Selects the first two book elements that are children of the bookstore element
– //title[@lang] : Selects all the title elements that have an attribute named lang
– //title[@lang='en'] : Selects all the title elements that have a "lang" attribute with a value of "en"
– /bookstore/book[price>35.00] : Selects all the book elements of the bookstore element that have a price element with a value greater than 35.00
– /bookstore/book[price>35.00]/title : Selects all the title elements of the book elements of the bookstore element that have a price element with a value greater than 35.00
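Some of these expressions can be tried with Python's standard xml.etree.ElementTree, which supports a limited subset of XPath. The bookstore document below is invented for this sketch. Paths are written relative to the bookstore root element, and the price comparison, which lies outside that subset, is done in ordinary Python instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical bookstore document, made up for this example.
BOOKSTORE = """<bookstore>
  <book><title lang="en">Harry Potter</title><price>29.99</price></book>
  <book><title lang="en">Learning XML</title><price>39.95</price></book>
  <book><title>XQuery Kick Start</title><price>49.99</price></book>
</bookstore>"""

root = ET.fromstring(BOOKSTORE)  # root is the <bookstore> element

first = root.findall("book[1]")              # /bookstore/book[1]
last = root.findall("book[last()]")          # /bookstore/book[last()]
second_last = root.findall("book[last()-1]") # /bookstore/book[last()-1]
with_lang = root.findall(".//title[@lang]")  # //title[@lang]
english = root.findall(".//title[@lang='en']")  # //title[@lang='en']

# /bookstore/book[price>35.00]/title, filtered by hand because
# ElementTree's XPath subset has no numeric comparisons.
expensive_titles = [
    book.findtext("title")
    for book in root.findall("book")
    if float(book.findtext("price")) > 35.00
]

print(first[0].findtext("title"), last[0].findtext("title"), expensive_titles)
```

A full XPath engine would accept every expression in the list directly; the subset used here is a property of this particular library, not of XPath.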
XLink
● XLink is used to create hyperlinks within XML
documents
● Any element in an XML document can behave
as a link
● With XLink, the links can be defined outside
the linked files
● XLink is a W3C Recommendation
XLink Syntax
● In HTML, the <a> element defines a hyperlink.
However, this is not how it works in XML.
● In XML documents, we can use whatever
element names we wish - therefore it is
impossible for browsers to predict what link
elements will be called in XML documents.
XLink Example
<?xml version="1.0" encoding="UTF-8"?>
<bookstore xmlns:xlink="http://www.w3.org/1999/xlink">
<book title="Harry Potter">
<description
xlink:type="simple"
xlink:href="/images/HPotter.gif"
xlink:show="new">
As his fifth year at Hogwarts School of Witchcraft and
Wizardry approaches, 15-year-old Harry Potter is.......
</description>
</book>
<book title="XQuery Kick Start">
<description
xlink:type="simple"
xlink:href="/images/XQuery.gif"
xlink:show="new">
XQuery Kick Start delivers a concise introduction
to the XQuery standard.......
</description>
</book>
</bookstore>
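Because the link attributes live in the XLink namespace, a program recognizes a link by the namespace URI rather than by the element name. The sketch below reads the links from a shortened copy of the example with Python's standard xml.etree.ElementTree. The description text is elided and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"

# Shortened copy of the bookstore example; description text elided.
DOC = """<bookstore xmlns:xlink="http://www.w3.org/1999/xlink">
<book title="Harry Potter">
<description xlink:type="simple" xlink:href="/images/HPotter.gif" xlink:show="new">...</description>
</book>
<book title="XQuery Kick Start">
<description xlink:type="simple" xlink:href="/images/XQuery.gif" xlink:show="new">...</description>
</book>
</bookstore>"""

root = ET.fromstring(DOC)

links = {}
for book in root.findall("book"):
    description = book.find("description")
    # ElementTree spells namespaced attributes as {namespace-uri}local-name.
    if description.get(f"{{{XLINK_NS}}}type") == "simple":
        links[book.get("title")] = description.get(f"{{{XLINK_NS}}}href")

print(links)
```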
XPointer
● XPointer allows links to point to specific parts
of an XML document
● XPointer uses XPath expressions to navigate
in the XML document
● XPointer is also a W3C Recommendation
XPointer Example
● This XML document uses id attributes on each element
● So, instead of linking to the entire document (as with
XLink), XPointer allows you to link to specific parts
of the document.
<?xml version="1.0" encoding="UTF-8"?>
<dogbreeds>
<dog breed="Rottweiler" id="Rottweiler">
<picture url="https://dog.com/rottweiler.gif" />
<history>The Rottweiler's ancestors were probably Roman
drover dogs.....</history>
<temperament>Confident, bold, alert and imposing, the Rottweiler
is a popular choice for its ability to protect....</temperament>
</dog>
<dog breed="FCRetriever" id="FCRetriever">
<picture url="https://dog.com/fcretriever.gif" />
<history>One of the earliest uses of retrieving dogs was to
help fishermen retrieve fish from the water....</history>
<temperament>The flat-coated retriever is a sweet, exuberant,
lively dog that loves to play and retrieve....</temperament>
</dog>
</dogbreeds>
XPointer Example
● To link to a specific part of a page, add a number sign (#)
and an XPointer expression after the URL in the xlink:href
attribute,
– xlink:href="https://dog.com/dogbreeds.xml#xpointer(id('Rottweiler'))".
– The expression refers to the element in the target document
with the id value of "Rottweiler".
● XPointer also allows a shorthand method for linking to an
element with an id. We can use the value of the id directly,
– xlink:href="https://dog.com/dogbreeds.xml#Rottweiler".
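To make the two link forms concrete, the sketch below extracts the id from either form and looks it up in a shortened copy of the dogbreeds document. It is not an XPointer implementation: it handles only these two forms, and it treats any attribute named id as an identifier. The function names target_id and resolve are invented for this example.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

# Shortened copy of the dogbreeds document; element content elided.
DOGBREEDS = """<dogbreeds>
<dog breed="Rottweiler" id="Rottweiler"><history>...</history></dog>
<dog breed="FCRetriever" id="FCRetriever"><history>...</history></dog>
</dogbreeds>"""


def target_id(href):
    """Return the id named by the part of href after '#', or None."""
    fragment = urldefrag(href).fragment
    # Full form: xpointer(id('Rottweiler'))
    match = re.fullmatch(r"xpointer\(id\('([^']+)'\)\)", fragment)
    if match:
        return match.group(1)
    # Shorthand form: the fragment is the id itself.
    return fragment or None


def resolve(href, root):
    """Return the element of root that href points to, or None."""
    wanted = target_id(href)
    if wanted is None:
        return None
    for element in root.iter():
        if element.get("id") == wanted:
            return element
    return None


root = ET.fromstring(DOGBREEDS)
full = "https://dog.com/dogbreeds.xml#xpointer(id('Rottweiler'))"
short = "https://dog.com/dogbreeds.xml#FCRetriever"
print(resolve(full, root).get("breed"), resolve(short, root).get("breed"))
```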
XPointer Example
<?xml version="1.0" encoding="UTF-8"?>
<mydogs xmlns:xlink="http://www.w3.org/1999/xlink">
<mydog>
<description>
Anton is my favorite dog. He has won a lot of.....
</description>
<fact xlink:type="simple" xlink:href="https://dog.com/dogbreeds.xml#Rottweiler">
Fact about Rottweiler
</fact>
</mydog>
<mydog>
<description>
Pluto is the sweetest dog on earth......
</description>
<fact xlink:type="simple"
xlink:href="https://dog.com/dogbreeds.xml#FCRetriever">
Fact about flat-coated Retriever
</fact>
</mydog>
</mydogs>
End of XML