CSE2045Y
Web Application Development
Lecture 3
XML (eXtensible Markup Language)
Agenda
• What is XML?
• HTML v/s XML
• Reasons to use XML
• How does XML work?
• XML Tree Structure
– XML Elements
– XML Attributes
• XML Rules
• XML Errors
What is XML?
• XML stands for eXtensible Markup Language.
• XML is a markup language much like HTML.
• XML was designed to store and transport data.
• XML was designed to be self-descriptive.
• XML is a W3C Recommendation.
HTML v/s XML (1)
• XML and HTML were designed with different
goals:
– XML was designed to carry data - with focus on
what data is.
– HTML was designed to display data - with focus on
how data looks.
– XML tags are not predefined like HTML tags are.
HTML v/s XML (2)
Technology                  HTML            XML
Use                         Display data    Describe, store and transfer data
Data formatting             CSS             XSL
Constraints/Rules           None            XSD
Linking to other documents  <a href=" ">    XLink
Reasons to use XML
1. Self-Describing data
– User-defined tags (eXtensible)
– E.g. <name>Armadillo</name>
2. Interchange of data among applications
– Non-proprietary, portable, use any tool that
understands xml
3. Structured Data
– Specify the relations between elements
– E.g. Name must consist of First, Middle and Last
name.
XML Example 1
<?xml version="1.0" encoding="UTF-8"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
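A minimal sketch of how a program might read this note, assuming Python's standard xml.etree.ElementTree module (the lecture does not prescribe a language; the helper name read_note is made up for illustration):

```python
import xml.etree.ElementTree as ET

# The note from Example 1, kept as bytes so the prolog can stay in place.
NOTE_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""


def read_note(xml_bytes):
    """Return the child elements of the root as a {tag: text} dictionary."""
    root = ET.fromstring(xml_bytes)
    return {child.tag: child.text for child in root}


note = read_note(NOTE_XML)
print(note["from"], "->", note["to"], ":", note["body"])
```

Because the tags describe the data, the program can look fields up by name rather than by position.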
XML Example 2
<?xml version="1.0" encoding="UTF-8"?>
<breakfast>
  <food>
    <name>Belgian Waffles</name>
    <price>$5.95</price>
    <description>
      Two of our famous Belgian Waffles with plenty of real maple syrup
    </description>
    <calories>650</calories>
  </food>
  <food>
    <name>Strawberry Belgian Waffles</name>
    <price>$7.95</price>
    <description>
      Light Belgian waffles covered with strawberries and whipped cream
    </description>
    <calories>900</calories>
  </food>
</breakfast>
How Can XML be used?
• XML separates Data from Presentation
– XML does not carry any information about how to be
displayed.
– The same XML data can be used in many different
presentation scenarios.
• XML is often a Complement to HTML
– In many HTML applications, XML is used to store or
transport data, while HTML is used to format and display
the same data.
• XML separates Data from HTML
– With a few lines of JavaScript code, you can read an XML
file and update the data content of any HTML page.
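The separation of data from presentation can be sketched as follows. The slide mentions JavaScript; this sketch uses Python's standard library instead, purely for illustration, and the function names (load_menu, as_text, as_html) are hypothetical:

```python
import xml.etree.ElementTree as ET
from html import escape

# A shortened version of the breakfast menu from Example 2.
MENU_XML = """<breakfast>
  <food><name>Belgian Waffles</name><price>$5.95</price></food>
  <food><name>Strawberry Belgian Waffles</name><price>$7.95</price></food>
</breakfast>"""


def load_menu(xml_text):
    """Extract (name, price) pairs; the XML says nothing about layout."""
    root = ET.fromstring(xml_text)
    return [(f.findtext("name"), f.findtext("price")) for f in root.findall("food")]


def as_text(items):
    """Presentation 1: one plain-text line per item."""
    return "\n".join(f"{name}: {price}" for name, price in items)


def as_html(items):
    """Presentation 2: an HTML table built from the same data."""
    rows = "".join(
        f"<tr><td>{escape(name)}</td><td>{escape(price)}</td></tr>"
        for name, price in items
    )
    return f"<table>{rows}</table>"


menu = load_menu(MENU_XML)
print(as_text(menu))
print(as_html(menu))
```

The XML is loaded once and rendered twice; changing the presentation never requires touching the data.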
XML Tree Structure
• XML documents form a tree structure that
starts at "the root" and branches to "the
leaves".
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
  <book category="cooking">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="children">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="web">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>
Viewing the XML data
• The XML data can be displayed in a browser window.
XML Errors
• If the XML is not well-formed, the error will be
highlighted.
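Parsers report such errors too. A minimal sketch, assuming Python's xml.etree.ElementTree, whose ParseError carries the line and column of the problem (the function name locate_error is hypothetical):

```python
import xml.etree.ElementTree as ET


def locate_error(xml_text):
    """Return None if xml_text is well-formed, else (line, column, message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position  # where the parser gave up
        return line, column, str(err)
    return None


# The closing tag on line 2 does not match the opening <to> tag.
broken = "<note>\n  <to>Tove</from>\n</note>"
print(locate_error(broken))
print(locate_error("<note><to>Tove</to></note>"))
```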
XML Tree Elements (1)
• XML documents are formed as element trees.
• An XML tree starts at a root element and
branches from the root to child elements.
• All elements can have sub-elements (child
elements).
XML Tree Elements (2)
• The terms parent, child, and sibling are used to
describe the relationships between elements.
• Parents have children. Children have parents.
Siblings are children on the same level (brothers
and sisters).
• All elements can have text content (Harry Potter)
and attributes (category="cooking").
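These relationships can be made visible by walking the tree. A small sketch, assuming Python's xml.etree.ElementTree and a shortened copy of the bookstore document (the function name describe is hypothetical):

```python
import xml.etree.ElementTree as ET

BOOKSTORE_XML = """<bookstore>
  <book category="cooking">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
  </book>
  <book category="children">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
  </book>
</bookstore>"""


def describe(element, depth=0):
    """Return one indented line per element: tag, attributes, text content."""
    text = (element.text or "").strip()
    lines = ["  " * depth + f"{element.tag} {element.attrib} {text!r}"]
    for child in element:  # iterating a parent yields its children in order
        lines.extend(describe(child, depth + 1))
    return lines


root = ET.fromstring(BOOKSTORE_XML)
print("\n".join(describe(root)))

# The two <book> elements are siblings: both are children of <bookstore>.
print([book.get("category") for book in root])
```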
Self-Describing Syntax (1)
• The first line of the bookstore example is the
prolog, which defines the XML version and the
character encoding.
– The XML prolog is optional. If it exists, it must come
first in the document.
• The next line, <bookstore>, opens the root element
of the document.
• The line after that starts a <book> element.
Self-Describing Syntax (2)
• Each <book> element has 4 child elements:
<title>, <author>, <year>, <price>.
• The closing </book> tag ends the book element.
XML Entity References (1)
• Some characters have a special meaning in
XML.
• If you place a character like "<" inside an XML
element, it will generate an error because the
parser interprets it as the start of a new
element.
• To avoid this error, replace the "<" character
with an entity reference:
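A minimal sketch of the difference, assuming Python's xml.etree.ElementTree (the element name message and the helper try_parse are made up for illustration):

```python
import xml.etree.ElementTree as ET

raw = "<message>salary < 1000</message>"       # bare "<" inside the text
fixed = "<message>salary &lt; 1000</message>"  # "<" written as &lt;


def try_parse(xml_text):
    """Return the element's text, or None if the XML is not well-formed."""
    try:
        return ET.fromstring(xml_text).text
    except ET.ParseError:
        return None


print(try_parse(raw))    # None
print(try_parse(fixed))  # salary < 1000
```

The parser converts the entity reference back into the "<" character, so the program sees the intended text.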
XML Entity References (2)
• There are 5 pre-defined entity references in
XML:
&lt;     <    less than
&gt;     >    greater than
&amp;    &    ampersand
&apos;   '    apostrophe
&quot;   "    quotation mark
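When XML is generated by a program, the replacement is usually automated. A sketch assuming Python's xml.sax.saxutils.escape, which handles &, < and > by default; the quote mappings and the helper make_element are additions for illustration:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# escape() covers &, < and > on its own; the two quote characters are
# passed in explicitly so that all five predefined entities are used.
QUOTES = {'"': "&quot;", "'": "&apos;"}


def make_element(tag, text):
    """Wrap text in <tag>...</tag>, escaping the special characters."""
    return f"<{tag}>{escape(text, QUOTES)}</{tag}>"


original = 'Tom & "Jerry" say 1 < 2'
doc = make_element("note", original)
print(doc)

# Parsing turns the entity references back into the original characters.
print(ET.fromstring(doc).text == original)
```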
Activity 1
• Given the following XML Tree diagram, write
the XML.
https://www.tutorialspoint.com/xml/images/tree_structure.jpg
XML - is that it?
• Well, that was fairly easy, most probably easier
than HTML, since I do not have to remember tags
like <h1>, <br/>, <div> and so on.
• Question
– How do we validate whether the XML is correct?
• XML should be well-formed
– XML should therefore follow certain rules:
• Syntax rules
• Naming rules
XML Syntax rules (1)
https://www.tutorialspoint.com/xml/images/syntaxrules.png
XML Syntax Rules (2)
• XML Documents Must Have a Root Element.
• All XML Elements Must Have a Closing Tag.
• XML Tags are Case Sensitive.
• XML Elements Must be Properly Nested.
• XML Attribute Values Must be Quoted.
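Each rule can be checked by feeding a parser one document that breaks it. A sketch assuming Python's xml.etree.ElementTree; the sample documents and the helper is_well_formed are made up for illustration:

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# One sample per rule on this slide, plus a valid document for comparison.
SAMPLES = {
    "valid document": "<note><to>Tove</to></note>",
    "two root elements": "<to>Tove</to><from>Jani</from>",
    "missing closing tag": "<note><to>Tove</note>",
    "case mismatch": "<Note></note>",
    "improper nesting": "<b><i>text</b></i>",
    "unquoted attribute": "<note date=2024></note>",
}

for label, text in SAMPLES.items():
    print(f"{label:20} -> {is_well_formed(text)}")
```

Only the first sample is accepted; every other one breaks exactly one of the five rules.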
XML Naming Rules
• Element names are case-sensitive.
• Element names must start with a letter or
underscore.
• Element names cannot start with the letters
xml (or XML, or Xml, etc).
• Element names can contain letters, digits,
hyphens, underscores, and periods.
• Element names cannot contain spaces.
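The naming rules above can be approximated with a regular expression. This is a simplified sketch in Python: it covers ASCII names only and follows the slide's rules rather than the full XML specification, and the function name is_valid_name is hypothetical:

```python
import re

# First character: a letter or underscore.
# Remaining characters: letters, digits, hyphens, underscores, periods.
# (ASCII-only simplification; the XML specification allows more characters.)
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_valid_name(name):
    """Check an element name against the naming rules on this slide."""
    if name.lower().startswith("xml"):  # xml, XML, Xml, ... are not allowed
        return False
    return NAME_RE.fullmatch(name) is not None


for candidate in ["first_name", "price.usd", "2nd-author", "XmlData", "first name"]:
    print(f"{candidate!r:14} -> {is_valid_name(candidate)}")
```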
XML Elements
• An XML element is everything from (including)
the element's start tag to (including) the
element's end tag.
• An element can contain:
– text
– attributes
– other elements
– or a mix of the above
XML Elements Examples
• <bookstore> is the root element.
• <title>, <author>, <year>, and <price> have
text content.
• <bookstore> and <book> have element
content.
Activity 2
• Which of these XML elements are invalid? State
why.
<123>
<author>John Smith </Author>
<xml-tag>
<my tag>
XML Attributes
• Attributes are name/value pairs associated with
an Element.
• Value must be in quotes (either single or double
quotes).
– We can use single on some attributes and double on
others, but you can't mix them in a single attribute.
• No element can contain two attributes with the
same name.
• XML elements can have more than one
attribute.
XML Attributes Examples
<book category="cooking">
<male gender="male">
<male gender='male'>
• <male gender='male"> is wrong
(mismatched delimiters).
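A short sketch of how a parser treats these attribute rules, assuming Python's xml.etree.ElementTree (the sample elements and the helper parse_attributes are made up for illustration):

```python
import xml.etree.ElementTree as ET


def parse_attributes(xml_text):
    """Return the element's attributes as a dict, or None if parsing fails."""
    try:
        return dict(ET.fromstring(xml_text).attrib)
    except ET.ParseError:
        return None


# Single and double quotes may both appear, one style per attribute.
print(parse_attributes("""<book category="cooking" lang='en'/>"""))

# The same attribute name twice on one element is rejected.
print(parse_attributes("""<book category="a" category="b"/>"""))

# Opening with ' and closing with " is rejected as well.
print(parse_attributes("""<book category='cooking"/>"""))
```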
Activity 3
• Which of these XML attributes are invalid? State why.
<input checked>
<input checked='true">
<test myAttr='some data goes here >some other
data</test>
<name first="John" middle="Fitzgerald Johansen"
last="Doe"></name>
XML Comments
• Comments start with the string <!-- and end
with the string -->.
• Example:
<name nickname="Shiny John">
  <first>John</first>
  <!--John has no middle name -->
  <middle></middle>
  <last>Doe</last>
</name>
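Comments are meant for human readers, not for the data. As a sketch, assuming Python's xml.etree.ElementTree with its default parser settings, the comment does not show up among the parsed child elements (the helper child_tags is hypothetical):

```python
import xml.etree.ElementTree as ET

NAME_XML = """<name nickname="Shiny John">
  <first>John</first>
  <!--John has no middle name -->
  <middle></middle>
  <last>Doe</last>
</name>"""


def child_tags(xml_text):
    """Return the tags of the root's child elements, in document order."""
    return [child.tag for child in ET.fromstring(xml_text)]


print(child_tags(NAME_XML))  # ['first', 'middle', 'last']
```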
XML Empty Element
• <middle></middle> can be represented as
<middle/>.
• This is the one case where a start-tag doesn't
need a separate end-tag, because they are
combined into this one tag.
• In all other cases, you must have both tags.
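A quick check that both spellings describe the same element, sketched with Python's xml.etree.ElementTree (the helper summary is made up for illustration):

```python
import xml.etree.ElementTree as ET

long_form = ET.fromstring("<middle></middle>")
short_form = ET.fromstring("<middle/>")


def summary(element):
    """Return (tag, text content, number of child elements)."""
    return element.tag, element.text, len(element)


# Both forms give an element named "middle" with no text and no children.
print(summary(long_form))
print(summary(long_form) == summary(short_form))
```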
Activity 4
• Rewrite the following XML fragment using
attributes only, thus making it an empty element.
<name>
  <first>John</first>
  <middle>Fitzgerald Johansen</middle>
  <last>Doe</last>
</name>
XML Errors
• An error is simply a violation of the rules in the
recommendation, where the results are
undefined; the XML processor is allowed to
recover from the error and continue processing.
• 5 Common XML Errors:
1. Forgotten Declaration Statement
2. Unnested Elements or Text
3. Open/Close Tags mismatch
4. No Root Element
5. Multiple White-Space Characters
XML Validation (Next Week)
• The minimal requirement for an XML document is
well-formedness.
• Even if a document is well-formed, however, it
may not be valid.
• A well-formed XML document is valid only if it also
meets further constraints, for example those defined
in an XML Schema (XSD).
• Validation will be covered in the XSD lecture.
References
• https://www.w3schools.com/xml
• https://www.tutorialspoint.com/xml/xml_tree_structure.htm
• http://www.w3webtutorial.com/xml/xml-attribute.php