@@ -284,6 +284,71 @@ sub-elements for a given element::
    >>> ET.dump(a)
    <a><b /><c><d /></c></a>

+Parsing XML with Namespaces
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If the XML input has `namespaces
+<https://en.wikipedia.org/wiki/XML_namespace>`__, tags and attributes
+with prefixes in the form ``prefix:sometag`` are expanded to
+``{uri}sometag``, where the *prefix* is replaced by the full *URI*. Also,
+if there is a `default namespace
+<http://www.w3.org/TR/2006/REC-xml-names-20060816/#defaulting>`__,
+that full URI is prepended to all of the non-prefixed tags.
+
+Here is an XML example that incorporates two namespaces, one with the
+prefix "fictional" and the other serving as the default namespace:
+
+.. code-block:: xml
+
+    <?xml version="1.0"?>
+    <actors xmlns:fictional="http://characters.example.com"
+            xmlns="http://people.example.com">
+        <actor>
+            <name>John Cleese</name>
+            <fictional:character>Lancelot</fictional:character>
+            <fictional:character>Archie Leach</fictional:character>
+        </actor>
+        <actor>
+            <name>Eric Idle</name>
+            <fictional:character>Sir Robin</fictional:character>
+            <fictional:character>Gunther</fictional:character>
+            <fictional:character>Commander Clement</fictional:character>
+        </actor>
+    </actors>
+
+One way to search and explore this XML example is to manually add the
+URI to every tag or attribute in the XPath of a *find()* or *findall()*::
+
+    root = ET.fromstring(xml_text)
+    for actor in root.findall('{http://people.example.com}actor'):
+        name = actor.find('{http://people.example.com}name')
+        print(name.text)
+        for char in actor.findall('{http://characters.example.com}character'):
+            print(' |-->', char.text)
+
+Another way to search the namespaced XML example is to create a
+dictionary with your own prefixes and use those in the search::
+
+    ns = {'real_person': 'http://people.example.com',
+          'role': 'http://characters.example.com'}
+
+    for actor in root.findall('real_person:actor', ns):
+        name = actor.find('real_person:name', ns)
+        print(name.text)
+        for char in actor.findall('role:character', ns):
+            print(' |-->', char.text)
+
+These two approaches both output::
+
+    John Cleese
+     |--> Lancelot
+     |--> Archie Leach
+    Eric Idle
+     |--> Sir Robin
+     |--> Gunther
+     |--> Commander Clement
+
+
 Additional resources
 ^^^^^^^^^^^^^^^^^^^^

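The expansion rule described in the new section can be checked with a short sketch. It parses a trimmed copy of the sample document from the hunk above and collects the tag of every element. The variable names ``xml_text`` and ``tags`` are chosen here for illustration and are not part of the patch.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document: one default namespace and one
# namespace bound to the "fictional" prefix.
xml_text = """<?xml version="1.0"?>
<actors xmlns:fictional="http://characters.example.com"
        xmlns="http://people.example.com">
    <actor>
        <name>John Cleese</name>
        <fictional:character>Lancelot</fictional:character>
    </actor>
</actors>"""

root = ET.fromstring(xml_text)

# Element.iter() walks the root and all of its descendants in document order.
tags = [elem.tag for elem in root.iter()]
for tag in tags:
    print(tag)
```

Non-prefixed tags come back with the default namespace URI in braces, and the prefixed tag carries the URI bound to ``fictional``, which is why a plain ``root.findall('actor')`` finds nothing in this document.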
@@ -366,6 +431,9 @@ Supported XPath syntax
 | ``[tag]``             | Selects all elements that have a child named         |
 |                       | ``tag``.  Only immediate children are supported.     |
 +-----------------------+------------------------------------------------------+
+| ``[tag='text']``      | Selects all elements that have a child named         |
+|                       | ``tag`` whose complete text content equals ``text``. |
++-----------------------+------------------------------------------------------+
 | ``[position]``        | Selects all elements that are located at the given   |
 |                       | position.  The position can be either an integer     |
 |                       | (1 is the first position), the expression ``last()`` |
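The text predicate added to the XPath table can be tried out with a minimal sketch. It assumes the quoted form ``[tag='text']`` and uses a small namespace-free document made up for illustration; the names ``matches`` and ``partial`` are likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# A tiny namespace-free document, invented for this example.
root = ET.fromstring(
    "<actors>"
    "<actor><name>John Cleese</name><character>Lancelot</character></actor>"
    "<actor><name>Eric Idle</name><character>Sir Robin</character></actor>"
    "</actors>"
)

# Select every <actor> that has a <name> child with exactly this text.
matches = root.findall("actor[name='Eric Idle']")
for actor in matches:
    print(actor.find("character").text)

# The comparison is on the full text of the child, so a fragment of the
# name selects nothing.
partial = root.findall("actor[name='Eric']")
print(len(partial))
```

The second query is there to show that the predicate tests for equality with the child's text rather than for a substring.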