HL7 - Encoding
8 Encoding (92)
TBD (92.00001): Short Introduction to Encoding
https://www.hl7.eu/refactored/encoding02xml.html 1/31
6/15/24, 3:33 PM HL7 - REFACTORED
HL7 Version 2: XML Encoding Rules, Release 2 (revision of ANSI/HL7 V2 XML-2003 (R2010))
Release 2
Copyright © 2012 Health Level Seven International ® ALL RIGHTS RESERVED. The reproduction of this material in any
form is strictly forbidden without the written permission of the publisher. HL7 International and Health Level Seven are
registered trademarks of Health Level Seven International. Reg. U.S. Pat & TM Off.
IMPORTANT NOTES:
HL7 licenses its standards and select IP free of charge. If you did not acquire a free license from HL7 for this document,
you are not authorized to access or make any use of it. To obtain a free license, please visit
http://www.HL7.org/implement/standards/index.cfm.
If you are the individual that obtained the license for this HL7 Standard, specification or other freely licensed work (in each
and every instance "Specified Material"), the following describes the permitted uses of the Material.
A. HL7 INDIVIDUAL, STUDENT AND HEALTH PROFESSIONAL MEMBERS, who register and agree to the terms of
HL7's license, are authorized, without additional charge, to read, and to use Specified Material to develop and sell
products and services that implement, but do not directly incorporate, the Specified Material in whole or in part without
paying license fees to HL7.
INDIVIDUAL, STUDENT AND HEALTH PROFESSIONAL MEMBERS wishing to incorporate additional items of Special
Material in whole or part, into products and services, or to enjoy additional authorizations granted to HL7
ORGANIZATIONAL MEMBERS as noted below, must become ORGANIZATIONAL MEMBERS of HL7.
B. HL7 ORGANIZATION MEMBERS, who register and agree to the terms of HL7's License, are authorized, without
additional charge, on a perpetual (except as provided for in the full license terms governing the Material), non-exclusive
and worldwide basis, the right to (a) download, copy (for internal purposes only) and share this Material with your
employees and consultants for study purposes, and (b) utilize the Material for the purpose of developing, making, having
made, using, marketing, importing, offering to sell or license, and selling or licensing, and to otherwise distribute,
Compliant Products, in all cases subject to the conditions set forth in this Agreement and any relevant patent and other
intellectual property rights of third parties (which may include members of HL7). No other license, sublicense, or other
rights of any kind are granted under this Agreement.
C. NON-MEMBERS, who register and agree to the terms of HL7's IP policy for Specified Material, are authorized,
without additional charge, to read and use the Specified Material for evaluating whether to implement, or in implementing,
the Specified Material, and to use Specified Material to develop and sell products and services that implement, but do not
directly incorporate, the Specified Material in whole or in part.
NON-MEMBERS wishing to incorporate additional items of Specified Material in whole or part, into products and services,
or to enjoy the additional authorizations granted to HL7 ORGANIZATIONAL MEMBERS, as noted above, must become
ORGANIZATIONAL MEMBERS of HL7.
Please see http://www.HL7.org/legal/ippolicy.cfm for the full license terms governing the Material.
ITS WG + CGIT WG
This document supersedes Release 1 and contains additional specifications to accommodate new features introduced
beginning with HL7 Version 2.3.1, e.g. the use of choices within message structures. As of the time of this writing the current
version is v2.7. This document is valid for all v2.x versions which have passed ballot. Chapter 2 of the HL7 Version 2.3.1
and 2.7 [rfHL7v231, rfHL7v27] specifies standard message structures (syntax) and content (semantics), the message
definitions. It also specifies an interchange format and management rules, the encoding rules for HL7 message instances
(see Figure 1). The objective of this document is to present alternate encoding rules for HL7 Version 2.3.1 to 2.7
messages (and a mechanism for determining alternate encoding rules for subsequent HL7 2.x versions) based on the
Extensible Markup Language XML [rfXML] that could be used in environments where senders and receivers both
understand XML.
It is not the intent of this document to replace the standard sequence-oriented encoding rules that use "vertical bars"
and other delimiters (so-called "vertical bar encoding"), but rather to provide an alternative way of encoding.
Furthermore, message definitions given in the Version 2.x standard are also untouched. However, if you are going to use
XML for version 2.x messages, this HL7 normative document describes how to do that. This document does not modify
the message definitions, only the way they are encoded.
In principle, many XML encodings could serve as alternate messaging syntaxes for HL7 Version 2.x messages. This
document describes the one suggested and standardized by HL7. It primarily addresses the translation between standard
encoded and XML encoded HL7 version 2.x, describing the underlying rules and principles. XML schema [rfXMLSchema]
definitions are provided for all version 2.x message types. Due to their greater expressiveness, schemas are the
preferred way to describe a set of constraints on message instances. The outdated Document Type Definitions (DTDs) are
no longer addressed. The algorithms used for this specification to derive the database excerpts and to create schemas
are also presented in the informative appendix.
This document is the normative successor of the first release (2003) and the informative document "HL7
Recommendation: Using XML as a Supplementary Messaging Syntax for HL7 Version 2.3.1 - HL7 XML Special Interest
Group, Informative Document" as of February, 2000 (rfINFO). The former document is replaced by this specification
once this document is successfully balloted.
This document assumes a basic understanding of HL7 version 2. However, some background information has been
included to aid those without version 2 experience.
This document is the second release of this specification to capture enhancements to the standard. As such, I wish to
thank Kai Heitmann, who wrote the first release.
This standard is the result of about two years of intense work through e-mail, telephone conferences and meeting
discussions. I wish to thank Bob Dolin and Paul Biron, who wrote the Informative Document.
This work was made possible by Frank Oemig, Lloyd McKenzie, Vassil Peytchev, Ralf Schweiger, Joachim Dudeck, and
Wes Rishel. Valuable discussions came from James Case, Ivan Emelin, Susan Abernathy, Peter Rontey, Nick Radov,
John Firl, Jennifer Puyenbroek, Chuck Meyer, Tim Barry, Jacub Valenta, Eliot Muir, Grahame Grieve, Koo Weng On,
Andrew Hinchley, Dennis Janssen. Special thanks for his support to Tom de Jong.
Thanks also to all members of the ITS Work Group and the InM Work Group for their input during the development
process.
General Knowledge
This specification assumes general knowledge of XML technology on the part of readers. Readers unfamiliar with XML
may gain the requisite knowledge from the following standards:
Accompanying Material
Disclaimer
The reader is reminded that both examples and XML schema fragments presented within the document for illustrative
purposes are informative and do not form a part of the normative content.
9 Introduction (1)
9.1 Background (1.1)
In 1993, the European Committee for Standardization (CEN) studied several syntaxes (including ASN.1, ASTM, EDIFACT,
EUCLIDES, and ODA) for interchange formats in healthcare (rfCEN). A subsequent report extended the CEN study to look
at SGML (rfDolin1997). By using the same methodology, example scenarios, healthcare data model, and evaluation
metrics, the report presented a direct comparison of SGML with the other syntaxes studied by CEN, and found SGML to
compare favorably.
In February 1998, XML became a recommendation of the World Wide Web Consortium (W3C). XML was further tested as
a messaging syntax for HL7 Version 2.x and Version 3 messages (rfDolin1998). In 1999, Wes Rishel coordinated a
10-vendor HL7-XML interoperability demonstration at the annual HIMSS Conference. All vendors rated the demo a success.
In 1999, the XML SIG developed an informative document in cooperation with Control/Query TC, "HL7 Recommendation:
Using XML as a Supplementary Messaging Syntax for HL7 Version 2.3.1 - HL7 XML Special Interest Group, Informative
Document", that was approved as an HL7 Informative Document on membership level in February, 2000.
In August, 2000, at the HL7 Board Retreat meeting in Dresden (Germany), it was decided that XML will become the 2nd
normative encoding for versions 2.3.1 and 2.4 and future 2.x versions, i. e., the XML syntax that will be submitted for ANSI
approval and that has the same status as the traditional syntax. Another reason for a normative XML syntax is to support
future Claims Attachment messages, which are currently using v2.4 encoding.
With v2.6 and v2.7, new concepts were introduced into v2.x that require an enhancement of this
specification.
This document stays with the original strategy for the representation of XML instances for backward compatibility.
The ability to explicitly represent an HL7 requirement in XML confers the ability to parse and validate messages with any
XML parser. Many "off-the-shelf" XML tools are available (freeware and commercial) such as parsers, transformation
applications and instance viewers, which can perform much of the validation of message/document instances, so that
applications don't have to. For the encoding part, personnel trained in XML are much easier to find than experts
familiar with vertical bar encoding rules. Of course, explicit knowledge about the underlying semantic assumptions is still
essential.
Frequently, a typical healthcare messaging application includes an in-house developed parser (message reader) and
generator (message writer) to process traditional ("vertical bar" encoded) HL7 messages, with an almost certain
negative impact on development and maintenance costs. The only alternative to in-house tool development, which quite
often is not implemented correctly and completely, is to choose from among the limited and often expensive commercial
tool sets. Increasingly, the traditional encoding contributes to the isolation of healthcare from the generic data
interchange approaches used by other business areas. Adoption of generic, across-the-board message encoding will
become critical for cost and error reduction as healthcare and other areas of business increase their daily interactions.
Using XML message parsers and generators will undoubtedly help to prepare healthcare for this growing challenge to
increase data interchange commonality with other business areas.
Finally, an XML syntax for v2.x messages will also help vendors and providers transition from the HL7 Version 2 family of
standards to Version 3 by encouraging the early retooling of applications to support XML interfaces.
Underlying the HL7 2.x messaging Standards is a Microsoft Access database (the "HL7 Database") that contains a copy
of the official definitions of events, messages, segments, fields, data types, data type components, tables, and table
values. The database is designed to have the same content as the paper-based standard documents and to accurately
reflect both those documents and what the membership voted on, including technical corrections.
This database arose as the German HL7 user group undertook careful analysis of the standard. They became aware that
the chapters of the standard had been developed by different groups, and that there had been no distinct rules or
guidelines for the development of various parts of the standard. They therefore defined a comprehensive database of the
HL7 Standard (including Version 2.1 through Version 2.7 for now) to allow consistency checks of items and to support the
application of the standard by the user. All data were drawn from the normative standard documents, largely
algorithmically and to a minor extent by hand.
Within the HL7 Database, all data added is checked for its consistency. Referential integrity among relations assures this
consistency. The side effect of referential integrity is to modify the data from the standard documents because the
standard is defined in the form of a document but not in the form of a relational database.
As a consequence, the database is not an identical equivalent to the standard, but the differences are documented and
reflected as technical corrections and new proposals.
While developing the analytic object model for the definition of the comprehensive HL7 Database, the German HL7 user
group became aware that two problems are not handled satisfactorily in the standard:
Further details of the HL7 Database as well as known problems encountered in the construction of the database have
been documented by Frank Oemig et al. ([rfOemig1996], see also [rfOemig]). Most of the problems have been solved with
newer releases of the v2.x standard in the meantime. However, the database has been constructed to maintain all
versions and perhaps derivations thereof in parallel.
Ambiguities or errors in the standard are reflected "as is" in the XML encoding. Fixing any such errors in the XML will
require making appropriate technical corrections to the HL7 Database. There have been many such fixes, both in the
database and in the XML encoding since the last ballot cycle (committee level ballot). The procedures for deriving the
schemas are described in the informative appendix.
It should be mentioned that the database itself or extracts of the database are not needed in order to implement or use the
XML encoding of version 2 messages as described in this specification. The database and its excerpts are used for the
schema creation process only. Implementers should be able to develop v2.xml interfaces having only the schemas and
the printed version of both this specification and the HL7 standard. Implementers may also choose to hand-generate or
adjust existing schemas to reflect localizations such as Z-segments.
If a supplier claims conformance for V2 messages in XML, the messages must be valid against schemas produced from
the HL7 specification by the rules in the v2.xml specification.
HL7 defines the content of the message as an abstract set of data elements contained in data segments. Segments are
ordered sequences of fields and can be declared as required or optional and repeatable or non-repeating. Each segment
begins with a three-character literal value that identifies it within a message (segment identifier). For example, an ADT
message may contain the following segments: Message Header (MSH), Event Type (EVN), Patient ID (PID), and Patient
Visit (PV1).
The semantic content of a message is transferred in the fields of the segment. Fields can be of variable length. Field
contents can be required or optional; individual fields may be repeated. Individual data fields are found in the message by
their position within their associated segments. Multi-component fields are used for further subdivision of a field and
facilitate the transmission of locally related semantic contents.
For each field or field component, a data type is defined. Simple data types include string of characters, number, code etc.
Complex data types are composed of two or more components. Examples are the CE data type (coded elements), whose
components are "coded value", "code designator" and "code system", or the XPN data type (extended person name),
which has several components that are each composed of several sub-components in order to express the various parts
of a person's name.
Groups with more than a single segment are handled in a special way in this specification (see section "2.4.1.
Optional/Repeating Groups of Segments"), because they are named. Such segment group names are uppercase (e. g.
"PROCEDURE", "INSURANCE") and do not contain spaces or other special characters.
The brackets and braces in the Abstract Message Syntax relate to XML occurrence indicators as shown in the following:
Abstract Message Syntax     XML occurrence indicator
[ ]                         0 .. 1
{ }                         1 .. unbounded
{[ ]} = [{ }]               0 .. unbounded
< | >                       1 .. 1 (complexType Choice)
no bracket or brace         1 .. 1
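The mapping above can be sketched in a few lines of Python. This is only an illustration, not part of the specification: the function name occurrence_attrs and the two boolean flags are assumptions introduced here, and the choice case is left out because it maps to a separate complexType rather than to occurrence bounds.

```python
# Map Abstract Message Syntax markers to XML Schema occurrence bounds.
# Keys are (optional, repeating); None stands for maxOccurs="unbounded".
OCCURRENCE = {
    (False, False): (1, 1),     # no bracket or brace
    (True, False): (0, 1),      # [ ]
    (False, True): (1, None),   # { }
    (True, True): (0, None),    # [{ }] or {[ ]}
}


def occurrence_attrs(optional, repeating):
    """Return minOccurs/maxOccurs attribute values (hypothetical helper)."""
    min_occurs, max_occurs = OCCURRENCE[(optional, repeating)]
    return {
        "minOccurs": str(min_occurs),
        "maxOccurs": "unbounded" if max_occurs is None else str(max_occurs),
    }


print(occurrence_attrs(optional=True, repeating=True))
```

A schema generator could pass the returned dictionary straight into the attributes of an element declaration.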
10 Specification (2)
10.1 Introduction to the XML Representation (2.1)
The XML encoding rules specified here represent HL7 message structures as XML elements. Message structures
contain segments, also represented as XML elements. Segments contain fields, again represented as XML elements. A
field's data type is stored as a fixed attribute in the field's attribute list, while a field's content model contains the data type
components. Other fixed attributes are used to expand abbreviations and indicate HL7 Table value restrictions. In addition,
the XML schema annotation mechanism is used to provide the same information as is represented in the fixed attributes of
field and data type definitions (please refer to sections "2.5. Fields" and "2.6. Data Types" for details).
Here is the same message in the syntax of the recommended XML encoding rules:
As is always the case with XML when processed with a validating processor, the extra white space between elements and
line breaks (provided to make the message easier for people to read) can be removed in actual message instances,
resulting in shorter messages in situations when overall message length is a factor.
The next section describes the stepwise creation of the XML representation.
The v2.xml schemas (see also section "3.1.2. List of Schemas") are based on the described message structure ID.
Looking at message definitions in 2.3.1 and later, the abstract message definition (see example a in Figure 3) and the
MSH-9 field (see example b in Figure 3) contain the message type, trigger event, and the message structure ID for the
message, e. g., ADT^A04^ADT_A01. This indicates that the ADT message with trigger event A04 has the message
structure ID ADT_A01 (i.e., it has the same sequence and cardinality of segments). All messages with that structure ID are
structurally the same, though they differ in the semantics of the event (A04 in the example case). In detail, message
structure code ADT_A01 describes the single abstract message structure used by the trigger events A01, A04, A05, A08,
A13, A14, A28 and A31.
As a consequence, encoding an A04 message, which has the ADT_A01 message structure, requires using the schema
definition for the ADT_A01 message. The standard documents contain tables where the message structure IDs are listed
(see section "3.1.1. List of Messages With Equal Message Structures").
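As a minimal sketch of this lookup, the following Python fragment derives the message structure ID (and therefore the schema to use) from an MSH-9 value. The function name message_structure is hypothetical, and the event table covers only the ADT_A01 events listed above; a real implementation would load the full table from the standard.

```python
# Trigger events that share the ADT_A01 message structure, as listed in the text.
ADT_A01_EVENTS = {"A01", "A04", "A05", "A08", "A13", "A14", "A28", "A31"}


def message_structure(msh9):
    """Return the structure ID for an MSH-9 value such as 'ADT^A04^ADT_A01'."""
    parts = msh9.split("^")
    if len(parts) >= 3 and parts[2]:
        return parts[2]  # structure ID is given explicitly
    if len(parts) < 2:
        raise LookupError(f"MSH-9 value {msh9!r} has no trigger event")
    msg_type, event = parts[0], parts[1]
    if msg_type == "ADT" and event in ADT_A01_EVENTS:
        return "ADT_A01"
    raise LookupError(f"no structure known for {msh9!r}")


print(message_structure("ADT^A04^ADT_A01"))
print(message_structure("ADT^A08"))
```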
The message structure ID is used as a root element for the XML instance documents. As an example the corresponding
XML message fragment is shown below. The root element carries the segment elements (see following section) as child
elements.
Considering the ADT_A04 example above, the corresponding XML message fragment is shown below. A segment element,
for example, carries the corresponding field elements (see following section) as child elements.
Groups containing more than a single segment are thus handled in a special way in this specification. For example in
Figure 4, a group is denoted by [{ PR1 [{ ROL }] }]. This group is named "PROCEDURE" (see 2nd column in Figure 4
containing "--- group_name begin/end"). Another example is the [{ IN1 [ IN2 ] [{ IN3 }] [{ ROL }] }] group which is named
"INSURANCE". These names also appear in the v2.xml schema definitions of the corresponding messages and thus
have to appear also in an XML message instance containing messages of that type, i.e. groups of segments are
surrounded with their own tags.
There was no explicit way to express these groups in the traditional v2 "vertical bar" encoding of messages.
Introduction of the explicit segment group names marks a major difference between vertical bar and XML encoding.
Furthermore, this allows elements to be accessed in a reasonable manner within an XPath expression (see [rfXPATH]).
An application can thus refer to specific XML items explicitly by name (e.g. ADT_A01/PROCEDURE/PR1.3/CNE.1) or
by position (e.g. ADT_A01/PROCEDURE/PR1.3/*[position()=1]). With the latter approach, the application
no longer needs to know the name of the field, data element or data type. See also the section on data types.
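The two access styles can be tried out with Python's standard xml.etree.ElementTree module, as in the sketch below. The sample instance and its values are invented for illustration and carry no namespace. ElementTree implements only a subset of XPath, so the positional lookup is done with ordinary child indexing, which plays the role of *[position()=1].

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified instance; element values are made up.
SAMPLE = """
<ADT_A01>
  <PROCEDURE>
    <PR1>
      <PR1.3>
        <CNE.1>0001</CNE.1>
        <CNE.2>Example procedure</CNE.2>
      </PR1.3>
    </PR1>
  </PROCEDURE>
</ADT_A01>
"""

root = ET.fromstring(SAMPLE)  # root is the ADT_A01 element

# Access by name: the path spells out every element name.
by_name = root.find("PROCEDURE/PR1/PR1.3/CNE.1")

# Access by position: take the first child of the field, whatever it is called.
field = root.find("PROCEDURE/PR1/PR1.3")
by_position = field[0]

print(by_name.text, by_position.tag)
```

Positional access keeps working if the data type of the field, and with it the component element names, differs between versions.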
Segment group names are uppercase. In almost all cases the names convey the semantics carried by the group of
segments itself; for example, INx segments are bundled by the "INSURANCE" group, PV1 and PV2 segments are bundled
as the "VISIT" group etc.
Please note: The narrative segment group names that this specification uses appear neither in the paper version of
v2.3.1 nor in that of v2.4. They are drawn from the v2.5 specification.
About 400 different groups of that kind could be identified in the standard. Some of the groups have identical content
concerning segment sequence; some of the contained segments, however, have different cardinalities. As an example, the
group "INSURANCE" can be found in ADR_A19, ADR_A01, ADR_A05, ADR_A06 etc., but the single segments IN1,
IN2 etc. have different cardinalities within these groups. Consequently, the v2.xml XML schema segment group naming
convention has adopted the use of the owning message structure ID as a prefix for the group name to ensure uniqueness
in regard to content.
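A minimal sketch of this naming convention follows. The helper name group_element_name is hypothetical, and the dot used to join prefix and group name is an assumption made here for illustration; the schemas themselves are the authority on the exact spelling.

```python
import re


def group_element_name(structure_id, group_name):
    """Prefix a segment group name with its owning message structure ID.

    Group names are expected to be uppercase without spaces; underscores
    are accepted because names such as CLINICAL_HISTORY_OBJECT use them.
    """
    if not re.fullmatch(r"[A-Z][A-Z0-9_]*", group_name):
        raise ValueError(f"not a valid segment group name: {group_name!r}")
    # Assumption: a dot separates the structure ID from the group name.
    return f"{structure_id}.{group_name}"


print(group_element_name("ADT_A01", "INSURANCE"))
print(group_element_name("ADR_A19", "INSURANCE"))
```

The two results differ, so groups with the same narrative name but different cardinalities can each have their own schema definition.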
Considering the ADT_A04 example above, the corresponding XML message fragment with groups is shown below.
The corresponding schema definition fragment for the ADT_A01 message is shown below.
As an example, the corresponding schema definition fragment for the EVN segment is shown below. Please note that,
consistent with the processing rule for v2 whereby receivers are to ignore fields not expected, the schemas will also allow
additional elements at the end of a segment.
Please note, that this vertical bar is independent from the vertical bar in the conventional encoding reflecting the standard
field delimiter.
As an example, the corresponding schema definition fragment for the CLINICAL_HISTORY_OBJECT choice group is
shown below.
Multi-component fields are used for further subdivision of a field and facilitate the transmission of locally related semantic
contents.
In the v2.xml specification, individual fields are represented by the three-character literal segment ID of the corresponding
segment plus their individual position within the segment (sequence). The first field (Event Type Code) in segment EVN, for
example, is named EVN.1, the second EVN.2 etc. An example of an EVN segment, traditionally encoded and v2.xml
encoded, is shown below. Please note that the EVN encoding contains time stamp representations (TS) in EVN.2, EVN.3
and EVN.6, which are not primitive but composite data types and which are expressed in a way described in detail in
section "2.6.2. Composite Data Types".
In the traditional sequence-oriented approach, empty fields (containing no data) are denoted as two vertical bars "||" in
sequence to express the empty contents. This is essential in sequence-oriented approaches. In v2.xml an element with no
contents can simply be omitted (unless explicit use of the "" is required to force a data delete action by the receiving
application, see section "2.7.6. Delete Indicators, Empty Values"). In the example above there is no information for
EVN.5, thus the element is omitted in the corresponding XML instance.
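The field naming and omission rules can be illustrated with a small Python sketch that converts one vertical bar encoded segment to XML. It is deliberately simplified and is not the specification's algorithm: the function segment_to_xml and the composite_types argument are assumptions, the sample values are invented, and repetitions, sub-components, escape sequences and the special handling of MSH are not covered.

```python
import xml.etree.ElementTree as ET


def segment_to_xml(segment, composite_types, field_sep="|", comp_sep="^"):
    """Convert one vertical bar encoded segment to an XML element.

    composite_types maps a field position to its composite data type name;
    positions not listed are treated as primitive (caller-supplied assumption).
    """
    parts = segment.split(field_sep)
    seg_id = parts[0]
    seg = ET.Element(seg_id)
    for pos, value in enumerate(parts[1:], start=1):
        if value == "":
            continue  # empty field: the element is simply omitted
        field = ET.SubElement(seg, f"{seg_id}.{pos}")
        dtype = composite_types.get(pos)
        if dtype is None:
            field.text = value  # primitive: content without further nesting
            continue
        for cpos, comp in enumerate(value.split(comp_sep), start=1):
            if comp:  # empty components are omitted as well
                ET.SubElement(field, f"{dtype}.{cpos}").text = comp
    return seg


# Hypothetical EVN segment with no value for EVN.5.
evn = segment_to_xml(
    "EVN|A04|202401011200|202401011230|01||202401011155",
    composite_types={2: "TS", 3: "TS", 6: "TS"},
)
print(ET.tostring(evn, encoding="unicode"))
```

The printed element contains EVN.1 through EVN.4 and EVN.6, with each time stamp nested in a TS.1 component, and no EVN.5 element.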
The content model of each field is a reference to the field's data type. In the XML schemas, the item number, table
reference, long name, and data type are provided by the annotation mechanism (appinfo); in addition, a documentation
tag is given containing the long name of the field (the language is defined by the standard xml:lang attribute) as specified
in the standard. The same information is also provided as fixed attributes.
The example below shows the XML schema definition of the EVN.1 field element along with its annotations.
If a receiver receives an XML instance that is validated against the schema, the receiving parser can make use of the
information that is provided in the annotations appinfo (application information) and documentation (user information)
element content of the underlying schema.
The constraints on minLength and maxLength allow for a schema-based validation. Nevertheless, providing this
information as attributes in message instances may be useful as well. Therefore, both options are valid. However, for
backward compatibility reasons the new constraints should not be used with HL7 versions prior to 2.7, although this is a
matter of negotiation between trading partners.
A field for which a primitive data type is defined simply contains the information without additional nesting or hierarchy. As
an example the 4th field of the EVN segment (see Figure 5) is of type IS (a value drawn from an HL7 defined table), which
is a primitive data type. The corresponding XML instance fragment looks like the following example:
The v2.xml schema definitions define all primitive data types as "string" (XML schema).
Analogous to field components, data type components are modeled by specifying the data structure name plus their
individual position within the data type component (sequence). As an example, the first component of data type CNE is
defined as CNE.1, the second as CNE.2 and so on. This allows individual access to any of the components of a
composite data type. The following example shows a CNE data type encoded traditionally ("vertical bar") and as v2.xml
fragment.
Also, empty components may be omitted in the v2.xml encoding, whereas empty components in the traditional encoding
must be marked by two component delimiters "^^" in sequence in order to preserve sequence.
Where a field has a data type with multiple components but only a single component is populated with information, the
corresponding data type element of the component must not be omitted.
Considering the following example where a field of type CWE carries information in the first component only (i. e. the
identifier of a coded element), the correct v2.xml encoding is shown as in the following example with an OBX.3 field:
Incorrect:
Correct:
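The nesting rule can be checked mechanically, as the following Python sketch suggests. The two OBX.3 instances and the value 1234 are hypothetical and are not the specification's own examples; composite_is_nested is a name introduced here.

```python
import xml.etree.ElementTree as ET


def composite_is_nested(field):
    """True if a composite field holds component elements and no direct text."""
    has_direct_text = bool((field.text or "").strip())
    return len(field) > 0 and not has_direct_text


# Hypothetical OBX.3 field of type CWE carrying only an identifier.
incorrect = ET.fromstring("<OBX.3>1234</OBX.3>")
correct = ET.fromstring("<OBX.3><CWE.1>1234</CWE.1></OBX.3>")

print(composite_is_nested(incorrect), composite_is_nested(correct))
```

The check only makes sense for fields whose data type is known to be composite, since a primitive field legitimately carries its text directly.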
Data type components of composite data types are modeled similarly to fields. The content model of each component
contains a reference to the component's data type. The annotation mechanism is used to express the component's data type,
long name, and table, as shown here for CNE.1. In addition, the same information is provided as fixed attributes.
In v2.5, new data types were created for (and applied to) all existing fields/components using the CM data type. An
addendum, for XML encoding, was applied to HL7 v2.3.1 and v2.4, where these renamed data types are listed. These
corrected names must be used when encoding CM data types with XML.
The sender of a v2.xml XML message is required to create both well-formed and valid message instances. The instances
created should be valid against the corresponding XML schema definitions (see section "3.1.2. List of Schemas").
However, this does not necessarily imply validation of the transaction at run time. The decision to do so and incur
associated overhead should be made on a site-by-site basis or on interface development status.
The receiver who accepts a v2.xml XML message is required to check well-formedness of the XML instance. The receiver
may (but is not required to) validate the message against the schema.
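A well-formedness check needs nothing more than an XML parser. The sketch below uses Python's standard library; the function name is_well_formed and the sample instances are illustrative only, and schema validation, which is optional for the receiver, would require an additional schema-aware library.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as well-formed XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed("<ADT_A01><MSH/></ADT_A01>"))
print(is_well_formed("<ADT_A01><MSH></ADT_A01>"))  # MSH is never closed
```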
However, it should be easy to achieve XML transformations from an XML instance for one version to another using
corresponding transformation rules or style sheets (which are not provided here).
Chapter 2 of the standard defines a mechanism to wrap multiple valid HL7 messages in control segments
in order to form a batch of messages. For that purpose specific file and batch header and trailer control segments FHS,
FTS, BHS, BTS are defined.
In the XML encoding, it is also possible to wrap multiple messages with the corresponding control segments. The
definitions can be found in the messages schema (batch.xsd). For queries, the QPD segment must be defined
differently from one query to another. The only way to support batches of queries (e. g. for non time critical
processing) or responses is to wrap the contents of the batch tags as CDATA. This approach has been used for the
general definition of the batch message "payload", regardless of whether it contains query segments or not.
For further information on batch messages refer to Chapter 2 and 5 of the standard.
At any given site, the subset of the possible delimiters may be limited by negotiations between applications. This implies
that the receiving applications will use the agreed upon delimiters, as they appear in the Message Header Segment
(MSH), to parse the message.
In the v2.xml encoding the message delimiter characters are contained in the MSH.1 and MSH.2 elements of the MSH
segment as well. Although the message delimiter characters are meaningless in the v2.xml encoding, they are
represented as shown in the example fragment of the MSH segment. They can be useful when translating from
vertical bar to XML representation and vice versa, and they must still be sent because MSH.1 and MSH.2 are required fields
in the v2.x standard. Please note that the special character "&" must be escaped in order to be included in an XML
message instance (see also section "2.7.10. Special Characters in Schemas").
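As an illustration of the escaping requirement, the following Python sketch builds the first two MSH fields with the commonly used delimiters and lets the serializer escape the ampersand. It is a sketch only: the delimiter set shown is the usual default, and other fields of MSH are left out.

```python
import xml.etree.ElementTree as ET

msh = ET.Element("MSH")
ET.SubElement(msh, "MSH.1").text = "|"        # field separator
ET.SubElement(msh, "MSH.2").text = "^~\\&"    # encoding characters ^ ~ \ &

# The serializer writes "&" as "&amp;" so that the instance stays well-formed.
xml_text = ET.tostring(msh, encoding="unicode")
print(xml_text)

# Parsing the instance again restores the original delimiter characters.
round_trip = ET.fromstring(xml_text).find("MSH.2").text
print(round_trip)
```

Writing the ampersand into the instance by hand without escaping would make the message ill-formed, so it is safer to leave serialization to an XML library.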
If the state of a blank or null data field cannot be determined, the sending system will send the empty value or omit the
element altogether. An encoded field with an empty value or a missing element instructs the receiving system to bypass
processing and does not affect an already existing value in the corresponding receiving database.
An empty element is treated as not existing, to keep backward compatibility with ER7.
The following example carries a delete indication in the data type component. Explicit empty (missing) values are
expressed by empty (missing) element content. In the example, is omitted (empty).
example 1
example 2
example 3
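How a receiver might tell the three cases apart (missing or empty element, delete indicator, ordinary value) is sketched below in Python. The function field_action, its return values and the sample PID content are assumptions made for illustration; the specification does not prescribe an implementation.

```python
import xml.etree.ElementTree as ET

DELETE_INDICATOR = '""'  # two double quotes, as in the traditional encoding


def field_action(segment, path):
    """Classify what a receiver should do with the element at the given path."""
    elem = segment.find(path)
    if elem is None:
        return "keep"  # missing element: leave the stored value untouched
    text = (elem.text or "").strip()
    if len(elem) == 0 and text == "":
        return "keep"  # empty element is treated like a missing one
    if text == DELETE_INDICATOR:
        return "delete"  # explicit delete indication
    return "update"


# Hypothetical segment: PID.7 is empty, PID.8 carries a delete indicator.
pid = ET.fromstring('<PID><PID.7/><PID.8>""</PID.8><PID.19>123</PID.19></PID>')

for name in ("PID.3", "PID.7", "PID.8", "PID.19"):
    print(name, field_action(pid, name))
```

The same function works for a data type component if the path names it, for example a field element followed by one of its component elements.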
For the XML encoding we must differentiate between data type associated escape characters (text formatting), structural
escape sequences and character encoding / character set switching characters. They have to be handled differently when
using v2.xml.
There is also the possibility of specifying troff commands in text fields. They are escaped accordingly. The following table
just shows examples and is not complete. Please refer to chapter 2 of the v2.x standard.
An example:
is expressed in v2.xml as
The vertical bar character encoding mechanism using \Xxxx\ as a character reference, and \Zxxx\ to refer to a locally
defined character reference is deprecated in the v2.xml encoding. Instead, the standard XML character reference
mechanism for UTF-8 must be used. Even non-printable characters like form feed can be represented that way.
For locally defined character references outside that scope, the private area of Unicode should be addressed.
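As a rough sketch of the standard XML character reference mechanism, the Python snippet below rewrites every non-ASCII
character of a text value as a numeric character reference. The function name is hypothetical, and producing ASCII-only
output is an assumption made for illustration; a UTF-8 encoded instance can also carry such characters directly.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def to_char_refs(text: str) -> str:
    """Escape markup characters, then replace non-ASCII characters by &#N; references."""
    escaped = escape(text)  # handles "&", "<" and ">"
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_char_refs("café"))    # caf&#233;
print(to_char_refs("\ue000"))  # &#57344; (first code point of the Unicode private use area)

# An XML parser resolves the reference back to the original character.
element = ET.fromstring("<TX>" + to_char_refs("café") + "</TX>")
print(element.text)  # café
```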
A receiver who accepts a v2.xml XML message is required to check well-formedness of the XML instance and may (but is
not required to) validate the message against the schemas. As described in chapter 2 of the standard, the receiver
but in terms of validating against the v2.xml schema definition, the cardinality of the components is determined by the
v2.xml schema.
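The mandatory well-formedness check mentioned above can be sketched with the Python standard library as shown below.
The function name is hypothetical. Schema validation is optional for the receiver and would require a validating
processor, which is not shown here.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as a well-formed XML document."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed("<ADT_A01><MSH/></ADT_A01>"))      # True
print(is_well_formed("<ADT_A01><MSH></ADT_A01>"))       # False: MSH is never closed
print(is_well_formed("<MSH.2>^~\\&</MSH.2>"))           # False: bare "&" is not escaped
```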
Please note that, in line with the v2 processing rules for additional content after the end of a segment, the
schemas also allow any elements to follow the end of a segment.
Where an ampersand occurs in the long name of a field, it is converted to the XML entity representation "&amp;". An
example is "Critical Range for Ordinal & Continuous Obs", which becomes "Critical Range for Ordinal &amp; Continuous Obs".
Because the Schema wraps the value of attribute LongName in single quotes, when a single quote occurs in the long
name of a field, it is converted to the XML entity representation "&apos;", e.g. "Contact's Tel. Number" becomes
"Contact&apos;s Tel. Number".
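The two conversions described above can be sketched in Python with the standard library. The function name is
hypothetical; this only illustrates the escaping rule and is not the generator actually used to produce the schemas.

```python
from xml.sax.saxutils import escape


def escape_long_name(long_name: str) -> str:
    """Escape a field long name for use in a single-quoted LongName attribute."""
    # escape() converts "&" first, so the "&apos;" added afterwards is not escaped again.
    return escape(long_name, {"'": "&apos;"})


print(escape_long_name("Critical Range for Ordinal & Continuous Obs"))
# Critical Range for Ordinal &amp; Continuous Obs
print(escape_long_name("Contact's Tel. Number"))
# Contact&apos;s Tel. Number
```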
Please note that the spelling and capitalization of all tags in the XML encoding must be the same as defined in the HL7
database (see section "1.3. XML representation derivation from HL7 Database"). Please refer to the schemas, which
reflect these rules.
Because of several important differences between the standard encoding and this XML encoding, translations between
the two encodings are not entirely straightforward, although they are not hard. The issues described in section
"2.7. Processing Rules for v2.xml Messages" need to be taken into account when performing the translations.
11 Appendix (3)
11.1 Normative Appendix (3.1)
11.1.1 List of Messages With Equal Message Structures (3.1.1)
As previously mentioned, the v2.xml schemas are based on the message structure ID - a concept introduced in version
2.3.1. The standard documents contain tables with the message structure IDs.
Version
Chapter
HL7 Table
2.24.1.9
v2.4
2.17.3
v2.5
v2.5.1
2.17.3
v2.6
2.16.3
v2.7
2.C.2.175
v2.7.1
????
v2.8
????
as shown by the following table. There is a set for each HL7 version supported by the v2.xml specification. In addition,
HTML files are provided, one for each message structure, containing a short description of the message and links to the
corresponding schemas (in directory xsd).
Please note that the use of XML Schema ([rfXMLSchema], a W3C recommendation since May 2001) is recommended by HL7
for all normative specifications. The schemas are not part of the normative specification, but are rather added as an
informative appendix in order to support vendors with the migration from DTDs to XML schemas.
It should be mentioned that DTDs can coexist in the same interface with schemas and not cause any issues. For example,
the sending interface can implement XML messages using schemas and the receiving system using DTDs. However,
schemas have a much greater expressiveness and should be preferred.
A set of many files in HTML format containing a short description of the message and links to the corresponding schemas.
A set of many schemas each containing the schema definitions for a specific message structure specified by
MessageStructureID, for example ADT_A01.xsd contains the definitions for the ADT A01 message structure,
ADT_A02.xsd for ADT A02 and so forth.
A schema containing the definition of batch messages (refer to section "2.7.4. Batch Messages").
An XML instance of a specific message should refer to the corresponding schema. The following examples show a
schema reference within a v2.xml XML message instance fragment. In both cases is the root element of the instance.
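As a hedged sketch of such a schema reference, the Python snippet below creates a root element that points to its
schema through the xsi:schemaLocation attribute. The namespace "urn:hl7-org:v2xml", the relative schema file location
and the function name are assumptions made for illustration; ADT_A01 is used only because ADT_A01.xsd is named above.

```python
import xml.etree.ElementTree as ET

V2XML_NS = "urn:hl7-org:v2xml"  # assumed target namespace of the v2.xml schemas
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def new_message_root(structure_id: str) -> ET.Element:
    """Create a root element that references <structure_id>.xsd as its schema."""
    ET.register_namespace("", V2XML_NS)    # serialize the v2.xml namespace as default
    ET.register_namespace("xsi", XSI_NS)
    root = ET.Element(f"{{{V2XML_NS}}}{structure_id}")
    # schemaLocation holds pairs of namespace URI and schema document location.
    root.set(f"{{{XSI_NS}}}schemaLocation", f"{V2XML_NS} {structure_id}.xsd")
    return root


root = new_message_root("ADT_A01")
instance = ET.tostring(root, encoding="unicode")
print(instance)
```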
Suppose the FOO message is expanded by adding a local Z segment (with its own field definitions, not detailed here), so
that the new content model should be ABC, DEF, ZZZ, GHI, JKL. To achieve this, the entity of the content model
describing FOO can be changed in the internal subset like this:
A redefinition containing the new localized content of the FOO content model can then be made on a copy of the schema
definition.
Now, the FOO message is redefined by the local modification. The copy of the schema will be used instead of the original
version. A new root element can be defined called FOO.LOCAL that serves as a localized version of the original FOO root
element.
Please note that it is good practice to keep local additions apart by using a different namespace. It is therefore
recommended to associate Z segments and other local content with another namespace.
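The namespace recommendation can be illustrated with a small Python sketch that builds the localized FOO.LOCAL content
model ABC, DEF, ZZZ, GHI, JKL and places only the ZZZ segment in a separate namespace. Both namespace URIs and the
helper names are assumptions: "urn:hl7-org:v2xml" is assumed for the standard elements and "urn:example:local" is a
purely hypothetical local namespace.

```python
import xml.etree.ElementTree as ET

V2XML_NS = "urn:hl7-org:v2xml"  # assumed namespace of the standard elements
LOCAL_NS = "urn:example:local"  # hypothetical namespace for local (Z) content


def build_localized_foo() -> ET.Element:
    """Build FOO.LOCAL with the content model ABC, DEF, ZZZ, GHI, JKL."""
    root = ET.Element(f"{{{V2XML_NS}}}FOO.LOCAL")
    for name in ("ABC", "DEF"):
        ET.SubElement(root, f"{{{V2XML_NS}}}{name}")
    # The local Z segment lives in its own namespace.
    ET.SubElement(root, f"{{{LOCAL_NS}}}ZZZ")
    for name in ("GHI", "JKL"):
        ET.SubElement(root, f"{{{V2XML_NS}}}{name}")
    return root


def split_tag(tag: str) -> tuple:
    """Split an ElementTree '{namespace}local' tag into (namespace, local name)."""
    namespace, _, local = tag[1:].partition("}")
    return namespace, local


foo = build_localized_foo()
for child in foo:
    print(split_tag(child.tag))
```

A receiver that does not know the local namespace can then recognize and skip the ZZZ element by its namespace alone.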
as XML elements. A field's data type is represented as a fixed attribute, while data type components are represented as
XML elements. Full SGML provides even greater minimization capacity with the use of SHORTTAG, OMITTAG, and
SHORTREF techniques, resulting in very small messages that are not valid XML, and are therefore not employed here.
The greater the percentage of data characters (as opposed to markup characters) in an average message, the less
important any additional overhead imposed by changing from the standard HL7 encoding rules to XML becomes. Data
from the Duke University Medical Center (DUMC) HL7 production environment suggests that on average, for standard
"vertical bar encoding", data characters comprise about 70% of overall message length. (Data from DUMC courtesy of
Al Stone, posted to the HL7 SGML/XML SIG List Server on 1998-01-15 and 1998-01-16.) The XML encoding
recommended here will result in messages that are approximately five to ten times longer, although this estimate has not
yet been subjected to rigorous testing, nor has it been officially published.
Message length is an issue for bandwidth requirements, but also for long-term archiving of the original messages (as
done, e.g., by some healthcare providers). Compression is considered a solution for both the bandwidth and the
archiving issues. With appropriate compression algorithms, XML instances compress very well, for example because
start and end tags consist of almost the same sequence of characters. However, describing compression methods is out
of scope of this specification.
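The effect of repeated tags on compression can be tried out with a few lines of Python. This is only a sketch: zlib is
chosen as one readily available algorithm, the function name is hypothetical, and the sample consists of an
artificially repeated, made-up OBX fragment, so the resulting ratio is far better than for real messages.

```python
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data)) / len(data)


# Made-up, highly repetitive fragment used purely for demonstration.
fragment = "<OBX><OBX.1>1</OBX.1><OBX.2>NM</OBX.2><OBX.5>98.6</OBX.5></OBX>"
sample = (fragment * 50).encode("utf-8")

ratio = compression_ratio(sample)
print(f"original: {len(sample)} bytes, ratio: {ratio:.2f}")
```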
An example: the large messages shown in section "3.2.4. Algorithms" comprise 1,426 bytes in the vertical bar encoding
and 6,442 bytes in the v2.xml encoding (4.5 times larger). After compression, the v2.xml message is 1,714 bytes long,
which is about 20% larger than the uncompressed vertical bar variant.
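The ratios quoted in this example follow directly from the three byte counts, as the short Python calculation below
shows. The variable names are chosen for illustration only.

```python
# Byte counts quoted in the example above.
VERTICAL_BAR_BYTES = 1426
V2XML_BYTES = 6442
V2XML_COMPRESSED_BYTES = 1714

# Size of the v2.xml message relative to the vertical bar message.
expansion = V2XML_BYTES / VERTICAL_BAR_BYTES

# Overhead of the compressed v2.xml message over the uncompressed vertical bar message.
compressed_overhead = V2XML_COMPRESSED_BYTES / VERTICAL_BAR_BYTES - 1

print(f"{expansion:.1f} times larger")       # 4.5 times larger
print(f"{compressed_overhead:.0%} larger")   # 20% larger
```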
Furthermore, it should be mentioned that we have learned from early v2.xml implementers that performance could be
gained (along with the use of less bandwidth) if large batch files are broken into many small batch files.
The v2.xml schemas are crafted to fulfill these requirements. Please refer to section "3.1.3. Localization of messages"
of the informative appendix for further information.
Some HL7 rules are easy to explicitly represent within an XML schema, such as the optionality and repetition of a field
within a segment.
You can carry HL7-valid messages in the constructs defined by this specification, but you can also carry a lot of HL7-
invalid messages. An XML processor can't validate that a message received is a valid HL7 message. The decision in the
XML representation presented here is to capture as many HL7 business rules as reasonably possible in terms of XML
schemas. This includes enabling a validating parser to verify the optionality, repetition, and ordering of segments within
messages and fields within segments; and the correct use of data types and their components within fields. Easing the
burden on the application with regard to structural validity (e.g., are all the pieces in the proper place?) is itself
a big win, even though the application will still have to perform semantic validation (e.g., is that code really a
valid SNOMED code, and are the other applicable business rules satisfied?).
Some actions that are supported in vertical bar encoding, such as the forward adoption of new data types, cannot be
handled by the XML encoding.
The automatic creation process was adopted in order to avoid handcrafting the schemas, which would have involved
a certain risk of introducing errors. Furthermore, necessary refinements of the definitions during the development
process could be made much more easily.
For the second release, backward compatibility with the previous version of this specification should be guaranteed.
Field Name      Data Type       Description
version_id      long integer    version number
hl7_version     Text-8          HL7 version
description     Text-80         Description
date_release    Date
The following HL7 Database table is used in the creation of the message schemas. (Only those fields being queried are
shown. The field names and their descriptions are taken verbatim from the HL7 Database.)
Field Name                  Data Type       Description
event_code                  Text-3
version_id                  long integer    version number
seq_no                      Integer
message_typ_snd             Text-3
message_typ_return          Text-3
message_structure_snd       Text-7
message_structure_return    Text-7
Chapter                     Text-10

Field Name                  Data Type       Description
message_structure           Text-7          Message Structure ID
version_id                  long integer    version number
seq_no                      Integer         consecutive increasing number used for each field within the segment
seg_code                    Text-3          Segment-Code
groupname                   Text-10
repetitional                Yes/No          Repetitional
Optional                    Yes/No          Optional
*The field names and their descriptions are taken verbatim from the HL7 Database.
This resulting table is exported to messages.txt. This file serves as input for the transformation algorithms (see also
section "3.2.3. Options").
Field Name        Data Type       Description
seg_code          Text-3
version_id        long integer    version number
Description       Text-50
Interpretation    Text-50
Visible           Yes/no

Field Name        Data Type       Description
seg_code          Text-3
version_id        long integer    version number
seq_no            Integer
data_item         Long Integer    Data Element ID
req_opt           Text-5          required/optional/backward compatibility
repetitional      Text-1          Repetitional
Repetitions       long integer    Number of repetitions
The following SQL query extracts data from tables HL7SegmentDataElements and HL7DataElements:
Field Name        Data Type       Description
data_item         Long Integer
version_id        Long Integer    Version number
description       Text-78
data_structure    Text-20
Min_length        Long Integer
Max_length        Long Integer
Conf_length       Text-10         conformance length
table_id          Long Integer    ID assigned table
The following SQL query extracts data from tables HL7SegmentDataElements and HL7DataElements to create
Fields.xsd:
Field Name        Data Type       Description
data_structure    Text-20
version_id        long integer    version number
Description       Text-80         Description

Field Name        Data Type       Description
data_structure    Text
version_id        long integer    version number
seq_no            long integer
comp_nr           long integer
table_id          long integer    Number of assigned table if different from component (overwrites table number of component)

Field Name        Data Type       Description
comp_nr           Long Integer
version_id        long integer    version number
description       Text-50         Description
table_id          Long Integer
data_type_code    Text-3          Data type
A VB program generates the XML schema definitions and additional HTML files for further information. The structure of the
generated schemas follows from the design considerations described above. The algorithms instantiated in this VB
program are not described in detail here and are not part of this specification, but will be available with the HL7 database
and on the HL7 website.
© HL7.org 2023+. HL7 v2+ (HL7-DB #100) generated on Thu, May 16, 2024.