INTERNATIONAL STANDARD ISO 25237
First edition
2017-01
Health informatics —
Pseudonymization
Informatique de santé — Pseudonymisation
Reference number
ISO 25237:2017(E)
© ISO 2017
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Abbreviated terms
5 Requirements for privacy protection of identities in healthcare
5.1 Objectives of privacy protection
5.2 General
5.3 De-identification as a process to reduce risk
5.3.1 General
5.3.2 Pseudonymization
5.3.3 Anonymization
5.3.4 Direct and indirect identifiers
5.4 Privacy protection of entities
5.4.1 Personal data versus de-identified data
5.4.2 Concept of pseudonymization
5.5 Real world pseudonymization
5.5.1 Rationale
5.5.2 Levels of assurance of privacy protection
5.6 Categories of data subject
5.6.1 General
5.6.2 Subject of care
5.6.3 Health professionals and organizations
5.6.4 Device data
5.7 Classification data
5.7.1 Payload data
5.7.2 Observational data
5.7.3 Pseudonymized data
5.7.4 Anonymized data
5.8 Research data
5.8.1 General
5.8.2 Generation of research data
5.8.3 Secondary use of personal health information
5.9 Identifying data
5.9.1 General
5.9.2 Healthcare identifiers
5.10 Data of victims of violence and publicly known persons
5.10.1 General
5.10.2 Genetic information
5.10.3 Trusted service
5.10.4 Need for re-identification of pseudonymized data
5.10.5 Pseudonymization service characteristics
6 Protecting privacy through pseudonymization
6.1 Conceptual model of the problem areas
6.2 Direct and indirect identifiability of personal information
6.2.1 General
6.2.2 Person identifying variables
6.2.3 Aggregation variables
6.2.4 Outlier variables
6.2.5 Structured data variables
6.2.6 Non-structured data variables
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out
through ISO technical committees. Each member body interested in a subject for which a technical
committee has been established has the right to be represented on that committee. International
organizations, governmental and non-governmental, in liaison with ISO, also take part in the work.
ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of
electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the
different types of ISO documents should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of
any patent rights identified during the development of the document will be in the Introduction and/or
on the ISO list of patent declarations received (see www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation on the meaning of ISO specific terms and expressions related to conformity assessment,
as well as information about ISO’s adherence to the World Trade Organization (WTO) principles in the
Technical Barriers to Trade (TBT) see the following URL: www.iso.org/iso/foreword.html.
The committee responsible for this document is ISO/TC 215, Health informatics.
Introduction
Pseudonymization is recognized as an important method for privacy protection of personal health
information. Such services may be used nationally, as well as for trans-border communication.
Application areas include, but are not limited to:
— indirect use of clinical data (e.g. research);
— clinical trials and post-marketing surveillance;
— pseudonymous care;
— patient identification systems;
— public health monitoring and assessment;
— confidential patient-safety reporting (e.g. adverse drug effects);
— comparative quality indicator reporting;
— peer review;
— consumer groups;
— field service.
This document provides a conceptual model of the problem areas, requirements for trustworthy
practices, and specifications to support the planning and implementation of pseudonymization services.
The specification of a general workflow, together with a policy for trustworthy operations, serves both as a general guide for implementers and for quality assurance purposes, assisting users of the pseudonymization services in determining their trust in the services provided. This guide will serve
to educate organizations so they can perform pseudonymization services themselves with sufficient
proficiency to achieve the desired degree of quality and risk reduction.
1 Scope
This document contains principles and requirements for privacy protection using pseudonymization
services for the protection of personal health information. This document is applicable to organizations that wish to undertake pseudonymization processes for themselves or to organizations that make a claim of trustworthiness for operations engaged in pseudonymization services.
This document
— defines one basic concept for pseudonymization (see Clause 5),
— defines one basic methodology for pseudonymization services including organizational, as well as
technical aspects (see Clause 6),
— specifies a policy framework and minimal requirements for controlled re-identification (see
Clause 7),
— gives an overview of different use cases for pseudonymization that can be both reversible and
irreversible (see Annex A),
— gives a guide to risk assessment for re-identification (see Annex B),
— provides an example of a system that uses de-identification (see Annex C),
— provides informative requirements for the interoperability of pseudonymization services (see Annex D), and
— specifies a policy framework and minimal requirements for trustworthy practices for the operations
of a pseudonymization service (see Annex E).
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content
constitutes requirements of this document. For dated references, only the edition cited applies. For
undated references, the latest edition of the referenced document (including any amendments) applies.
ISO 27799, Health informatics — Information security management in health using ISO/IEC 27002
3.2
anonymization
process by which personal data (3.37) is irreversibly altered in such a way that a data subject can no
longer be identified directly or indirectly, either by the data controller alone or in collaboration with
any other party
Note 1 to entry: The concept is absolute, and in practice, it may be difficult to obtain.
3.12
cryptographic algorithm
<cipher> method for the transformation of data (3.14) in order to hide its information content, prevent
its undetected modification and/or prevent its unauthorized use
3.13
cryptographic key management
key management
generation, storage, distribution, deletion, archiving and application of keys (3.31) in accordance with a
security policy (3.46)
[SOURCE: ISO 7498‑2:1989, 3.3.33]
3.14
data
reinterpretable representation of information (3.29) in a formalized manner suitable for communication,
interpretation or processing
Note 1 to entry: Data can be processed by humans or by automatic means.
3.22
disclosure
divulging of, or provision of access to, data (3.14)
Note 1 to entry: Whether the recipient actually looks at the data, takes them into knowledge or retains them, is
irrelevant to whether disclosure has occurred.
3.23
encryption
process of converting information (3.29) or data (3.14) into a cipher or code
3.24
healthcare identifier
subject of care identifier
identifier (3.27) of a person for primary use by a healthcare system
3.25
identifiable person
one who can be identified, directly or indirectly, in particular by reference to an identification number
or to one or more factors specific to his physical, physiological, mental, economic, cultural or social
identity
[SOURCE: Directive 95/46/EC]
3.26
identification
process of using claimed or observed attributes of an entity to single out the entity among other entities
in a set of identities
Note 1 to entry: The identification of an entity within a certain context enables another entity to distinguish
between the entities with which it interacts.
3.27
identifier
information (3.29) used to claim an identity, before a potential corroboration by a corresponding
authenticator
[SOURCE: ENV 13608-1:2000, 3.44]
3.28
indirectly identifying data
data (3.14) that can identify a single person only when used together with other indirectly
identifying data
Note 1 to entry: Indirect identifiers can reduce the population to which the person belongs, possibly down to one
if used in combination.
3.29
information
knowledge concerning objects that within a certain context has a particular meaning
[SOURCE: ISO/IEC 2382:2015, 2121271, modified.]
3.30
irreversibility
situation when, for any passage from identifiable to pseudonymous, it is computationally unfeasible to
trace back to the original identifier (3.27) from the pseudonym (3.43)
3.31
key
sequence of symbols which controls the operations of encryption (3.23) and decryption (3.19)
[SOURCE: ISO 7498‑2:1989, 3.3.32]
3.32
linkage of information objects
process allowing a logical association to be established between different information objects
3.33
longitudinal or lifetime personal health record
permanent, coordinated record of significant information, in chronological sequence
Note 1 to entry: It may include all historical data collected or be retrieved as a user designated synopsis of significant
demographic, genetic, clinical and environmental facts and events maintained within an automated system.
3.41
processor
natural or legal person, public authority, agency or any other body that processes personal data (3.37)
on behalf of the controller (3.10)
Note 1 to entry: See Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995 on the
protection of individuals with regard to the processing of personal data and on the free movement of such data.
3.42
pseudonymization
particular type of de-identification (3.20) that both removes the association with a data subject (3.18)
and adds an association between a particular set of characteristics relating to the data subject and one
or more pseudonyms (3.43)
3.43
pseudonym
personal identifier (3.36) that is different from the normally used personal identifier and is used with
pseudonymized data to provide dataset coherence linking all the information about a subject, without
disclosing the real world person identity.
Note 1 to entry: This may be either derived from the normally used personal identifier in a reversible or
irreversible way or be totally unrelated.
Note 2 to entry: Pseudonym is usually restricted to mean an identifier that does not allow the direct derivation of
the normal personal identifier. Such pseudonymous information is thus functionally anonymous. A trusted third
party may be able to obtain the normal personal identifier from the pseudonym.
3.44
recipient
natural or legal person, public authority, agency or any other body to whom data (3.14) are disclosed
3.45
secondary use of personal data
uses and disclosures (3.22) that are different than the initial intended use for the data (3.14) collected
3.46
security policy
plan or course of action adopted for providing computer security
[SOURCE: ISO/IEC 2382:2015, 2126246]
3.47
trusted third party
security authority, or its agent, trusted by other entities with respect to security-related activities
[SOURCE: ISO/IEC 18014‑1:2008, 3.20]
4 Abbreviated terms
IP Internet Protocol
5.2 General
De-identification is the general term for any process of reducing the association between a set of identifying data and the data subject, for one or more intended uses of the resulting data-set.
Pseudonymization is a subcategory of de-identification. The pseudonym is the means by which
pseudonymized data are linked to the same person or information systems without revealing
the identity of the person. De-identification inherently can limit the utility of the resulting data.
Pseudonymization can be performed with or without the possibility of re-identifying the subject of the
data (reversible or irreversible pseudonymization). There are several use case scenarios in healthcare
for pseudonymization with particular applicability in increasing electronic processing of patient data,
together with increasing patient expectations for privacy protection. Several examples of these are
provided in Annex A.
It is important to note that as long as there are any pseudonymized data, there is some risk of
unauthorized re-identification. This is not unlike encryption, in that brute force can crack encryption,
but the objective is to make it so difficult that the cost is prohibitive. There is less experience with de-
identification than encryption so the risks are not as well understood.
5.3.1 General
The de-identification process should consider the security and privacy controls that will manage the
resulting data-set. It is rare to lower the risk so much that the data-set needs no ongoing security
controls.
5.3.2 Pseudonymization
Pseudonymization is the process and set of tools used where longitudinal consistency is needed, grouping all of the data for a patient together under a pseudonym. It can also be used to assure that, each time data are extracted into a de-identified set, new entries are associated with the same pseudonym. In pseudonymization, the algorithm used might be intentionally reversible or intentionally non-reversible. A reversible scheme might use a secret lookup table that, where authorized, can be used to discover the original identity. In a non-reversible scheme, a temporary table might be used during the process, but it is destroyed when the process completes.
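The two schemes can be illustrated with a short sketch. The following Python fragment is illustrative only and not normative: the class interface, the random token length and the keyed-hash construction are assumptions chosen for the example.

import hashlib
import hmac
import secrets

# Illustrative sketch only: a reversible scheme backed by a secret lookup table, and an
# irreversible scheme based on a keyed hash of the identifier.

class ReversibleLookupPseudonymizer:
    """Secret lookup table; an authorized holder of the table can re-identify."""

    def __init__(self):
        self._to_identity = {}     # pseudonym -> original identifier (kept secret)
        self._to_pseudonym = {}    # original identifier -> pseudonym (gives consistency)

    def pseudonymize(self, identifier: str) -> str:
        if identifier not in self._to_pseudonym:
            pseudonym = secrets.token_hex(8)   # random value, unrelated to the identifier
            self._to_pseudonym[identifier] = pseudonym
            self._to_identity[pseudonym] = identifier
        return self._to_pseudonym[identifier]

    def reidentify(self, pseudonym: str) -> str:
        # Only to be invoked by an authorized entity under controlled circumstances.
        return self._to_identity[pseudonym]

def irreversible_pseudonym(identifier: str, secret_key: bytes) -> str:
    # Keyed hash: the same identifier always yields the same pseudonym, but the
    # identifier cannot feasibly be derived from the pseudonym.
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()[:16]

In this sketch, destroying the secret key when the extraction completes plays a role comparable to destroying the temporary table mentioned above.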
5.3.3 Anonymization
Anonymization is the process and set of tools used where no longitudinal consistency is needed.
The anonymization process is also used, where pseudonymization has been applied, to address the remaining data attributes. Anonymization utilizes tools like redaction, removal, blanking, substitution,
randomization, shifting, skewing, truncation, grouping, etc. Anonymization can lead to a reduced
possibility of linkage.
Each element allowed to pass should be justified. Each element should present the minimal risk, given
the intended use of the resulting data-set. Thus, where the intended use of the resulting data-set does
not require fine-grain codes, a grouping of codes might be used.
The de-identification process addresses three kinds of data: direct identifiers, which by themselves identify
the patient; indirect identifiers, which provide correlation when used with other indirect or external
knowledge; and non-identifying data, the rest of the data.
Usually, a de-identification process is applied to a data-set, made up of entries that have many attributes.
For example, a spreadsheet made up of rows of data organized by column.
The de-identification process, including pseudonymization and anonymization, is applied to all the data. Pseudonymization is generally used on direct identifiers, but might be used on indirect identifiers, as appropriate to reduce risk while maintaining the longitudinal needs of the intended use of the resulting data-set. Anonymization tools are used on all forms of data, as appropriate to reduce risk.
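As an illustration of how the process can be driven per attribute, the sketch below applies pseudonymization to a direct identifier, generalization to indirect identifiers and leaves non-identifying payload untouched. The attribute names, rules and generalizations are assumptions chosen for the example, not requirements of this document.

from datetime import date

# Illustrative only: a per-attribute rule table driving one de-identification pass.
# "pseudonymizer" is any object with a pseudonymize() method, such as the lookup-table
# sketch shown earlier under 5.3.2.

def year_of_birth(d: date) -> str:
    return str(d.year)                      # group date of birth to year of birth

def coarse_postal_code(code: str) -> str:
    return code[:3]                         # keep only a coarse regional prefix

RULES = {
    "patient_id":  lambda value, ps: ps.pseudonymize(value),    # direct identifier
    "name":        lambda value, ps: None,                       # direct identifier: remove
    "birth_date":  lambda value, ps: year_of_birth(value),       # indirect identifier: generalize
    "postal_code": lambda value, ps: coarse_postal_code(value),  # indirect identifier: generalize
    "lab_result":  lambda value, ps: value,                       # payload: keep
}

def deidentify_record(record: dict, pseudonymizer) -> dict:
    released = {}
    for attribute, value in record.items():
        rule = RULES.get(attribute, lambda v, ps: None)           # unknown attributes are dropped
        new_value = rule(value, pseudonymizer)
        if new_value is not None:
            released[attribute] = new_value
    return released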
According to Reference [18], “personal data” shall mean any information relating to an identified or
identifiable natural person (“data subject”); an identifiable person is one who can be identified, directly
or indirectly, in particular by reference to an identification number or to one or more factors specific to
his physical, physiological, mental, economic, cultural or social identity.
This concept is addressed in other national legislation with consideration for the same principles found
in this definition (e.g. HIPAA).
Key to figure: 1 set of data subjects; 2 set of characteristics
This subclause describes an idealized concept of identification and de-identification. It is assumed that there are no data outside the model shown in Figure 2 that may be linked with data inside the model to achieve (indirect) identification of data subjects.
In 5.4.1, potential information sources outside the data model will be taken into account. This is
necessary in order to discuss re-identification risks. Information and communication technology
projects never picture data that are not used within the model when covering functional design
aspects. However, when focusing on identifiability, critics bring in information that could be obtained
by an attacker in order to identify data subjects or to gain more information on them (e.g. membership
of a group).
As depicted in Figure 1, a data subject has a number of characteristics (e.g. name, date of birth, medical
data) that are stored in a medical database and that are personal data of the data subject. A data
subject is identified within a set of data subjects if they can be singled out. That means that a set of
characteristics associated with the data subject can be found that uniquely identifies this data subject.
In some cases, a single characteristic is sufficient to identify the data subject (e.g. a unique national registration number). In other cases, more than one characteristic is needed to single out a data subject, such as when an address is shared by family members living at the same address.
Some associations between characteristics and data subjects are more persistent in time (e.g. a date of
birth, location of birth) than others (e.g. an e-mail address).
Key to figure: 1 identifying data; 2 payload data; 3 personal data; 4 set of characteristics
From a conceptual point of view, personal data can be split up into two parts according to identifiability
criteria (see Figure 3):
— payload data: the data part, containing characteristics that do not allow unique identification of
the data subject; conceptually, the payload contains anonymous data (e.g. clinical measurements,
machine measurements);
— identifying data: the identifying part that contains a set of characteristics that allow unique
identification of the data subject (e.g. demographic data).
Note that the conceptual distinction between “identifying data” and “payload data” can lead to
contradictions. This is the case when directly identifying data are considered “payload data”. Any
pseudonymization method should strive to reduce the level of directly identifying data, for example, by
aggregating these data into groups. In particular cases (e.g. date of birth of infants), where this is not
possible, the risk should be pointed out in the policy document. A following section of this document
deals with the splitting of the data into the payload part and the identifying part from a practical point
of view, rather than from a conceptual point of view. From a conceptual point of view, it is sufficient
that it is possible to obtain this division. It is important to note that the distinction between identifying characteristics and payload is not absolute. Some data that are also identifying might be needed for the research (e.g. year and month of birth). These distinctions are covered further on.
The practice and advancement of medicine require that elements of private medical records be released
for teaching, research, quality control and other purposes. For both scientific and privacy reasons,
these record elements need to be modified to conceal the identities of the subjects.
There is no single de-identification procedure that will meet the diverse needs of all the medical uses
while providing identity concealment. Every record release process shall be subject to risk analysis to
evaluate the following:
a) the purpose for the data release (e.g. analysis);
b) the minimum information that shall be released to meet that purpose;
c) what the disclosure risks will be (including re-identification);
d) the information classification (e.g. tagging or labelling);
e) what release strategies are available.
From the details of the release process and the risk analysis, a strategy of identification concealment shall be determined. This determination shall be performed for each new release process,
although many different release processes may select a common release strategy and details. Most
teaching files will have common characteristics of purpose and minimum information content. Many
clinical drug trials will have a common strategy with varying details. De-identification meets more
needs than just confidentiality protection. There are often issues such as single-blinded and double-
blinded experimental procedures that also require de-identification to provide the blinding. This will
affect the decision on release procedures.
This subclause provides the terminology used for describing the concealment of identifying information.
Figure 4 — Anonymization (key: 1 data subject; 2 set of characteristics)
Anonymization (see Figure 4) is the process that removes the association between the identifying data
set and the data subject. This can be done in two different ways:
— by removing or transforming characteristics in the associated characteristics-data-set so that the
association is not unique anymore and relates to more than one data subject and no direct relation
to an individual remains;
— by increasing the population in the set of data subjects so that the association between the data set and the data subject is not unique anymore and no direct relation to an individual remains.
Figure 5 — Pseudonymization (key: 1 pseudonym(s); 2 set of characteristics)
Pseudonymization (see Figure 5) removes the association with a data subject and adds an association
between a particular set of characteristics relating to the data subject and one or more pseudonyms.
From a functional point of view, pseudonymous data sets can be associated as the pseudonyms allow
associations between sets of characteristics, while disallowing association with the data subject. As
a result, it becomes possible, for example, to carry out longitudinal studies to build cases from real
patient data while protecting their identity.
In irreversible pseudonymization, the conceptual model does not contain a method to derive the
association between the data-subject and the set of characteristics from the pseudonym.
Key to Figure 6: 1 data subject; 2 pseudonyms; 3 set of characteristics; 4 a) derived from; 5 b) derived from
In reversible pseudonymization (see Figure 6), the conceptual model includes a way of re-associating
the data-set with the data subject.
There are two methods to achieve this goal:
a) derivation from the payload; this could be achieved by, for instance, encrypting identifiable
information along with the payload;
b) derivation from the pseudonym or via a lookup-table.
Reversible pseudonymization can be established in several ways whereby it is understood that
the reversal of the pseudonymization should only be done by an authorized entity in controlled
circumstances. The policy framework regarding re-identification is described in Clause 7. Compared to irreversible pseudonymization, reversible pseudonymization typically requires increased protection of the entity performing the pseudonymization.
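Method a) can be sketched as follows. The example assumes the third-party Python "cryptography" package and uses its Fernet construction purely for illustration; the field names are invented for the example, and the consistent pseudonym used for linkage would still be produced separately (e.g. by a keyed hash), since an encrypted value of this kind changes on every encryption.

from cryptography.fernet import Fernet

# Illustrative sketch of method a): the identifier travels with the payload in encrypted
# form, so only the holder of the key can re-associate the record with the data subject.

key = Fernet.generate_key()        # held by the authorized re-identification authority
cipher = Fernet(key)

def protect_record(record: dict, pseudonym: str) -> dict:
    released = dict(record)
    identifier = released.pop("patient_id")
    released["pseudonym"] = pseudonym                              # consistent linkage value
    released["recovery_blob"] = cipher.encrypt(identifier.encode("utf-8"))
    return released

def recover_identifier(released: dict) -> str:
    # Reversal only by the entity holding the key, under the Clause 7 policy framework.
    return cipher.decrypt(released["recovery_blob"]).decode("utf-8")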
Anonymized data differ from pseudonymized data in that pseudonymized data retain a means of grouping data together based on criteria derived from the personal data from which they originate.
5.5.1 Rationale
Subclause 5.4 depicts the conceptual approach to pseudonymization, where concepts such as “associated”, “identifiable”, “pseudonymous”, etc. are considered absolute. In practice, the risk of re-identification of data sets is often difficult to assess. This subclause refines the concepts of pseudonymization and unwanted/unintended identifiability. As a starting point, reference is made to the European data protection directive.
There are many regulations in many jurisdictions that require creation of de-identified data for various
purposes. There are also regulations that require protection of private information without specifying
the mechanisms to be used. These regulations generally use effort and difficulty related phrases,
which is appropriate given the rapidly changing degree of difficulty associated with de-identification
technologies.
Statements such as “all the means likely reasonable” and “by any other person” are still too vague. Since
the definition of “identifiable” and “pseudonymous” depend upon the undefined behaviour (“all the
means likely reasonable”) of undefined actors (“by any other person”), the conceptual model in this
document should include “reasonable” assumptions about “all the means” likely deployed by “any other
person” to associate characteristics with data subjects.
The conceptual model will be refined to reflect differences in identifiability and the conceptual model
will take into account “observational databases” and “attackers”.
5.5.2.1 General
Current definitions lack precision in the description of terms such as “pseudonymous” or “identifiable”. It is unrealistic to assume that all imprecision in the terminology can be removed, because pseudonymization is always a matter of statistics. However, the level of risk of unauthorized re-identification can be estimated. The scheme for the classification of this risk should take into account the likely identifying capability of the data, as well as a clear understanding of the entities in the model and their relationship to each other. The risk model may, in some cases, be limited to minimizing the risk of accidental exposure or to eliminating bias in double-blinded studies, or it may be extended to the potential for malicious attacks. The objective of this estimation shall be that privacy policies, for instance, can shift the “boundaries of imprecision” and define, within a concrete context, what is understood by “identifiability”; as a result, liabilities will be easier to assess.
A classification is provided below, but further refinement is required, especially since quantification
of re-identification risks requires the establishment of mathematical models. Running one record through one algorithm, no matter how good the algorithm, still carries a risk of re-identification. A critical step in the risk assessment process is the analysis of the resulting de-identified data set for any
static groups that may be used for re-identification. This is particularly important in cases where some
identifiers are needed for the intended use. This document does not specify such mathematical models;
however, informative references are provided in the Bibliography.
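This document does not prescribe a particular model, but screening a populated data set for small static groups can be sketched as follows (a k-anonymity style check); the quasi-identifier names and the threshold k are assumptions chosen for the example.

from collections import Counter

# Illustrative only: count how many records share each combination of quasi-identifiers
# and report combinations smaller than a chosen threshold k. Such small groups are
# candidates for further generalization or suppression before release.

def small_groups(records, quasi_identifiers, k=5):
    counts = Counter(tuple(r.get(attr) for attr in quasi_identifiers) for r in records)
    return {group: n for group, n in counts.items() if n < k}

# Example use on an already de-identified data set:
# risky = small_groups(records, ("year_of_birth", "postal_prefix", "gender"), k=5)
# A non-empty result indicates equivalence classes that could support re-identification.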
Instead of an idealized conceptual model that does not take into account data sources (known or
unknown) outside the data model, assumptions shall be made in the re-identification risk assessment
method on what data are available outside the model.
A real-life model should take into account both directly and indirectly identifying data. Each use case
shall be analysed to determine the information requirements for identifiers and to determine which
identifiers can be simply blanked, which can be blurred, which are needed with full integrity, and which
will need to be pseudonymized.
Three levels of the pseudonymization procedure, ensuring a certain level of privacy protection, are
specified. These assurance levels consider risks of re-identification based upon consideration of both
directly and indirectly identifying data. The assurance levels consider the following:
— level 1: the risks associated with the person identifying data elements;
— level 2: the risks associated with aggregating data variables;
— level 3: the risks associated with outliers in the populated database.
The re-identification risk assessment at all levels shall be established as a re-iterative process with
regular re-assessments (as defined in the privacy policies). As experience is gained and the risk model
is better understood, privacy protection and risk assessment levels should be reviewed.
Apart from regular re-assessments, reviews can also be triggered by events, such as a change in the
captured data or introduction of new observational data into the model.
When referring to the assurance levels, the basic denomination of the levels as 1, 2 and 3 could be
complemented by the number of revisions (e.g. level 2+ for a level 2 that has been revised; the latest revision date should be mentioned and a history of incidents and revisions kept up-to-date). The
requested assurance level dictates what kind of technical and organizational safeguards need to be
implemented to protect the privacy of the subject of data. A low level of pseudonymization will require
more organizational measures to protect the privacy of data than will a high level of pseudonymization.
5.5.2.2 Assurance level 1 privacy protection: removal of clearly identifying data or easily obtainable
indirectly identifying data.
A first, intuitive level of anonymity can be achieved by applying rules of thumb. This method is
usually implicitly understood when pseudonymized data are discussed. In many contexts, especially
when only attackers with poor capabilities have to be considered, this first level of anonymity may
provide a sufficient guarantee. Identifiable data denotes that the information contained in the data
itself is sufficient in a given context to pinpoint an entity. Names of persons are a typical example. 6.2.1 provides a specification of data elements that should be considered for removal or aggregation to assure an anonymized data set.
5.5.2.3 Assurance level 2 privacy protection: considering attackers using external data.
The second level of privacy protection can be achieved when taking into account the global data model
and the data flows inside the model. When defining the procedures to achieve this level, a static risk
analysis that checks for re-identification vulnerabilities by different actors should be performed.
Additionally, the presence of attackers who combine external data with the pseudonymized data to
identify specific data sets should be considered. The available external data may depend on the legal
situation in different countries and on the specific knowledge of the attacker. As an example, the required procedures may include the removal of absolute time references. A reference time marker “T” is defined as, for example, the admission of a patient for an episode of care; other events, such as discharge, are then expressed with reference to this time marker. An attacker is an entity that gathers data (authorized or unauthorized) with the aim of attributing the gathered data to data subjects in an unauthorized way and thus obtaining information to which the attacker is not entitled. From a risk analysis point of view, data gathered and used by an attacker are called “observational data”.
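The removal of absolute time references can be sketched as follows; the field names and the chosen granularity are assumptions for the example.

from datetime import datetime

# Illustrative only: express event times as offsets from the per-patient reference
# marker "T" (here, the admission time) instead of releasing absolute timestamps.

def relativize_times(events, admission: datetime):
    released = []
    for event in events:
        shifted = dict(event)
        offset = event["timestamp"] - admission
        shifted["offset_hours"] = round(offset.total_seconds() / 3600, 1)
        del shifted["timestamp"]            # the absolute time reference is not released
        released.append(shifted)
    return released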
Note that the disallowed or undesired activity of the attacker is not necessarily the gathering of the data, but rather the attempt to attribute the data to a data subject and consequently gain information about a data subject in an unauthorized way.
A risk analysis model may include assumptions about attacks and attackers. For example, in some
countries, it may be possible to legally obtain discharge data by entities that are not implicitly involved
in the care or associated administration of patients. The risk analysis model may take into account the
likelihood of the availability of specific data sets.
From a conceptual point of view, an attacker brings data elements into the model that in the ideal world
would not exist.
A policy document should contain an assessment of the possibility of attacks in the given context.
The re-identification risk can be seriously influenced by the data itself, for example, by the presence of
outliers or rare data. Outliers or rare data can indirectly lead to identification of a data subject. Outliers
do not necessarily consist of medical data. For instance, if, on a specific day, only one patient with a
specific pathology has visited a clinic, then observational data on who has visited the clinic that day can
indirectly lead to identification.
When assessing a pseudonymization procedure, a static model-based risk analysis alone cannot quantify the vulnerability due to the content of the databases; therefore, running regular risk analyses on populated models is required to provide a higher level of anonymity.
5.6.1 General
Decisions to protect the identity of the subject of care may be associated with the following:
— legal requirements for privacy protection;
— trust relationships between the health professional and the subject of care associated with medical
secrecy principles;
— responsible handling of sensitive disease registries and other public health information resources;
— provision of minimum necessary disclosures of identifiers in the provision of care (e.g. laboratory
testing);
— privacy protection to enable indirect use of clinical data for research purposes. Be aware that in some jurisdictions (e.g. in Germany), the indirect use of subject of care data requires informed consent when the data are only pseudonymized and not fully anonymized.
Continuity of care requires uniform identification of patients and the ability to link information across
different domains. Where data are pseudonymized in the context of clinical care, there is a risk of misidentification or missed linkages of the subject of care across multiple domains. In cases where
pseudonymization is applied in a direct care environment, consideration shall be given to patient
consent for those cases where the patient does not want pseudonymization for safety purposes.
Pseudonymization may also be used to protect the identity of health professionals for a number of
purposes including the following:
— peer review;
— reporting of medical mishaps or adverse drug events;
— care process analysis;
— business analysis;
— physician profiling.
Such protections are subject to local jurisdiction legal requirements, which may be distinct from
protection requirements of organization identities.
In healthcare, the security of devices, in support of the confidentiality of patient data, is required for privacy protection. For patients, a particular consideration is implanted medical devices. Identifiable data on the device can be directly associable to the patient, as can data on other medical and personal devices (e.g. respiratory assistive devices). As such, device identity or device data may be used to identify a person. Healthcare devices assigned to a healthcare professional or employee shall also be considered in the identification risk assessment, as they can identify the provider or organization and, hence, the patient.
According to the paradigm followed in this document, it should be possible to split data into data that
can lead to identification and data that carry the medical information of interest. This assessment is
fully dependent on the level of privacy protection that is targeted.
Observational data, which are gathered and used by an attacker, reflect various properties
of data-subjects recorded with the aim of describing the data subjects as completely as possible with
the intent of re-identifying or identifying membership in certain classifications at a later stage.
Anonymized data are data that do not contain information that can be used to link them with the data subject with whom the data are associated. Such linkage could, for instance, be obtained through names, date of birth, registration numbers or other identifying information.
5.8.1 General
Using health data for research is usually a secondary use of health data, after or beside the primary use, which is patient treatment. In many jurisdictions, this may require the informed consent of the
patient. It is a fundamental principle of data protection that identifiable personal data should only be
processed as far as is necessary for the purpose at hand. There is a clear interest for organizations
performing research to pseudonymize or even anonymize data, where possible. Concerns for privacy of
individuals, particularly in the area of health information, triggered the development of new regulatory
requirements to assure privacy rights. Researchers will need to comply with these rulings and in many
cases, modify traditional methods for sharing individually identifiable health information.
Medical privacy and patient autonomy are crucial, but many traditional approaches to protection are not easily scalable to the increasing complexity of data, information flows and opportunities for enhanced value from merged information sets. Classic informed consent for each data use may be difficult or impossible to obtain. For anonymized data, however, research may proceed without the data subject being affected or involved; this is not the case with pseudonymized data.
Trends and opportunities to accumulate, merge and reuse health information collected and gathered for
secondary use (e.g. research) will continue to expand. Privacy enhancing technologies are well-suited
to address the security and confidentiality implications surrounding this growth. Many important data
applications do not require direct processing of identifiable personal information. Valuable analysis can
be carried out on data without ever needing to know the identity of the actual individuals concerned.
Pseudonymization may be used in the generation of research data. In this case, there is optimal
opportunity to assess risks to privacy inherent in the research study and to mitigate these risks
through the anonymization techniques described in this document. Research uses also more clearly facilitate consent and the definition of rules surrounding the circumstances and reasons for intentional re-identification.
Where permitted by jurisdiction, pseudonymization may be used to protect the privacy of individuals
whose personal health information is to be used for secondary use. Secondary uses are those that
are different than the initial intended use for the data collected. Each secondary use shall undergo a
privacy threat assessment and define mitigations to the identified risks. Assumptions shall not be made
as to the sufficiency of an existing risk assessment and risk mitigation to extend the data resource to
additional secondary use.
5.9.1 General
Identifying data are data that contain information allowing unique identification of the data subject (e.g. demographic data).
outside the scope of this document. However, at the core of some identity management solutions will be a pseudonymization solution.
5.10.1 General
Victims of violence, who are diagnosed or treated, often require extra shielding by hospital personnel
as long as their identification poses specific threats. Caregivers in direct contact with the patient can
identify the person but back-office personnel cannot.
Similar issues often arise when publicly well-known persons or persons otherwise known to the
healthcare community, often wrongly denoted as “VIPs”, are admitted (e.g. politicians, captains of
industry, etc.).
There is no general consensus regarding genetic information and there are a variety of requirements
based on the legal jurisdiction. See Annex F for further considerations.
In the case where the pseudonymization service is required to synchronize pseudonyms across
multiple entities or enterprises, a trusted service provider may be employed. Trusted services may be
implemented through numerous options, including commercial entities, membership organizations or
government entities. Providers of trusted services may be governed through legislation or certification
requirements in various jurisdictions.
Pseudonymization separates out personally identifying data from payload data by assigning a coded
value to the sensitive data before splitting the data out. The reversible approach maintains a connection
between payload data and personal identifiers, but can allow for re-identification under prescribed
circumstances and protections. The irreversible approach does not maintain any connection between
payload data and personal identifiers and consequently no re-identification is applicable.
This approach serves researchers well in that it provides a means of cleansing research data while
retaining the ability to reference source identifiers for the many (controlled) circumstances under which
such information may be needed. Such circumstances include the following coded values. This document
defines a vocabulary. The vocabulary identification is: ISO (1) standard (0) pseudonymization (25237)
re-identification purpose (1). The codes in this vocabulary are as follows:
a) data integrity verification/validation;
b) data duplicate record verification/validation;
c) request for additional data;
d) link to supplemental information variables;
e) compliance audit;
f) communicate significant findings;
g) follow-up research.
These values should be leveraged for audit purposes when facilitating authorized re-identification.
Such re-identification methods shall be well-secured, and can be done through the use of a trusted
service for the generation and management of the decoding keys. The criteria for re-identification can
be defined, automated and securely managed using the trusted services.
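For illustration, the purpose codes a) to g) above can be carried in an audit record each time an authorized re-identification is performed; the enum member names and the record layout below are assumptions for the example, not normative content.

from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum

# Illustrative only: representing the re-identification purpose codes listed above so
# that every authorized re-identification can be logged with its purpose for audit.

class ReidentificationPurpose(Enum):
    DATA_INTEGRITY_VERIFICATION = "a"
    DUPLICATE_RECORD_VERIFICATION = "b"
    REQUEST_FOR_ADDITIONAL_DATA = "c"
    LINK_TO_SUPPLEMENTAL_VARIABLES = "d"
    COMPLIANCE_AUDIT = "e"
    COMMUNICATE_SIGNIFICANT_FINDINGS = "f"
    FOLLOW_UP_RESEARCH = "g"

@dataclass
class ReidentificationAuditRecord:
    requester: str
    pseudonym: str
    purpose: ReidentificationPurpose
    timestamp: datetime

def audit_reidentification(requester: str, pseudonym: str,
                           purpose: ReidentificationPurpose) -> ReidentificationAuditRecord:
    return ReidentificationAuditRecord(requester, pseudonym, purpose,
                                       datetime.now(timezone.utc))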
6.2.1 General
Personal data may be directly identifiable or indirectly identifiable. The data are considered directly identifiable in those cases where an individual can be identified through a data attribute, or through linkage by that attribute to a publicly accessible resource, or to a resource with access restricted under an alternative policy, that contains the identity. This would include cross-reference with well-known identifiers (e.g. telephone number, address) or numeric identifiers (e.g. order numbers, study numbers, document OIDs, laboratory result numbers). An indirect identifier is an attribute that may be used in combination with other indirectly identifying attributes to uniquely identify the individual (e.g. postal code, gender, date of birth). This would also include protected indirect identifiers (e.g. procedure date, image date), which may have more restricted access but can be used to identify the patient.
Demographic data can be both direct and indirect identifiers and should be removed where possible, or
aggregated at a threshold specified by the domain or jurisdiction. Where these data need to be retained,
risk assessment of unauthorized re-identification and appropriate mitigations to identified risks of the
resulting data resource shall be conducted. These demographic data include the following:
— language spoken at home;
— person’s communication language;
— religion;
— ethnicity;
— person gender;
— country of birth;
— occupation;
— criminal history;
— person legal orders;
— other addresses (e.g. business address, temporary addresses, mailing addresses);
— birth plurality (second or later delivery from a multiple gestation).
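Aggregation at a threshold can be sketched as follows; the attribute names, the threshold and the replacement label are assumptions for the example, and the actual threshold should come from the applicable domain or jurisdiction.

from collections import Counter

# Illustrative only: values of a demographic attribute that occur fewer than `threshold`
# times in the data set are collapsed into a generic category, so that rare values
# (e.g. an uncommon occupation or country of birth) cannot single out an individual.

def aggregate_rare_values(records, attribute, threshold=10, other_label="other"):
    counts = Counter(record.get(attribute) for record in records)
    for record in records:
        if counts[record.get(attribute)] < threshold:
            record[attribute] = other_label

# Example use:
# aggregate_rare_values(records, "occupation")
# aggregate_rare_values(records, "country_of_birth")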
A policy document shall be generated containing an assessment of the possibility of attacks in the given
context as a risk assessment against level 2 privacy protection. The identified risks shall be coupled
with a risk mitigation strategy.
Structured data give some indication of what information can be expected and where it can be
expected. It is then up to re-identification risk analysis to make assumptions about what can lead to
(unacceptable) identification risks, ranging from simple rules of thumb up to analysis of populated
databases and inference deductions. In “free text”, as opposed to structured data, automated analysis for privacy purposes with a guaranteed outcome is not possible.
6.2.6.1 General
In the case of non-structured data variables, the pseudonymization decision of data separation into
identifying and payload data remains the central issue. Freeform text shall be considered suspect and
thus should be considered for removal. Non-structured data variables shall be subject to the following:
— single out what, according to the privacy policy (and desired level of privacy protection), is identifiable information;
— delete data that are not needed;
— state in policies that the free text part shall not contain directly identifiable information;
— keep together as payload what is considered to be non-identifiable according to the policy.
Anonymity of freeform text cannot be assured with current pseudonymization approaches. All freeform text shall be subject to risk analysis and a mitigation strategy for identified risks. Re-identification risks
of retained freeform text may be mitigated through the following:
— implementation of policy surrounding freeform text content requiring that the freeform text data
shall not contain directly identifiable information (e.g. patient numbers, names);
— verification that freeform content is unlikely to contain identifying data (e.g. where freeform text is
generated from structured text);
— revising, rewriting or otherwise converting the data into coded form.
As parsing and natural language processing “data scrubbing” and pseudonymization algorithms
progress, re-identification risks associated with freeform text may merit relaxation of this assertion.
Freeform text should be revised, rewritten or otherwise converted into coded form.
As with freeform text, non-parseable data, such as voice fields, should be removed.
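A simple, rule-based screening step can support the verification described above, although it gives no guarantee of anonymity; the pattern and the list of known identifiers below are assumptions for the example.

import re

# Illustrative only: flag freeform text that appears to contain obvious direct identifiers,
# such as long digit runs (candidate record numbers) or values already known from the
# structured part of the record. A non-empty result means the text needs review, revision
# or removal; an empty result is not proof of anonymity.

ID_PATTERN = re.compile(r"\b\d{6,}\b")

def flag_freeform_text(text: str, known_identifiers) -> list:
    findings = [match.group() for match in ID_PATTERN.finditer(text)]
    for value in known_identifiers:          # e.g. the patient's name or patient number
        if value and value.lower() in text.lower():
            findings.append(value)
    return findings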
Some medical data contain identifiable information within the data (e.g. a radiology image with
patient identifiers on image). Mitigations of such identifiable data in the structured and coded DICOM
header should be in accordance with DICOM PS3.15:2016, Annex E. DICOM (ISO 12052) has defined
recommended de-identification processes for DICOM SOP Instances (documents) for some common
situations. It defines a list of different de-identification algorithms that might be applied. Then it
identifies some common use situations and characteristics, e.g. “need to retain device identification”.
For each standard DICOM attribute (data element), it then recommends the algorithm that is most likely
appropriate for that attribute in that situation.
These assignments are expected to be adjusted when appropriate, but providing a starting point for
typical situations greatly reduces the work involved in defining a de-identification process. Additional
risk assessment shall be considered for identifiable characteristics of the image or notations that are
part of the image.
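A minimal sketch using the third-party pydicom library is shown below; it blanks only a handful of attributes and inserts a pseudonym, whereas a conformant process follows the full attribute-level profiles of DICOM PS3.15, Annex E, and also addresses identifiers burned into the pixel data.

import pydicom

# Illustrative only: blank a few directly identifying header attributes and insert a
# pseudonym. This is not a complete DICOM PS3.15, Annex E de-identification and does
# not touch identifying text burned into the image pixels.

def pseudonymize_dicom(path_in: str, path_out: str, pseudonym: str) -> None:
    dataset = pydicom.dcmread(path_in)
    dataset.PatientName = "PSEUDONYMIZED"
    dataset.PatientID = pseudonym
    dataset.PatientBirthDate = ""
    dataset.remove_private_tags()            # private tags often carry identifying content
    dataset.save_as(path_out)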
It should be recognized that pseudonymization cannot fully protect data as it does not fully address
inference attacks. Pseudonymization and anonymization services shall supplement practices with risk
assessment, risk mitigation strategies and consent policies or other data analysis/pre-processing/post-
processing. The custodian of pseudonymized repositories shall be responsible for reviewing data repositories for inference risk and for protecting against disclosure of single-record results. The information
source shall be responsible for pre-viewing/pre-processing the source data disclosed to protect
the disclosed data from inference based upon outliers, embedded identifiable data or other such
unintentional disclosures. For more details on how to conduct an inference risk assessment, see
Annex B.
There is always the risk that pseudonymized data can be linked to the data subject. In light of this risk,
the gathered data should be considered “personal data” and should be used only for the purposes for
which it was collected. In many countries, legislation requires protection of pseudonymized data in the
same manner as identifying data.
7 Re-identification process
7.1 General
Two distinct contexts of re-identification of pseudonymized information shall be considered:
— re-identification as part of the normal processing;
— re-identification as an exceptional event.
7.3 Exception
When re-identification is an exception to the standard way of data processing, the re-identification
process shall require
a) specific authentication procedures, and
b) exceptional interventions by the pseudonymization service provider.
When re-identification of de-identified data is considered the exception to the rule, the security policy
shall describe the circumstances that can lead to re-identification.
The data processing security policy document should define the cases that can be foreseen and should
cover the following.
— Each case should be described and one or more scenarios for re-identification per case should be
described.
Annex A
(informative)
A.1 General
This annex presents a series of high-level healthcare cases or “scenarios” representing core business
and technical requirements for pseudonymization services that will support a broad cross-section of
the healthcare industry.
General requirements are presented first, speaking of basic privacy and security principles and
fundamental needs of the healthcare industry. The document then details each scenario as follows:
a) a description of the scenario or healthcare situation requiring healthcare pseudonymization
services;
b) resulting business and technical requirements that a pseudonymization service shall provide.
Workflow/events/actions
a) Submit order to health information system (HIS):
1) the placer of the order authenticates towards the HIS;
2) the placer of the order submits the order with the hospital unique ID number of the data subject
to the HIS;
3) the placer of the order checks the order against policies (e.g. recipient not allowed to receive
identifiable data, VIP, …) and decides on privacy protection measures;
b) Pseudonymize:
1) the hospital information system invokes the pseudonymization service with, as input, the
hospital unique ID number;
2) the pseudonymization service (PS) processes the hospital unique ID number;
3) the PS returns the pseudonym to the HIS;
c) The HIS sends the order with the pseudonym to the filler:
1) establish communication;
2) message sent;
3) acknowledgement received;
d) The order is processed by the filler of the order using the pseudonym:
1) (possible comparative analysis performed by specialist);
e) The filler of the order submits the result to the HIS with the pseudonym:
1) establish communication;
2) message sent;
3) acknowledgement received;
f) Re-identify result:
1) the HIS submits the pseudonym to the pseudonymization services;
2) authenticated user (HIS) is verified against reverse ID policy;
3) the PS processes the pseudonym;
4) the PS sends the real ID to the HIS;
g) The HIS inserts the result with the hospital ID into the HCR.
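The pseudonymize and re-identify steps [b) and f)] can be sketched as follows; the in-memory mapping and the names used are illustrative assumptions, whereas a real pseudonymization service would enforce authentication, the reverse identification policy and secure storage of the mapping.

import secrets

class PseudonymizationService:
    """Toy in-memory stand-in for the pseudonymization service (PS)."""

    def __init__(self):
        self._forward = {}   # hospital ID -> pseudonym
        self._reverse = {}   # pseudonym -> hospital ID

    def pseudonymize(self, hospital_id):
        if hospital_id not in self._forward:
            psn = "PSN-" + secrets.token_hex(8)
            self._forward[hospital_id] = psn
            self._reverse[psn] = hospital_id
        return self._forward[hospital_id]

    def re_identify(self, pseudonym, requester_authorized):
        # step f) 2): the authenticated requester is checked against the
        # reverse identification policy before the real ID is released
        if not requester_authorized:
            raise PermissionError("reverse identification not permitted")
        return self._reverse[pseudonym]

ps = PseudonymizationService()
psn = ps.pseudonymize("HOSP-0042")                      # steps b) 1)-3)
print(psn)
print(ps.re_identify(psn, requester_authorized=True))   # steps f) 1)-4)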
Other examples/remarks
Online counselling services over the web (care provided to an individual), where the same individual is seen time after time.
A person well-known to the public presents themselves to a healthcare provider for clinical care. Wanting
to assure that the episode of care and follow-up treatment remain confidential, the patient requests
pseudonymized identifiers be used across the encounters.
A.3.2.1 General
The clinical trials encompass a very wide range of situations. The clinical trials of drugs to gather
data for submission to the FDA are subject to many procedural regulations. There are also trials of
new equipment, e.g. ROC studies and trials of new procedures. The pseudonymization requirements
are driven by more than just privacy regulations. For scientific reasons, there can be a need for
pseudonymization of purely internal data in order to provide a suitable double blind analysis
environment.
Figure A.1 indicates the various locations where the data might be modified to add clinical trial
identification attributes (CTI) and/or remove attributes for pseudonymization.
Unlike the teaching files, there are usually multiple parties involved in the clinical trial process.
a) The clinical trial sponsor, who establishes the scientific requirements for the trial. This usually
establishes the kinds of data that should be preserved, data that should be blinded and data that
should be removed for scientific analysis purposes.
b) The clinical trial coordinating centre, which coordinates, gathers, and prepares the data. This
centre may also provide the pseudonymization of data, depending upon the procedures chosen and
the agreements made with the clinical trial sites.
c) Multiple clinical trial sites, where the actual clinical activity takes place. They pseudonymize the
data in accordance with both their privacy policies and the needs of the clinical trial sponsor and in
cooperation with the clinical trial coordinating centre.
d) Other reviewers, e.g. the FDA, who review the results of the clinical trial.
The trials may need reversibility so that actual patients can be notified of findings that are important
to the patient’s treatment. This can be implemented in various ways. The reviewer who makes the
finding needs to be able to report to someone (e.g. the clinical trial agent) that “patient X in clinical trial Y
should be notified of the finding ...”
It is very difficult to make any specific statement in advance about what must be blinded or how. The range
of topics that might be under investigation is very wide, and information about those topics often cannot
be blinded. Each clinical trial needs to establish its own blinding and pseudonymization rules, although
the work involved in doing this may be reduced by starting with the rules for similar previous trials.
There are some unique regulatory concerns with data gathering for some clinical trials. These require
complete audit trails and documentation of all data modifications. This includes modifications made for
de-identification purposes. These regulatory requirements are a significant factor in the selection of
de-identification techniques. Figure A.2 shows the use case diagram for the clinical trial flow.
Scenarios taken as example (in this group)
Submit data to clinical trials. This scenario describes the single source data collection for clinical care
and clinical research study data resources.
Actors: System user (e.g. investigator, member of the care team), investigator health information
system, clinical study information resource, care provider health information system (HIS).
Pre-conditions: Patient is in a clinical trial, investigator has data collection system that meets the
needs of both the clinical trial and the external information resource, local information system
available, patient consent obtained, a step of the clinical trial is concluded.
Post-conditions: Clinical study information resource has all relevant data from any patient encounter
for a patient participating in the study. External information resource (e.g. HIS, EHR) has all relevant
data from any patient encounter.
Workflow/events/actions
The process flow is as follows.
— Member of the care team authenticates to the HC system.
— Health information system initiates audit trail using consistent time.
— Clinician enters data into data collection system.
— System anonymizes or pseudonymizes data.
— System transmits relevant data to clinical study information resource.
— Clinical study information resource receives data.
— Data collection HIS posts clinical care information to external information resource and clinical
trial investigator reviews and verifies (via eSignature or some verification mechanism) that these
data accurately reflect the source data required for the trial.
that meets the needs of both the research and the local health information system; local information
system available; patient consent obtained as required by local jurisdiction; a step of patient encounter
is concluded.
Post-conditions: Research information resource has all relevant pseudonymized and privacy protected
data from any patient encounter from patients within the cohort population. External information
resource (e.g. HIS, EHR) has all relevant data from any patient encounter.
Workflow/events/actions
The process flow is as follows.
— Member of the care team authenticates to the HC system.
— Health information system initiates audit trail using consistent time.
— Clinician enters data into data collection system.
— System generates aggregate variables for privacy protection.
— System checks for uniquely identifiable characteristics in the data (e.g. rare diagnoses) or combined
data variables (a minimal check of this kind is sketched after this list).
— System anonymizes or pseudonymizes data.
— System transmits relevant data to research information resource.
— Research information resource receives data.
— Data collection HIS posts clinical care information to local HIS.
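A minimal sketch of the uniqueness check mentioned in the list above is given here; the quasi-identifier fields and the threshold k are assumptions made for illustration.

from collections import Counter

# Sketch of the "check for uniquely identifiable characteristics" step:
# count how many records share each quasi-identifier combination and
# flag combinations that occur fewer than k times.
def rare_combinations(records, quasi_identifiers, k=5):
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return [combo for combo, n in counts.items() if n < k]

records = [
    {"age_group": "40-49", "diagnosis": "influenza", "region": "North"},
    {"age_group": "40-49", "diagnosis": "influenza", "region": "North"},
    {"age_group": "70-79", "diagnosis": "rare disorder X", "region": "South"},
]
print(rare_combinations(records, ["age_group", "diagnosis", "region"], k=2))
# the single record with the rare diagnosis is flagged for aggregation or suppression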
Other examples/remarks
Generation of teaching data:
Comparative quality indicator reporting: Encounter and discharge data are submitted by healthcare
providers to a research database. Patient identifiers are pseudonymized through a pseudonymization
service, as are identifiable grouping and risk adjustment data. Appropriate aggregations such as length
of stay information are applied to further protect the research database from inference attacks. Provider
identities are pseudonymized to protect the identity of practitioners and healthcare organizations.
Peer review: A new surgery technique is developed. Physicians use a pseudonymization service to
submit case reports and adverse events to a common registry. This peer review registry is used to assess
trends and compare experiences across multiple case mixes and co-morbidities. The confidentiality
of the patients and practitioners is protected through the pseudonymization services provided by a
pseudonymization service. This enables the patient data to be tracked across these providers to assess
the full episode of care.
In assessing the cases in the study, it is found that a patient, having sought treatment from multiple
providers, is at risk for a complication of the surgery. A case is made for re-identification to be able to
contact the patient for follow-up assessment and treatment.
Actors: System user (e.g. public health official, member of the care team), public health information
system, clinical information resource, public health information resource.
Pre-conditions: Filter mechanism criteria for data exchange have been established, event detection
algorithms have been defined, patient is correctly identified, provider/information source is correctly
identified, pseudonymization, de-identification and re-identification services are available.
Post-conditions: Data are submitted from multiple clinical information resources to the public health
information system, data are received by the public health information system, the public health
information system supports functions relevant to the public health event detection, i.e. the public
health information system monitors, analyses, detects, investigates, notifies, alerts, reports and
communicates data related to a public health threat.
Workflow/events/actions
The process flow is as follows.
— Populate public health information system.
— Clinical information resource supports entry of patient visit data into EMR.
— Clinical information resource’s EMR supports the public health information system data needs.
— Clinical information resource initiates audit trail using consistent time.
— Clinical information resource reviews and verifies (via eSignature or other verification
mechanism) that these data accurately reflect the source data.
— Clinical information resource selects information to submit (transmit) to the public health
information system based upon filter criteria.
— Clinical information resource invokes service to pseudonymize data.
— Clinical information resource provides (transmits) relevant data to the public health information
system through secured messaging and transmission.
— Clinical information resource receives acknowledgement of receipt from the public health
information system.
— Support detection of a public health threat event.
— Provider receives notification from the public health information system of a suspected pattern
through secure electronic means and via telephone.
— Clinical information resource provides additional data to the public health information system
as needed.
— Provider receives health alert regarding the detected event through secure electronic means
and via telephone.
— Clinical information resource receives case-specific alert notifications from the public health
information system for any pertinent patient follow-up.
— Support on-going monitoring of the event.
— Clinical information resource captures and provides additional outbreak management data to the
public health information system, particularly new and early diagnosed cases or suspect cases.
— Authenticated clinical information resource invokes re-identification of patient identifiers
through pseudonymization service to notify and provide follow-up treatment to patient and to
request further screening of patient family/contacts as determined by outbreak management
protocols.
— Clinical information resource transmits daily data on utilization of resources to the public
health information system.
— Clinical information resource receives updates regarding the outbreak from the public health
information system through secure electronic means.
— Support rapid response management of the event.
— Clinical information resource receives recommendations/orders to conduct response-
related activities in accordance with outbreak management protocols from the public health
information system through secure electronic means.
— Clinical information resource sends acknowledgement of receipt of recommendations/orders
to conduct response-related activities in accordance with outbreak management protocols
from the biosurveillance information system through secure electronic means.
Other examples/remarks
Once a week, general physician systems send influenza and allergy data to a central national repository.
Before it reaches the repository, patient and physician identities are pseudonymized through a
pseudonymization service, and the location information of the patient is aggregated into a larger area.
The central repository is used for influenza and allergy alerts and has no need for identifiable data.
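The aggregation of patient location into a larger area can be as simple as the following sketch; the generalization rule (truncating the postal code) is an assumption made for illustration, and the agreed geography would be defined by the reporting scheme.

# Sketch of location aggregation before submission to the central repository:
# a full postal code is generalized to a coarser area. The truncation rule is
# an illustrative assumption only.
def generalize_location(postal_code):
    return postal_code[:2] + "xx"

print(generalize_location("9052"))   # -> "90xx"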
Other examples/remarks
A voluntary reporting system is used to generate a database in support of patient safety.
Pseudonymization is used to protect the identity of both the patient and the provider submitting
the data through the use of a pseudonymization service. Follow-up communications and requests
for additional details on the submitted events are facilitated through the pseudonymization service
without risk of identification of the patient or the provider.
A.3.8.1 General
Classroom teaching files are acquired by selecting interesting cases of real patients and then modifying
the records to remove identifying and extraneous information. These can be generated using an
anonymization process, but for living patients, there may be a need to update the records with new
information at a later date. The data should be pseudonymized in order to preserve the relationships
between the real patient and the teaching file so that these updates can be added to the teaching file.
The teaching files may be made available only to students at the generating facility, or they may be
published for use by students around the world. In the former case, the rules for pseudonymization may
permit greater detail, but for the latter use, the published medical records should be pruned to only the
essentials for the tutorial purpose.
A more restricted but very common need is the creation of personal teaching files by students to
capture cases that they found personally interesting. Privacy regulations prohibit the students from
taking copies of those records.
The key to pseudonymization is an assignment of name and patient ID for tutorial purposes. The typical
phrasing in a tutorial report is something like “Mr. Smith is a 50-year-old male with a history of ....” The
pseudonymization should go through all of the medical records changing the name to Smith, assigning a
new birth date (consistent with the relevant age range), and removing all other identifying information
that is irrelevant to the tutorial purposes. The resulting new medical records can then be published as
a teaching file.
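A minimal sketch of this teaching-file transformation, with illustrative field names, is the following; it assigns the tutorial name, derives a birth date consistent with the age range and drops the remaining identifiers. A secure database relating the real patient to the pseudonymous identity, as described below, would be kept separately.

import random
from datetime import date

# Sketch of teaching-file pseudonymization as described above: assign a
# tutorial name, generate a birth date consistent with the patient's age
# and drop other identifiers. Field names are illustrative assumptions.
def make_teaching_record(record, tutorial_name="Smith"):
    year = date.today().year - record["age"]
    new_birth_date = date(year, random.randint(1, 12), random.randint(1, 28))
    return {
        "name": tutorial_name,
        "birth_date": new_birth_date.isoformat(),
        "history": record["history"],   # tutorial-relevant content is kept
        # address, patient ID and other identifiers are deliberately dropped
    }

original = {"name": "…", "patient_id": "…", "age": 50, "address": "…",
            "history": "50-year-old male with a history of ..."}
print(make_teaching_record(original))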
The generation of teaching files often requires creation of a secure database to maintain the relationship
between the actual patient identity and the pseudonymous identity. The medical records are often
on multiple systems, so the database needs to be either accessible or movable between the multiple
systems. It also needs to be in a format that is understood by a variety of systems.
The generation of pseudonymous data will have both local rules, for such things as generation of
pseudonymous IDs, and generic rules for such things as generation of blinded dates.
A clinician will need to establish the rules for what data attributes should be preserved, pseudonymized
or removed. These rules need to be consistently applied by multiple systems at multiple times in the
generation of the pseudonymous records.
Annex B
(informative)
B.1 General
The development of a method for privacy impact assessment is outside the scope of this document.
A privacy impact assessment often results in a confidentiality risk analysis. However,
the following subclauses are intended to increase the awareness of those who will have to engage in
privacy impact assessment. From a generic presentation of the issues, it should be possible to derive a
number of requirements for privacy impact assessment design.
This document contains a model that takes into consideration three assurance levels. The levels have
been chosen as a function of the complexity of re-identification of the data.
This document can, however, formulate a number of requirements that the re-identification risk
assessment method should take into account for its design.
The common criteria (see ISO/IEC 15408-2) contain an informative annex on privacy (FPR). This can be
used as a starting point but is more focused on the usage of resources.
The target of evaluation (TOE) security functions contain specifications that may be usable but do not
incorporate the notion of levels of anonymity.
The remainder of this annex gives an overview of the risk assessment factors. The following text is
adapted from Deliverable D2.1 with permission from the authors[37].
A key element in privacy risk assessment is to assess the effect of observational data that can be
obtained by an attacker. Observational data can consist of events recorded by the attacker, but can also
consist of information that can be legally obtained by the attacker. It could be that the attacker is a
generic user of the system who has, either by accident or unauthorized effort, obtained extra data with
which he should not have come into contact in the normal line of his duty.
It is important to note that this information is usually outside the scope of the data model of an
application. Assumptions about observational data should however be made in order to assess the
privacy of data contained in the system.
In order to create a methodology for privacy risk assessment, a formalized way of describing privacy
threat and the risk of re-identification is needed.
A generic model of re-identification attacks, shown in its highest level of abstraction in Figure B.1,
consists of three major entities.
Although it might look simple and straightforward in that form, the model is refined in this clause
to such a level of complexity that it encompasses all real-life aspects of re-identification and privacy
protection.
The security model for modern public key cryptography, for example, is based on the fact that the attacker only has a
limited amount of money (computing power) at his disposal.
Basically, there are two characteristics that define the threat level of an attacker as shown in Figure B.2.
There is the goal of the attack (what information is an attacker after?) and there are the means at his
disposal. The latter is linked with the “value” that the information that could be recovered from the
anonymous database has for the attacker. It is clear that if the sensitive information enclosed in the
anonymous data can lead to large gain for an attacker (e.g. medical records for an insurance company),
he will be prepared to invest more into the re-identification process.
Next to the level of determination of an attacker, it is important to include his “goals” into the threat
model. Privacy protection is about protecting personal information and not simply about protecting the
identity linked to a specific database record. This subtle difference is reflected in the three different
attacker goals that are specified in the model. They are the following:
a) re-identification (full):
1) identify to whom a specific anonymous record belongs;
2) identify which anonymous record belongs to a certain person;
b) information recovery (or partial re-identification);
c) database membership:
1) Is someone listed in the database?
2) Is someone not listed in the database?
Full re-identification as a concept is well known. It amounts to (partially) converting an anonymous
database to its identified equivalent. In the most general case, an attacker will try to re-identify a
complete database. In practice, however, this is rarely the case, and an attacker will either want to find
out to whom a specific, interesting anonymous record belongs (e.g. find out to whom a record, listing
high income, belongs), or will want to retrieve all information about a specific person (e.g. an insurance
broker trying to figure out if someone has a heart condition).
The two other goals (partial re-identification and database membership) are not often discussed,
because the underlying analysis is quite complex. An attacker does not necessarily need to re-identify complete
records to obtain the needed information. Sometimes it is sufficient for an attacker to recover only a
single characteristic of a person from the anonymous database without ever knowing which records
belong to that person.
Finally, in some situations, the target information is not listed within the database itself, but within
the mere membership of a database. Being a member of a database can lead to private (sensitive)
information, for example, HIV patient databases. Therefore, the goal of an attacker could be to merely
determine if the people on his list are also in the anonymous database or not; he does not need to re-
identify each anonymous record for that.
It is clear that the methods used by an attacker will depend on the goals he is trying to achieve. These
attack strategies are closely related and although they fit within the same model, they differ strongly
enough to be elaborated separately at some points in the text.
Full and partial re-identification as defined in this document are obviously closely related. Partial re-
identification is an intermediate stage between recovery of all information (on a particular subject)
within the anonymous database and no recovery at all (see Figure B.3). In other words, it is the situation
in which full re-identification fails, but in which the processes (algorithms) of re-identification used still
succeed in recovering some information from the anonymous database.
For all forms of re-identification, the attacker will mainly follow the same procedure. Based on his
observations and the content of the anonymous database, he will list for each identifier (nom-ID) in the
observations database, the anonymous identifiers (anon-ID) that could correspond with it.
The link between observations data and anonymous data can be made in numerous ways and will be
situation-specific (see Figure B.4). It is, however, important to add some level of classification in the
overall model, in order to understand the underlying mechanisms better.
Figure B.4 shows that the link between nom-ID and anon-ID can be made directly through the variables
listed in respective databases [linkage (1)], or through an intermediate step [linkage (2)]. In the first
case, the data listed in the anonymous database corresponds directly with the observed data. This
means that some of the variables in the anonymous database are observable by the attacker as a result
of the database linkage. Through these shared variables, the attacker can determine if an anonymous
record could correspond with an identifiable observations record.
In the second case, an intermediate step has to be taken in order to be able to link the two information
sources. The observations are not listed literally in the anonymous database, but can be inferred from
the variables present in the database. Note that, for the reasoning in this document, this situation is
equivalent to inferring from the observed database in order to link with the anonymous data.
The implementation of the linkage and inference algorithms themselves is usually data- and application-
specific. However, several algorithms are able to deal with general data-types. At a higher abstraction
level however, apart from the actual implementation, there is the important aspect of the “certainty”
attached to a constructed link.
Both the linkage and inference algorithm(s) are not necessarily based on pure facts. A link made by an
attacker based on observations is not necessarily a correct one. Depending on the assumptions that an
attacker has made, the completeness and certainty of his observations, the complexity and fuzziness of
the anonymous data, certain links will be more likely than others. Therefore, the attacker could need to
associate a probability to the identifier link (of course, certainty gets a probability of 1).
Such inference is easily illustrated with an example. Imagine that the anonymous database only lists salaries
and that an attacker cannot obtain any direct salary information. He could, however, try to infer salary from
observable variables such as job function, size of house or type of car. Clearly, an attacker can never be
sure; his assumptions are only true with a certain probability.
Key
O observational database
A anonymous database
x data subjects
When there is no unique link found between a direct identifier and an anonymous identifier, this does
not mean that there is no information disclosed at all. When a set of direct identifiers can be linked
with multiple identifiers from the anonymous database, the common information enclosed among the
anonymous identifiers can be associated with these direct identifiers (that is, if one is sure that all
observed subjects are listed in the anonymous database; see Figure B.5). It is important to understand
the full extent of this information leakage. If the attacker is only interested in the value of these common
variables, then he has achieved his goal, and his attack has been successful.
When applying the described re-identification or information recovery method, the table with possible
links, denoting which anonymous identifiers could correspond with which direct identifiers, should be
updated and re-evaluated every time partial information is recovered, or a subject is fully re-identified.
When applying the simple linkage algorithm which links records by evaluating the variables listed in
the anonymous and observations database directly [Figure B.6, linkage (1)], the following tables of
correspondence can be composed (illustrated by the lines between the two record sets in Figure B.6).
Table B.1 and Table B.2 represent how a direct identifier could correspond with an anonymous
identifier and vice-versa under this particular linkage rule. If more than one algorithm is used, they
should be evaluated together, which means that construction of the corresponding tables can become
quite complex.
It can immediately be seen that record A1 can only belong to Claire, which means that the anonymous
subject A1 is completely re-identified, and the attacker now knows that Claire has values (A, B) for the
two unobservable variables. Taking this into account, the attacker could update the linkage tables as
explained earlier, resulting in the following.
From the remaining un-identified records, an attacker cannot clearly identify any more persons. He can,
however, say that there is a 50 % chance that record A2 belongs to Alice and a 50 % chance that it belongs
to Bob. In this case, this gives little extra knowledge on the data subjects; in a realistic (large) database,
however, such a situation can reveal useful information for the attacker.
Although no full re-identification is possible, there is still some information leakage, because
anonymous record A2 and A3 have the same value for variable 3. From that, the attacker can conclude
that both Alice and Bob have value B for variable 3. For the remaining variable 4, there is no information
retrieved on Alice or Bob.
Finally, if the information that an attacker wanted to recover about Alice, Bob and Claire was listed in
variable 3 only, then the attacker would have fully succeeded in his information recovery attempt. If
variable 4 had contained important information, he would have only succeeded partially. Finally, note
the fact that the value of variable 2 is now also known for Bob, although it could not be observed, thus
again, more information was recovered.
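The reasoning in this example can be sketched as follows; the concrete variable values are assumptions chosen to be consistent with the text (Claire matches only A1, while Alice and Bob each match A2 and A3, which share the value B for variable 3), since the linkage tables themselves are not reproduced here.

# Informative sketch of the linkage reasoning in the example above.
# Variables 1-2 are observable (linkage) variables; variables 3-4 are not.
observations = {                     # nom-ID -> observed variables
    "Alice":  {"var1": 1, "var2": "x"},
    "Bob":    {"var1": 1},           # var2 not observable for Bob
    "Claire": {"var1": 2, "var2": "y"},
}
anonymous = {                        # anon-ID -> all variables in the database
    "A1": {"var1": 2, "var2": "y", "var3": "A", "var4": "B"},
    "A2": {"var1": 1, "var2": "x", "var3": "B", "var4": "C"},
    "A3": {"var1": 1, "var2": "x", "var3": "B", "var4": "D"},
}

def candidates(obs):
    """anon-IDs whose shared variables do not contradict the observation."""
    return {a for a, rec in anonymous.items()
            if all(rec.get(k) == v for k, v in obs.items())}

links = {person: candidates(obs) for person, obs in observations.items()}
print(links)   # Claire -> {'A1'}: unique match, fully re-identified

# Information recovery for the non-unique links: values shared by all
# remaining candidates are disclosed even without full re-identification.
for person, cands in links.items():
    if len(cands) > 1:
        shared = {k: v for k, v in anonymous[next(iter(cands))].items()
                  if all(anonymous[c].get(k) == v for c in cands)}
        print(person, "->", shared)   # e.g. var3 = 'B' for Alice and Bob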
The illustrated model and procedures do not only apply to such simple data structures as the one used
in the example. Database records containing continuous variables, time-dependent information or a
mixture of several types of data also fit within the presented model. However, the implementation of
the corresponding linkage rules is much more complex.
Of course, there are some borderline cases. For instance, the observations could have a high uncertainty.
The anonymous database then serves as a verification mechanism.
subjects (one member of the A\O subset, another member of the O\A subset). A probability will have to
be associated with the link, a probability which will then be translated into a probability of membership.
The example in Figure B.8 illustrates this. Consider an anonymous database with three records. An
attacker wants to know if Claire is a member of that database and using the observations and a simple
linkage rule, there is a unique match between “Claire” and “anon-ID 1”. However, this does not mean that
the anonymous record belongs to “Claire”; as demonstrated in the figure, it was actually “Dave” who
was listed in the database. If the attacker has attributed a large probability to the unique link based on
his linkage rules (here the comparison of 2 variables), then he might draw the incorrect conclusion that
“anonymous identifier 1” is “Claire”.
Figure B.9 represents a process flow for pseudonymization of trial subjects. The steps are as follows.
a) The clinical trial sponsor has a clinical trial ID. The details of the trial are not normally revealed
to candidates or healthcare providers. That would defeat the double blind purpose. Only enough
information to assess the candidate, the appropriateness of the trial and the risks is disclosed.
b) The clinical trial site uses that clinical trial ID to identify a candidate for the trial. This candidate is assigned a
number by the clinical trial sponsor (or by the clinical trial manager). This is often just a sequence
number of requests. The trial site has not disclosed anything more than an interest at this point.
c) The clinical trial site prepares a brief precis of the candidate medical record, using the clinical
trial number and subject number instead of identifiers. The precis includes only that information
needed for candidate evaluation.
d) The clinical trial sponsor (or manager) determines that the candidate is suitable and approves the
trial to proceed.
e) At various times, the clinical trial site sends data to the clinical trial manager. This information is
pseudonymized by removing unnecessary identifiers, removing some data and using the clinical
trial ID and subject ID from the original request. All data modification, substitution and deletion
are tracked in a detailed audit log, as is the data submission. This log is kept at the trial site.
f) These data are securely sent to the clinical trial manager. The receipt is logged and audited.
g) These data are further pseudonymized by further data removal. (The clinical trial site is aware of
more details regarding the trial, and can eliminate more data.) Also, data that could identify the
trial site are removed. This is audited with full data recovery.
h) This further pseudonymized data is analysed and aggregated to evaluate the trial results.
i) The aggregate data and supporting pseudonymized data are securely provided to the clinical trial
sponsor, drug reviewers, etc. This transmission is fully audited.
NOTE 1 There is no separate pseudonymization service; pseudonymization is performed internally by the systems involved.
NOTE 2 There is no use of crypto techniques for generating pseudonyms. The use of arbitrary sequence
numbers assigned by the clinical trial sponsor is more robust. (Crypto-derived pseudonyms are highly vulnerable
to dictionary attack unless very carefully implemented. The list of names is rather short.)
NOTE 3 Data recovery requires multiple steps. The audit trails and various site records are used, rather than
embedding original data as an encrypted side payload. To track back to the original person, it is necessary to first
visit the clinical trial manager, then examine audit logs, then visit the clinical trial site to finally identify the person.
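A minimal sketch of steps b) and e), using arbitrary sequence numbers as subject identifiers (see NOTE 2) and a site-held audit log of the modifications made (see NOTE 3), is the following; the field names are illustrative assumptions.

import itertools
from datetime import datetime, timezone

# Sketch of steps b) and e): subject numbers are arbitrary sequence numbers
# and every modification made for pseudonymization is written to an audit
# log kept at the trial site. Field names are illustrative assumptions.
subject_counter = itertools.count(1)
audit_log = []

def enrol_subject(trial_id):
    return "{}-S{:04d}".format(trial_id, next(subject_counter))

def pseudonymize_for_trial(record, trial_id, subject_id):
    removed = {k: record[k] for k in ("name", "patient_id") if k in record}
    out = {k: v for k, v in record.items() if k not in removed}
    out.update({"trial_id": trial_id, "subject_id": subject_id})
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "subject_id": subject_id,
        "removed_fields": sorted(removed),    # which fields were modified
        "action": "pseudonymized for submission",
    })
    return out

sid = enrol_subject("TRIAL-Y")
print(pseudonymize_for_trial({"name": "…", "patient_id": "…", "lab_result": 4.2},
                             "TRIAL-Y", sid))
print(audit_log)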
Annex C
(informative)
These entities can be complemented by, for example, authentication services, key escrow services or
other services required by the process model (see Figure C.1).
Key
ID service identification service
IDAT identifying data
PID personal identifier
PSN pseudonym
Figure C.2 shows the typical data flows between the entities in the workflow model. These flows also identify the
message types that will be required for passing data or signalling the status of the operation through
acknowledgement messages. The extraction of the data and integration of the data is outside the scope
of this model and are handled by the source application and target application respectively.
The flow events include the following:
— request of data from the data target to be included in the identification and pseudonymization
requests and associated acknowledgement;
— request of domain patient identifiers for communication with the pseudonymization service and
associated receipt acknowledgement;
— transmission of the pseudonymization request;
— receipt acknowledgement by the pseudonymization services of the pseudonymization request;
— submit pseudonymized data to data target; when the pseudonymization process has been completed,
the data are sent to the target application(s);
— delivery acknowledgement: the data target acknowledges the receipt of the data (this process may
also include the result of checks, e.g. on the validity of the data format);
— if round trip acknowledgement is required, the pseudonymization service transmits the
acknowledgement to the person identification service that transmits an acknowledgement to the
data source.
In the case where exceptional events occur during the processing by the pseudonymization service,
exception codes are returned (e.g. data malformed, failed authentication of data source) to the source.
The structuring can be done by tagging the data elements, by creating a table with vectors to the data
elements or by putting the data elements in a pre-defined location (see Figure C.3).
The following preparations can be distinguished in accordance with the conceptual model on the levels
of anonymity.
— Data elements that will be used for linking, grouping, anonymous searching, matching, etc. shall be
indicated and marked in such a way that the pseudonymization service knows where to find them
and how to handle them.
— Depending on the privacy policy, elements that need specific transformations, e.g. changing absolute
time references into relative time references or dates of birth into age groups, need similar marking.
— Identifying elements that, according to the privacy policy, are not needed in the further processing
in the target applications, shall be discarded.
— The anonymous part of the raw personal data is put into the payload part of the personal data
element.
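A possible structuring of the raw personal data along the lines of the preparations above is sketched below; the tag names and field names are illustrative assumptions.

# Informative sketch: the raw personal data structured by tagging elements,
# separating a marked header from the anonymous payload. Tag and field
# names ("pseudonymize", "transform:...", "discard") are assumptions.
structured_record = {
    "header": [
        {"element": "patient_id",    "value": "HOSP-0042",  "tag": "pseudonymize"},
        {"element": "date_of_birth", "value": "1966-04-03", "tag": "transform:age_group"},
        {"element": "home_address",  "value": "…",          "tag": "discard"},
    ],
    "payload": {
        # the anonymous part of the raw personal data, not inspected by the
        # pseudonymization service in the preferred mode of operation
        "diagnosis_code": "J10",
        "lab_results": [4.2, 3.9],
    },
}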
Figure C.4 shows the basic steps of a pseudonymization process which consist of the following.
a) In healthcare, the security of devices, in support of the confidentiality of patient data, is required
for privacy protection. For patients, one consideration involves implanted medical devices. Identifiable
data on such a device can be directly associated with the patient, as can data on other medical and
personal devices (e.g. respiratory assistive devices). As such, a device and its data may be able to
identify a person. Healthcare devices assigned to a healthcare professional or employee shall also be
considered in the identification risk assessment, as they can identify the provider or organization, and
hence the patient.
b) The pseudonymization service parses the header and performs the tasks laid down in its policy.
This comprises the pseudonymization of data items, calculation of relative dates, removal of data
items and possible encryption of specific data items where this is defined in the privacy protection
policy. Consequently, the content of the payload is not visible to the pseudonymization service.
This is the preferable way of operation for the pseudonymization service. The policy can, however,
define otherwise, such that the content is parsed as well. Typically, the parsing is done to check for
unwanted content (e.g. identifiers in the data items). All checking is done in a “stateless” way. The
pseudonymization service does not store data it has previously processed, and therefore, it cannot
compare or check with that data. Only the current session can be taken into account.
NOTE The processing of pseudonyms and anonymization of payload data can be conducted through
separate entities and services.
c) Once the processing is done, the pseudonymization service sends the pseudonymized data to the
data repository over a secure channel.
d) When the repository application receives the data, it applies its business rules to it in order to
incorporate the data into the repository. This may include rules such as checking for double entries,
checking for missing entries, required acknowledgement procedures, etc.
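The header processing of step b) can be sketched as follows; the tag semantics, the reference date and the pseudonym function are illustrative assumptions (mirroring the structure sketched above), and the payload is passed through untouched.

from datetime import date

# Sketch of step b): the pseudonymization service processes only the tagged
# header, statelessly, and leaves the payload untouched.
record = {
    "header": [
        {"element": "patient_id",    "value": "HOSP-0042",  "tag": "pseudonymize"},
        {"element": "date_of_birth", "value": "1966-04-03", "tag": "transform:age_group"},
        {"element": "home_address",  "value": "…",          "tag": "discard"},
    ],
    "payload": {"diagnosis_code": "J10"},
}

def process_header(rec, pseudonymize):
    reference = date(2016, 1, 1)                 # assumed reference date
    header = []
    for item in rec["header"]:
        if item["tag"] == "pseudonymize":
            header.append({item["element"]: pseudonymize(item["value"])})
        elif item["tag"] == "transform:age_group":
            age = (reference - date.fromisoformat(item["value"])).days // 365
            header.append({"age_group": "{}-{}".format((age // 10) * 10,
                                                       (age // 10) * 10 + 9)})
        # "discard"-tagged elements are simply dropped
    return {"header": header, "payload": rec["payload"]}   # payload untouched

print(process_header(record, lambda v: "PSN-" + v[-4:]))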
Indeed, a pseudonymization service or trusted third party performing a pseudonymizing
transformation is necessary for trustworthy implementation of the pseudonymization technique across
multiple entities. There are three main reasons for this.
— As one communicating party does not always trust the other, trust can be established indirectly
because the two parties trust a third, independent party. Both parties are bound by a code of conduct,
as specified in a privacy and security policy agreement they agree on with the pseudonymization
service.
— Use of a pseudonymization service offers the only reliable protection against several types of attack
on the pseudonymization process.
— Complementary privacy-enhancing technologies (PETs) and data processing features can easily
be implemented.
EXAMPLE Controlled reversibility without endangering privacy, interactive databases.
Annex D
(informative)
Interoperability of pseudonymization services can be defined on several levels. A solution may, for
instance, make use of an intermediary pseudonymization service to perform the pseudonymization, or
may be an in-house module added to extraction software.
Pseudonymization services may be batch-oriented or may be built on the principle of pseudonymized
access of live databases.
A common service of many pseudonymization services consists of cryptographic transformations of
identifying data.
This document, however, only concentrates on the core mechanisms in which a pseudonymization
service is used.
IHE has written a handbook that describes the administrative and technical process for developing a
situation specific de-identification process. The various steps involved in establishing the situation,
setting requirements, creating the process and monitoring the ongoing process are described.
To assist in defining the details of de-identification, a menu of algorithmic categories is provided and
a list of generic kinds of data defined. A large matrix is provided that indicates which algorithms are
likely to be appropriate for de-identification of different kinds of data.
The procedural guidance and algorithm guidance is intended to reduce the effort involved in creating a
situation specific de-identification process.
Interoperability should cover the following elements.
— One or more mechanisms for exchanging the data between the entities in the model (source,
pseudonymization service, target) and for controlling the operation. This is less of an issue and
existing protocols can be used, such as HTML. Where needed, it is possible to design converters
between formats as part of pre- or post-processing.
— Choice of cryptographic algorithms. Pseudonymization based on cryptographic algorithms
will consist of a chain of basic cryptographic and related algorithms: hashing, random number
generators, key generation, encryption and basic bit-string logic functions (a minimal illustrative
sketch of such a chain follows this list).
— Key exchange issues.
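As an informative illustration of such a chain, the following sketch derives a domain-specific pseudonym with a keyed hash (HMAC-SHA-256) over a secret key obtained from a random number generator; key generation, exchange and storage, which are exactly the interoperability issues listed above, are not addressed, and all names and parameters are assumptions made for the example. Keyed hashing is used because unkeyed hash-derived pseudonyms over a small identifier space are vulnerable to dictionary attack.

import hashlib
import hmac
import secrets

# Minimal sketch of one possible cryptographic chain for pseudonym
# generation: a secret key from a random number generator combined with a
# keyed hash (HMAC-SHA-256). Key handling is deliberately not shown.
key = secrets.token_bytes(32)          # key generation (held by the service)

def pseudonym(identifier, domain):
    """Derive a domain-specific pseudonym from an identifying value."""
    msg = (domain + "|" + identifier).encode("utf-8")
    return hmac.new(key, msg, hashlib.sha256).hexdigest()[:16]

print(pseudonym("HOSP-0042", "research-project-A"))
# the same identifier yields a different pseudonym in another domain
print(pseudonym("HOSP-0042", "research-project-B"))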
Since there are many different contexts in which pseudonymization can be deployed, it is important to
restrict the context in which interoperability will be defined.
The common service of most pseudonymization services consists of cryptographic transformations of
identifying data.
For two independent pseudonymization service providers to be interoperable, either of the following
should be possible:
a) to integrate each other’s data: data from the same data subject processed by any of the service
providers should be linkable to each other without direct re-identification of the data subject;
b) to convert the pseudonymization results from one or more service providers in a controlled way
without direct re-identification of the data subject.
Annex E
(informative)
E.1 General
It is important to complement the technical measures with appropriate non-technical measures. Such
non-technical measures are generally expressed through policies, agreements and codes of conduct.
— The controller of the data is responsible for re-identification of the patient, and may, as permitted by
local jurisdiction, further validate the re-identification request.
Annex F
(informative)
Genetic information
There are two schools of thought regarding genetic information. One point of view is that the genetic
information is different from other medical information. The second point of view considers the genetic
data in the same way as any other medical information. The point of view that genetic data are different
from other medical data has been coined “genetic exceptionalism”.
In 2004, the European Directorate C (Science and Society), unit C3 (Ethics and Science) issued
twenty-five recommendations on the ethical, legal and social implications of genetic testing. The
recommendations are as follows.
— “Genetic exceptionalism” should be avoided, internationally, in the context of the EU and at the level
of its member states. However, the public perception that genetic testing is different needs to be
acknowledged and addressed.
— All medical data, including genetic data, should satisfy equally high standards of quality and
confidentiality.
On confidentiality, privacy and autonomy, the recommendations are as follows.
a) Genetic data of importance in a clinical and/or family context should receive the same level of
protection as other comparably sensitive medical data.
b) The relevance for other family members should be addressed.
c) The importance of a patient’s right to know or not to know should be recognized and mechanisms
incorporated into professional practice that respect this.
d) In the context of genetic testing, encompassing information provisions, counselling, informed consent
procedures and communication of test results, practices should be established to meet this need.
There are of course other opinions that do not agree with this vision. One argument, for instance, is that
the information content of genetic information collected in the context of population research is not
exactly known and that the information content of data in databanks or human tissue repositories may
be more sensitive than currently can be assessed.
Other groups do not follow this line of thought, such as the so-called “Montreux declaration” made at
the 27th International Conference of Data Protection and Privacy Commissioners in September 2005,
where, in the preamble, it is stated that: “Aware that the fast increase in knowledge in the field of
genetics may make human DNA the most sensitive personal data of all; Aware also that this acceleration
in knowledge raises the importance of adequate legal protection and privacy”.
Nevertheless, existing legislation, guidelines and publications on related subject matter that often
deals with the broader context of population genetic research and human tissue repositories shall be
considered where genetic information is part of the dataset to be pseudonymized. In generic terms,
data resources may be classified as either the identification of disease susceptibility genes or diagnostic
biomarkers.
Bibliography
[1] ISO 7498-2, Information processing systems — Open Systems Interconnection — Basic Reference
Model — Part 2: Security Architecture
[2] ISO/TR 21089, Health informatics — Trusted end-to-end information flows
[3] ISO/TS 22220, Health informatics — Identification of subjects of health care
[4] ISO/IEC 2382, Information technology — Vocabulary
[5] ISO/IEC 2382-8, Information technology — Vocabulary — Part 8: Security
[6] ISO/IEC 8825-1, Information technology — ASN.1 encoding rules: Specification of Basic Encoding
Rules (BER), Canonical Encoding Rules (CER) and Distinguished Encoding Rules (DER) — Part 1
[7] ISO/IEC 15408-2, Information technology — Security techniques — Evaluation criteria for IT
security — Part 2: Security functional components
[8] ISO/IEC 18014-1, Information technology — Security techniques — Time-stamping services —
Part 1: Framework
[9] ISO/IEC 27033-1, Information technology — Security techniques — Network security — Part 1:
Overview and concepts
[10] ISO/IEC 29100, Information technology — Security techniques — Privacy framework
[11] ANSI X9.52-1998, Triple Data Encryption Algorithm Modes of Operation
[12] ENV 13608-1:2000, Health informatics — Security for healthcare communication — Part 1:
Concepts and terminology
[13] “A function to hide person-identifiable information.” Computing for Health Intelligence, Bulletin 1
(version 3), 15 July 2002
[14] Basic Encoding Rules (BER), Canonical Encoding Rules (CER) and Distinguished Encoding Rules
(DER). RFC-2313 PKCS #1: RSA Encryption, Version 1.5, March 1998
[15] Berman J.J. Confidentiality for Medical Data Miners. Artif. Intell. Med. 2002 November
[16] De Moor G., & Claerhout B. PPeP Privacy Protection in e-Pharma: Leading the Way for Privacy
Protection in e-Pharma, White Paper, 2003
[17] De Moor G., Claerhout B., De Meyer F. Privacy Enhancing Techniques: the Key to Secure
Communication and Management of Clinical and Genomic Data. Methods Inf. Med. 2003, 42
pp. 79–88
[18] Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995 on
the protection of individuals with regard to the processing of personal data and on the free
movement of such data
[19] El Kalam A.A., Deswarte Y., Trouessin G., Cordonnie E. A new method to generate and
manage anonymized healthcare information
[20] Hara K., Ohe K., Kadowaki T., Kato N., Imai Y., Tokunaga K. Establishment of a method of
anonymisation of DNA samples in genetic research. J. Hum. Genet. 2003, 48 (6) pp. 327–330
[21] Hes R., & Borking J. Privacy-Enhancing Technologies: The path to anonymity, Revised Edition,
Registratiekamer, The Hague, August 2000
[22] Roberts I. Pseudonymization in CLEF, PT001 (draft)
[23] IHE IT Infrastructure Technical Committee. De-Identification Handbook, Revised Edition,
IHE International, Inc, http://www.ihe.net/Technical_Frameworks/#IT, June 2014
[24] Ihle P., Krappweis J., Schubert I. Confidentiality within the scope of secondary data
research – approaches to a solution of the problem of data concentration. Gesundheitswesen.
2001, 63, pp. S6–S12
[25] INFOSEC/TTP Project, Code of Practice and Management Guidelines for Trusted Third Party
Services, Castell, S., (Ed.), Ver. 1.0, Castell, Spain, October 1993
[26] INFOSEC/TTP Project, Trusted Third Party Services: Functional Model, Muller, P. (Ed.), Ver. 1.1,
Bull. Ingenierie, France, December 1993
[27] INFOSEC/TTP Project, Trusted Third Party Services: Requirements for TTP Services
[28] Langheinrich M. Privacy by Design — Principles of Privacy-Aware Ubiquitous Systems
[29] Lowrance W. Learning from experience: privacy and the secondary use of data in health
research, J. Health Serv. Res. Policy, Suppl 1: pp S1:2-7, July 8, 2003
[30] Privacy Enhancement PRIDEH in Data Management in E-Health. Final Report, Deliverable D4.4
Report Version 2.0. Technical Recommendations, Guidelines and Business Scenarios, July 2003
[31] NIH Publication Number 003-5388. Protecting Personal Health Information in Research:
Understanding the HIPAA Privacy Rule, http://privacyruleandresearch.nih.gov
[32] RFC-2630 Cryptographic Message Syntax, June 1999
[33] Schneier B. Applied Cryptography: Protocols, Algorithms, and Source Code in C. John Wiley,
Second Edition, 1996
[34] Privacy by Design — Principles of Privacy-Aware Ubiquitous Systems. Available at: http://www.vs.inf.ethz.ch/publ/papers/privacy-principles.pdf
[35] Westin A. Privacy and freedom, 1967
[36] NEMA PS3/ISO 12052, Digital Imaging and Communications in Medicine (DICOM) Standard,
National Electrical Manufacturers Association, Rosslyn, VA, USA. Available free at http://medical.nema.org/
[37] PRIDEH-GEN, Deliverable D 2.1, Inventory Report on Privacy Enhancing Techniques. February
27, 2004.(IST-2001-38719)