Model For Identification of Politically Exposed Persons
Zane Miltina1, Arnis Stasko1, Ingars Erins1, Janis Grundspenkis1, Marite Kirikova1,
and Girts Kebers2
1
Riga Technical University, Riga, Latvia
{zane.miltina,arnis.stasko}@edu.rtu.lv,
{ingars.erins,janis.grundspenkis,marite.kirikova}@rtu.lv
2
SIA “Lursoft IT”, Riga, Latvia
[email protected]
1 Introduction
PEP status identification that would be able to analyse, extract and interpret
information combined from various data sources. The purpose of this paper is to
amalgamate and structure the requirements for PEP identification and to develop the
PEP status identification model that meets these requirements.
In the study underlying this paper the design science approach was followed,
which provides both new knowledge and new artefact(s) [1; 2]. Identification,
collection and systematic analysis of the scientific works have been carried out to get
new knowledge of the existing possibilities of PEP status identification. This
knowledge forms the basis for the development of a new artefact – a PEP status
identification model.
This paper proposes a PEP status identification model based on the multi-agent
approach, which enables combining various approaches, methods and algorithms
depending on their specialisation and available data sources. The model offers a
high degree of practical applicability in solving complex problems, ensured by the
flexible combination of specialised agents.
The paper is structured as follows. The scope definition is provided in Section 2.
The problem definition is given and related work is discussed in Section 3. In Section
4 the requirements relevant in PEP status identification model development are
presented. The multi-agent PEP status identification model is proposed in Section 5.
The model enables a holistic and systematic approach to PEP status identification.
The proposed solution includes PEP status identification and validation mechanisms,
points out the needed data sources for obtaining PEP identification information, and
defines the knowledge and research areas necessary for the PEP status identification
and validation. The proposed model is evaluated in Section 6. Conclusions are stated
in Section 7.
2 Scope Definition
In order to ensure compliance with the requirements of the Law on the Prevention of
Money Laundering and Terrorism Financing [3], financial institutions are expected to
develop an internal control system that covers customer identification, evaluation of
whether customers’ financial transactions and behaviour patterns fit the type and
the specifics of the business in which the particular customer operates, as well as
monitoring and reporting of suspicious transactions to the relevant regulatory authorities.
The Law [3] obliges financial institutions not only to identify PEPs,
but also to identify PEP related parties, family members, friends and relatives who
could affect the actions of a PEP. In this paper, for simplicity, where appropriate, the
statuses of “PEP family members” and “persons closely related to PEP”, as noted in
the Law [3], are commonly referred to as PEP related persons or PEP related parties.
Various sources define and elaborate on PEP identification requirements thus
expanding the PEP definition and related requirements. The Financial Action Task
Force (FATF) defines the PEP as an individual who is or has been entrusted with a
prominent public function [4]. Due to their position and influence, it is recognised that
many PEPs are in positions that potentially can be abused for the purpose of
committing money laundering offences and related predicate offences, including
Identification of PEPs is one of the major challenges in the field of national and
financial security. Researchers and analysts have endeavoured to meet this challenge
mostly by trying to create comprehensive PEP and PEP related party status
identification models. Recommendations for Credit Institutions and Financial
Institutions to Establish and Research Politically Exposed Persons, their Family
to reuse the solution is low, as the mechanism might return the results that do not
correspond to the interpretation applied in the reuse situation.
As a result of the study of PEP status identification problem the following main
challenges and their possible solutions were identified:
Ambiguously interpretable data: the information available in data sources is not
always sufficient for clear and unmistakable identification of a unique person. This
applies especially to information in such data sources as social networks, portals,
newspapers, etc., and to information on relationships with persons closely related
to a PEP. According to the regulatory requirements of Latvia [12], the only way
to uniquely identify a person is by using a personal identification number as well
as name and surname combinations. However, such data is not always available
in one particular data source; therefore, different combinations of data (or data
fusion) should be used to obtain the needed information.
Duration of PEP status: the Law states that the PEP status expires in either of
two events: the PEP dies, or the PEP has not held a major public position for at least
12 months and his/her business relationships no longer create an increased risk of
money laundering [3, Section 25, Paragraph 5]. This means that after 12 months
PEP status identification should be repeated and, if no other evidence is
found that the person is a PEP, the PEP status can be removed. Therefore,
when designing the PEP status identification model, the time factor needs to be
incorporated along with the PEP status, and repeated status checks need to be scheduled.
Identification of a motive: besides the list of clearly defined family members,
the Recommendations [4; 5] stipulate that persons who are in substance
considered to be family members (such as an actual cohabiting partner) but
are formally not included in the list of family members, as well as persons outside the
family (such as friends, girlfriends, lovers), can also be PEP related persons.
This complicates the model design, because these persons are difficult or even
impossible to detect when the relationships are not registered. Moreover, the
motive has to be identified: “There is the trust and commitment between such
persons, which can serve as grounds for the PEP to hide his/her abuse of public
power for personal benefit by the help of such person” [4; 5]. Since there are no
registers of cohabitation partners, such information has to be searched for in social
networks or media, both of which are low-reliability data sources.
Identification of close relationship with PEP: the Recommendations [5] stipulate
that a person who has business relationships with the PEP has to be assigned
the status of a “person closely related to PEP”. Several areas are of particular
interest: real estate deals, state procurement deals, business partners of the
PEP, as well as those who share participation in a company’s board or council
with the PEP. For the purpose of the model creation, business relationships,
business areas, transaction types, transaction parties and the data sources for
obtaining such information have been identified. In order to identify high-risk
transactions, criteria on the transaction amount can be used. Existing
regulatory requirements do not stipulate a specific transaction amount above which
business relationships with a PEP serve as grounds for a person becoming
closely related to a PEP. On the other hand, to minimise the number of false
negatives, transaction limits are recommended for each type of transaction.
When designing the PEP status identification model, it is important that
transaction types and business areas can be combined with transaction amount
criteria, which can be defined by the user.
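The time factor and the user-defined transaction criteria from the list above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the parameters and the example limit values are assumptions made for this sketch. Only the two expiry events, the 12-month period and the idea of combining a transaction type with an amount limit come from the text.

```python
from datetime import date
from typing import Dict, Optional


def twelve_months_after(d: date) -> date:
    """Return the date 12 months after d (29 February maps to 28 February)."""
    try:
        return d.replace(year=d.year + 1)
    except ValueError:  # d is 29 February and the next year is not a leap year
        return d.replace(year=d.year + 1, day=28)


def pep_status_expired(today: date,
                       date_of_death: Optional[date],
                       left_office_on: Optional[date],
                       increased_risk: bool) -> bool:
    """Hypothetical check of the two expiry events described in the text."""
    # Event 1: the PEP has died.
    if date_of_death is not None and date_of_death <= today:
        return True
    # No end date recorded: the person still holds the public position.
    if left_office_on is None:
        return False
    # Event 2: at least 12 months out of office and no increased risk remains.
    return today >= twelve_months_after(left_office_on) and not increased_risk


# Assumed, user-defined amount limits per transaction type (illustrative values only).
EXAMPLE_LIMITS: Dict[str, float] = {
    "real_estate": 50_000.0,
    "state_procurement": 10_000.0,
}


def is_high_risk(transaction_type: str, amount: float,
                 limits: Dict[str, float] = EXAMPLE_LIMITS) -> bool:
    """Flag a transaction when its type is monitored and its amount reaches the limit."""
    limit = limits.get(transaction_type)
    return limit is not None and amount >= limit
```

Because the expiry check depends on the current date, a scheduler would need to re-run it periodically; the sketch takes `today` as a parameter so that such repeated checks stay easy to test.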
The above-discussed challenges show that the PEP identification process will
require a combination of different methods and tools that have to be systematically
organized to achieve the goal of identifying PEPs. To obtain a systematic view
of the PEP identification process, a PEP status identification model is needed that
amalgamates the methods, algorithms and tools and prescribes their potential
sequences of usage. Table 1 summarizes the requirements for the PEP
status identification model based on regulatory requirements, the complexity of PEP
definition interpretation, and the results of the analysis of related work.
It is important to point out that the paper reflects results obtained during the initial
stages of the research, the final goal of which is to develop an adequate model that
satisfies all requirements summarised in Table 1 and provides an optimal sequence of
methods, algorithms and tools for processing data about PEPs, their family members
and persons closely related to PEPs. The search for appropriate approaches, methods,
algorithms and tools to ensure identification of PEPs with a high degree of reliability
has been carried out, and the results are briefly presented in the next section.
To develop the model, the design science principles of obtaining new knowledge and
a new artefact (the PEP status identification model) were followed. After analysis of the
related work, the problem domain was analysed in depth, especially with respect to the
data available and the reliability of data sources. Data analytics methods and algorithms
(potentially applicable in the PEP identification domain) were amalgamated and
analysed. Further, candidate approaches from the related work, namely ontology
based approaches [13; 14], data fusion based approaches [15; 16] and natural
language processing algorithms for text analysis that take into account language specific
diacritical marks, declensions and grammar elements [17], were applied to the PEP
identification problem [18]. On the basis of the obtained knowledge it was concluded
that the identification of PEPs is a complex decision making problem over a distributed
knowledge domain. Influenced by the contention made by Padgham and Winikoff,
who in their book [19], referencing Jennings [20], stated that “agents are ‘well suited
for developing complex distributed systems’ since they provide more natural
abstraction and decomposition of complex ‘nearly-decomposable’ systems”, it was
decided to model a potential solution with the multi-agent approach. To ensure the
production of a rigorous agent-oriented solution, the Prometheus methodology [19] was
chosen for modelling.
recommendations [5], the model has to identify also the family members and persons
closely related to PEP. The PEP family members are visualized in Fig. 1.
Assuming that family links are permanent and that the number of PEPs in Latvia is
manageable (10,000–15,000 persons [21]), it was found useful to prepare
in advance a PEP database containing PEPs and closely related persons (family members and
persons placed on the same level as family members). Such an
approach requires compiling a single database of all PEP lists, adding the
identification information from various related data sources (to enable unambiguous
identification of the person in different data sources) and, for all persons in PEP lists,
identifying closely related parties (Fig. 2), thus distinguishing five types of
external data sources.
The system’s goals are a natural construct to initiate system specification [19]. This
also applies to the building of models. To fulfil the PEP identification requirements, the
following two main goals have to be met: maintain the PEP database and determine the
person’s PEP status upon a request. According to the Prometheus methodology,
detailed goal models were constructed for both above-mentioned goals (Fig. 3 and
Fig. 4).
For the PEP database maintenance goal (Fig. 3) four sub-goals were found,
namely, maintain the data source database, summarize the PEP list from data sources,
summarize related family persons from data sources, and identify the persons
unequivocally. These sub-goals were further decomposed where appropriate.
The diagram for the goal “determine the person’s PEP status upon a request” is
shown in Fig. 4. To support the goal, it is necessary to handle requests, identify
whether the person is a PEP, identify the family relations with a PEP, and identify
transaction relations with a PEP.
After goal identification, the PEP identification functionalities were established.
In the Prometheus methodology, functionality is a term used for “a chunk
of behaviour, which includes a grouping of related goals, as well as percepts, actions
and data relevant to the behaviour” [19]. Furthermore, the decision about agent
classes was reached by combining the functionalities. According to the Prometheus
methodology, “The choice of how functionalities are to be combined is made by
considering the functionalities and scenarios and developing possible groupings of
functionalities into agents, which are then evaluated according to the standard
software engineering criteria of coupling and cohesion” [19]. The obtained agent
classes are shown in Fig. 5.
To support the identification of PEPs, three main scenarios were established: the PEP
list maintenance scenario, the PEP relation maintenance scenario and the request handling
scenario. During the setup of a system that corresponds to the above-described
model, the PEP list maintenance scenario and the PEP relation maintenance scenario
are expected to be run sequentially. After the setup, any of the three scenarios can be run
independently, according to a schedule or on request.
The PEP list maintenance scenario includes data source list update, PEP list
renewal from data sources, additional person identification information gathering,
data fusion, and, finally, ambiguous data screening for approval. The PEP relation
maintenance scenario operates similarly: it starts with the data source list update, then
continues with PEP family member list renewal from the data sources, the additional
person identification information gathering, the data fusion, and, finally, the
ambiguous data screening for approval.
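The two maintenance scenarios above are ordered sequences of steps that differ only in the list being renewed. A minimal sketch of that structure is given here, assuming each step is a function that receives and returns a shared state dictionary. All function names are hypothetical, and the step bodies are placeholders rather than the methods prescribed by the model.

```python
from typing import Callable, Dict, List

State = Dict[str, list]
Step = Callable[[State], State]


# Placeholder steps: a real system would query data sources, fuse data and so on.
def update_data_source_list(state: State) -> State:
    return state


def renew_pep_list(state: State) -> State:
    return state


def renew_family_member_list(state: State) -> State:
    return state


def gather_identification_info(state: State) -> State:
    return state


def fuse_data(state: State) -> State:
    return state


def screen_ambiguous_data(state: State) -> State:
    return state


PEP_LIST_MAINTENANCE: List[Step] = [
    update_data_source_list, renew_pep_list,
    gather_identification_info, fuse_data, screen_ambiguous_data,
]
PEP_RELATION_MAINTENANCE: List[Step] = [
    update_data_source_list, renew_family_member_list,
    gather_identification_info, fuse_data, screen_ambiguous_data,
]


def run_scenario(steps: List[Step], state: State) -> State:
    """Run the steps in order and record the name of each executed step."""
    for step in steps:
        state = step(state)
        state.setdefault("log", []).append(step.__name__)
    return state


def initial_setup() -> State:
    """During setup both maintenance scenarios are run one after the other."""
    state = run_scenario(PEP_LIST_MAINTENANCE, {})
    return run_scenario(PEP_RELATION_MAINTENANCE, state)
```

Keeping each scenario as a plain list of steps makes it straightforward to reorder steps or swap in an alternative implementation of a single step.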
The request handling scenario expects that the PEP list maintenance and the PEP
relation maintenance scenarios have been executed successfully at least once and that the
PEP database has been set up. Subsequently, upon receiving a user request, the request
handling scenario checks whether the requested person can be found in the PEP database, whether the
person under investigation is related to persons in the PEP database, and whether the
person under investigation has any transactions with persons in the PEP database.
Finally, a complex decision about the requested person’s PEP status is made and the results
are returned to the user.
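The three checks of the request handling scenario can be illustrated with a small in-memory sketch. The class `PepDatabase`, its fields, the function `handle_request` and the priority order of the returned statuses are assumptions made for illustration; the paper does not specify how the final decision combines the three findings.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class PepDatabase:
    """Hypothetical in-memory stand-in for the prepared PEP database."""
    peps: Set[str] = field(default_factory=set)
    # person id -> ids of persons with whom a family relation is recorded
    relations: Dict[str, Set[str]] = field(default_factory=dict)
    # person id -> ids of persons with whom transactions are recorded
    transactions: Dict[str, Set[str]] = field(default_factory=dict)


def handle_request(db: PepDatabase, person_id: str) -> dict:
    """Run the three checks and combine them into one status (assumed priority)."""
    is_pep = person_id in db.peps
    related_peps = db.relations.get(person_id, set()) & db.peps
    transaction_peps = db.transactions.get(person_id, set()) & db.peps

    if is_pep:
        status = "PEP"
    elif related_peps:
        status = "PEP family member"
    elif transaction_peps:
        status = "person closely related to PEP"
    else:
        status = "no PEP status found"

    return {
        "person": person_id,
        "status": status,
        "related_peps": sorted(related_peps),
        "transaction_peps": sorted(transaction_peps),
    }
```

Since the sketch reads only from the local database, a request involves no lookups in external data sources.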
From the challenges, the complexity of the PEP definition and the regulatory
requirements discussed in this paper, it is evident that no single approach can
yield a holistic and effective PEP identification model that
satisfies all of the requirements stated in Section 4. Instead, a combination of
different methods and approaches, systematically organized in a model, has to be used.
The proposed model is based on the multi-agent paradigm, which allows
distributing the complex problem across several agents. Consequently, each agent is
dedicated to solving a specific task and is equipped with selected methods and
approaches, such as data fusion, ontology, natural language processing and various
information retrieval and data mining algorithms and methods.
In order to verify the proposed model, it is assessed against the PEP status
identification model requirements set out in Section 4 of this paper.
During PEP status identification model design, PEP definition requirements were
translated into detailed search tasks, and possible data sources for each search task
were selected. Methods for natural language processing [17] were prescribed by the
model to enable search and retrieval of data that contains language specific
diacritical marks, declensions and grammar elements, thus ensuring natural language
processing capabilities. This is particularly important for handling various
unstructured data sources, which are currently the only source for identification of
persons closely related to a PEP.
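As a small illustration of why diacritical marks matter when searching such sources, the sketch below folds diacritics and letter case before comparing names. It uses only Python's standard `unicodedata` module and is not the method of [17]; the function names and the sample name are hypothetical, and declensions (for example a surname in the genitive case) are deliberately not handled.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Remove combining marks and ignore case, so that 'Ā' and 'a' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return without_marks.casefold()


def names_match(first: str, second: str) -> bool:
    """Compare two names while ignoring diacritics, case and surrounding spaces."""
    return fold_diacritics(first.strip()) == fold_diacritics(second.strip())
```

With this folding, a name written without diacritics in one source, such as `Janis Berzins`, matches `Jānis Bērziņš` in another. The same folding also increases the risk of matching different persons, so such a match would only be a candidate for further identification.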
When examining available data sources, a number of data quality and consistency
issues were observed, as well as a number of areas where
the data was unstructured and incomplete. Data-fusion algorithms [15; 16]
prescribed by the model are capable of compiling the obtained data, discarding
data of insufficient quality and involving the expert in the process of ambiguous data
review and approval. In order to operate with data of varying reliability, the model is
able to consider the information reliability factor.
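One simple way to picture a reliability factor is reliability-weighted voting over the values reported by different sources. The sketch below is an assumption-laden illustration and not the data-fusion algorithms of [15; 16]: the function name `fuse`, the thresholds `min_reliability` and `margin`, and the sample attributes are all invented for this example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (attribute, reported value, reliability of the reporting source in [0, 1])
Observation = Tuple[str, str, float]


def fuse(observations: List[Observation],
         min_reliability: float = 0.3,
         margin: float = 0.2) -> Tuple[Dict[str, str], List[str]]:
    """Return fused attribute values and the attributes left for expert review."""
    scores: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for attribute, value, reliability in observations:
        if reliability < min_reliability:
            continue  # discard data of insufficient quality
        scores[attribute][value] += reliability

    fused: Dict[str, str] = {}
    needs_review: List[str] = []
    for attribute, candidates in scores.items():
        ranked = sorted(candidates.items(), key=lambda item: item[1], reverse=True)
        best_value, best_score = ranked[0]
        runner_up_score = ranked[1][1] if len(ranked) > 1 else 0.0
        if best_score - runner_up_score < margin:
            needs_review.append(attribute)  # ambiguous: hand over to the expert
        else:
            fused[attribute] = best_value
    return fused, needs_review
```

In this sketch a value is accepted automatically only when it clearly outweighs its competitors; everything else is queued for the expert, which mirrors the review and approval step described above.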
The model provides for expert involvement in the PEP database preparation process while
keeping the use of expert time to a minimum: the expert is involved only in critical
assessment functions where human intervention is required to interpret the
received data. The PEP status identification model is built in such a way that expert
involvement is not needed in real-time operation, but rather in PEP database
establishment and support, which makes it possible to provide PEP status investigation results
in real time.
The proposed model encompasses agents specialised in ontology reasoning
methods for rule-based PEP status evaluation. The proposed multi-agent model also
assumes maintaining a database that stores all previously identified PEPs
along with the family relationship and business relationship lists. The addition of an
internal database allows faster status identification, because data already
obtained is reused and only the local database is searched, instead of searching
external data sources in real time.
For several agents there is more than one option for how the specific task can be
approached. Experiments are currently being carried out to select the most efficient
alternatives with respect to better PEP identification, fewer process steps, less time
and less human expert involvement, and to identify the most efficient
sequence of tasks in the model.
In comparison with related works and analysed patents, the proposed PEP status
identification model has a clearer structure that shows both the main data sources and the main
functions to be performed over the data; thus, the model has a higher potential for
reuse than the solutions discussed in Section 3. It is able to cover the totality
of regulatory requirements and, owing to its modular structure, can be changed relatively
easily when regulations change. The model makes it possible to work with
various country specific data sources and language specific diacritic and grammar
elements, and to retrieve and interpret data from unstructured data sources. It thus
obtains as complete a set of person identification data as possible, which in turn allows
a person under investigation to be identified in various sources and yields PEP status
identification results that are as precise as possible with the given sources.
7 Conclusion
This paper focuses on the creation of a PEP status identification model for the specific case
of the Republic of Latvia, in accordance with the regulatory requirements set out in the
definition of the PEP status and with the data sources of Latvia. However,
since in the Republic of Latvia the PEP status identification and recognition
methodology is developed in accordance with international requirements (FATF
definitions) and aligned with European Union directives, the findings of the presented
work are also applicable to other countries in Europe. To reuse the PEP status
identification model, the sources of information and unique parameters should be
adjusted to country specific data, and the local regulatory requirements should
be verified to ensure the completeness of the PEP status identification model.
The developed PEP status identification model suggests possible usage sequences
of approaches, methods and algorithms for extracting and processing data pertaining
to PEPs and their related parties in order to ensure identification of PEPs with a high
degree of reliability. Different data analytics methods and their combinations were
investigated during the model development process. A multi-agent approach was
considered [18; 22] as one of the potential solutions besides the ontology based and
data fusion based approaches. The discussion of other approaches as stand-alone
solutions is out of the scope of this paper, as a number of limitations were discovered in
their ability to satisfy the PEP status identification model requirements set out in
Section 4 of this paper. While ontology based and data fusion based approaches
proved to be efficient in solving parts of the PEP definition based requirements, the
multi-agent approach gave an opportunity to systemically cover all requirements.
The proposed PEP status identification model based on the multi-agent approach
demonstrates its suitability for the PEP identification task. The model not only allows
algorithms and methods to be incorporated in a single solution but also leaves room for
selecting different algorithms for every agent and task, and gives an opportunity to
change existing data sources and involve new ones.
With the multi-agent model for PEP identification developed, further work
is to validate different algorithms and methods for each task. The best algorithms and
methods will be selected by testing various algorithm and method combinations and
sequences and measuring overall solution performance according to the main
scenarios of PEP identification.
Acknowledgment: The research leading to these results has received funding from
the research project “Competence Centre of Information and Communication
Technologies” of EU Structural funds, contract No. 1.2.1.1/16/A/007 signed between
IT Competence Centre and Central Finance and Contracting Agency, Research No.
1.14 “Development of Data Processing Algorithm Flow Optimisation Model for
Identification of Politically Exposed Persons”.
References
1. Johannesson, P., Perjons, E.: “A Method Framework for Design Science Research” in An
Introduction to Design Science, Cham: Springer International Publishing, 75–89 (2014)
2. Wieringa, R.J.: Design Science Methodology for Information Systems and Software
Engineering. Berlin, Heidelberg: Springer Berlin Heidelberg (2014)
3. Law on the Prevention of Money Laundering and Terrorism Financing (in Latvian -
Noziedzīgi iegūtu līdzekļu legalizācijas un terorisma finansēšanas novēršanas likums),
https://likumi.lv/doc.php?id=178987 [Accessed: 23-Feb-2017]
4. FATF Guidance: Politically Exposed Persons (Recommendations 12 and 22),
http://www.fatf-gafi.org/media/fatf/documents/recommendations/guidance-pep-rec12-22.pdf
[Accessed: 30-Mar-2017]