HARAMAYA UNIVERSITY
COLLEGE OF COMPUTING AND INFORMATICS
DEPARTMENT OF INFORMATION SCIENCE
POSTGRADUATE (CEP PROGRAM)
COURSE TITLE: SEMANTIC WEB ANALYTICS
PROJECT TITLE: DEVELOP AN ONTOLOGY BASED SYSTEM FOR HUMAN
DISEASE DIAGNOSIS AND TREATMENT
Submitted by:
Name                      ID No.
1. Demissis Yirgalem      Cpgp/0130/14
2. Abayineh Gemechu       CPgp/0301/14
3. Abebech G/Michael      CPgp/0133/14
4. Habtamu Minalu         CPgp/228/14
5. Misra Mohammed         CPgp/0131/14
Submitted To: Abdalganiy K. (PhD)
Submission Date: August 9, 2022
Haramaya, Ethiopia
Contents
List of Figures
List of Abbreviations/Acronyms
Abstract
1. INTRODUCTION
1.2 Statement of the Problem
1.3 Objective of the Project
1.3.1 General Objective
1.3.2 Specific Objectives
1.4 Significance of the Project
1.5 Scope and Limitation
2. Literature Review
2.1 Overview of Ontology
3. Methodology of the Project
4. ONTOLOGY IMPLEMENTATION
4.1 RDF Description and OWL code representation of the Graph
CONCLUSION
References
List of Figures
Figure 1: Class Hierarchy of the Disease Ontology
Figure 2: Class Hierarchy with Properties of the Disease Ontology
Figure 3: RDF graph Representation of the project
List of Abbreviations/Acronyms
DIS: Disease Intelligence System
SWRL: Semantic Web Rule Language
ICD: International Classification of Diseases
WHO: World Health Organization
NCD: Noncommunicable Disease
CQ: Competency Question
OWL: Web Ontology Language
W3C: World Wide Web Consortium
WoT: Web of Things
DSS: Decision Support System
RDF: Resource Description Framework
Abstract
The purpose of this project is to develop an ontology-based system for human disease diagnosis
and treatment. A disease-treatment ontology is developed to model and represent treatment
information found in health systems. Treatment information extracted from health systems and
health-related articles can then be encoded in this ontology and used for information
retrieval, question answering, summarization and knowledge discovery. The ontology contains
disease-treatment information organized into classes such as disease, treatment, condition,
effect and evidence. The sub-classes, properties and instances of these main classes are
discussed with examples. The project is implemented using Apache Jena Fuseki.
1. INTRODUCTION
The purpose of the ontology is to serve as a knowledge base that stores the extracted information
and supports functions such as information retrieval, question answering, summarization and
knowledge discovery. The ontology is also expected to support the synthesis of information
extracted from different publications and the inference of potentially new relations between
chemical substances and their effects on diseases, as envisaged by Swanson and others.
The ontology-based disease information system is built on semantic rules designed to respond
to the corresponding user query. The designed information system focuses mainly on improving
query results and on ease of use (Pattabiraman et al., 2012).
“Ontology is designed for storing information about rapidly spreading and changing diseases
with incorporating existing disease taxonomies to genetic information of both humans and
infectious organisms” (Abeysiriwardana & Kodituwakku, 2012). Ontologies are a relatively new
way of defining and storing knowledge; an ontology describes a certain domain by dividing it
into several concepts (Arasu, 2017).
1.2 Statement of the Problem
This project addresses two main problems. First, data related to different areas of interest in
the disease domain are not interconnected. Second, related data are not logically arranged so
that they can be processed by machines. To handle these problems, it is necessary to
interconnect the key sub-areas of the disease domain and to link them and their data logically
within the developed disease ontology. Two of the key relationships used to interconnect those
areas are hasStructure and hasSymptoms. Evaluations performed through a demo dataset
demonstrated the effectiveness of the ontology. SWRL rules were used to define accurate axioms,
improving the correct classification and inference (Larentis et al., 2021).
1.3 Objective of the Project
1.3.1 General Objective
The general objective of our project is to develop an ontology-based system for human disease
diagnosis and treatment.
1.3.2 Specific Objectives
To achieve the general objective of the project, the following specific objectives are set:
- Find a proper way to extract information about rapidly spreading and changing diseases.
- Build an ontology that captures the information about those rapidly spreading and changing
  diseases using an appropriate Semantic Web language.
- Make information extraction and other natural language processing tools key enablers
  for the acquisition and use of that semantic information.
- Propose and lay a foundation for the Disease Intelligence System (DIS).
1.4 Significance of the Project
An ontology-based system is important for health care because it makes it easy to link related
disease diagnoses and treatments, and it simplifies work for experts, patients and other end users.
1.5 Scope and Limitation
The scope of the ontology is limited to the prevention and follow-up of human diseases. The
ontology can be used by information systems that support people in managing their health and
quality of life.
2. Literature Review
In carrying out this project, we reviewed related articles, theses, books and other sources.
The use of ontologies to support healthcare systems is an emerging field of research.
Some works involve collaborative efforts in the development of ontology and semantic web
systems involving the International Classification of Diseases (ICD-11) defined by the WHO.
Other works involve ontologies for smart homes focused on assisted living solutions. These
solutions provide support to the residents and also consider issues such as disabilities and/or
impairments related to the ageing of the person. In the area of technology applied to healthcare
in NCDs, several pieces of research have also focused on the creation of ontologies. According
to Mohammed et al. (2012), there was no ontology that defines disease class hierarchies and
symptom class hierarchies and establishes relations between disease and symptom classes.
2.1 Overview of Ontology
An ontology is a representation of a given knowledge domain that must be formal, shareable,
and composed of well-defined concepts and rules. The scope of an ontology can be defined using
a list of competency questions (CQ), which specify the requirements that the ontology should be
able to answer. Answers to these questions can be obtained through reasoners (Larentis et al.,
2021).
According to Gruber, an ontology is an explicit specification of a conceptualization. When a
domain of interest is defined with ontologies, the objective is to share knowledge, specify a
representative vocabulary for the domain, and understand the semantics of the domain data.
A generic ontology can therefore help to reason about a specific knowledge domain by mapping
its most general concepts and relations. An ontology can be represented in the Web Ontology
Language (OWL), a standard of the World Wide Web Consortium (W3C) (https://www.w3.org/,
accessed on 10 October 2021). An OWL ontology contains four basic elements: concepts,
instances, properties, and restrictions. Concepts identify the objects in the ontology
definition and are described through classes. Instances define the individuals of the classes.
Individuals can be related to other individuals through properties, which are binary
relationships between individuals of the same class or of different classes. Restrictions
define boundaries for the individuals that belong to a class. The contents of an ontology can
be queried using the SPARQL query language. Inferences about classes and individuals can be
made using the Semantic Web Rule Language (SWRL). Thus, ontologies are used to represent
domains and infer knowledge, following construction methodologies, in several areas such as
the Web of Things (WoT) and Decision Support Systems (DSS), among others.
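The elements described above can be illustrated with a small sketch that stores classes, individuals and a property as (subject, predicate, object) triples and retrieves them by pattern, loosely in the way a SPARQL triple pattern does. It is only an illustration in plain Python, not an RDF store; the individuals Malaria and Fever are hypothetical and are not taken from the project data.

```python
# Illustrative ontology fragment as (subject, predicate, object) triples.
# "Malaria" and "Fever" are hypothetical individuals used only for this sketch.
TRIPLES = {
    ("Disease", "rdf:type", "owl:Class"),                # concept (class)
    ("DiseaseSymptoms", "rdf:type", "owl:Class"),        # concept (class)
    ("hasSymptoms", "rdf:type", "owl:ObjectProperty"),   # property
    ("Malaria", "rdf:type", "Disease"),                  # instance
    ("Fever", "rdf:type", "DiseaseSymptoms"),            # instance
    ("Malaria", "hasSymptoms", "Fever"),                 # relation between individuals
}


def match(pattern, triples=TRIPLES):
    """Return the triples matching a pattern; None is a wildcard, like a SPARQL variable."""
    return sorted(
        triple
        for triple in triples
        if all(want is None or want == got for want, got in zip(pattern, triple))
    )


# Which individuals are diseases, and which symptoms does Malaria have?
diseases = match((None, "rdf:type", "Disease"))
malaria_symptoms = match(("Malaria", "hasSymptoms", None))
```

A real deployment would delegate such lookups to a SPARQL engine; the sketch only shows how classes, instances and properties all reduce to triples.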
3. Methodology of the Project
The methodology used to achieve disease intelligence through the web is based on ontologies
created using OWL (Web Ontology Language) and evaluated with the reasoners available
today. The ontology created here is named the disease ontology, and it serves as a means to
structure the disease domain.
Because multiple inheritance can be achieved and checked through reasoning techniques applied
to the ontology (developed using the OWL 2 language) while it is still being developed, the
top-down approach has advantages over the bottom-up approach. We therefore followed the
top-down approach in modeling the domain of diseases, so the concepts developed at the
beginning are very generic and important.
The most generic concepts considered here are Disease, DiseaseArea, DiseasePrevention,
DiseaseStructure and DiseaseSymptoms. A huge amount of data related to these concepts already
exists.
The resulting base disease ontology has six most generic classes or concepts, namely Disease,
DiseaseArea, DiseaseSymptoms, DiseasePrevention, DiseaseStructure and GeneticMaterial. The
root class of these six classes of the Disease ontology is the Thing class. OWL classes are
interpreted as sets of individuals (or sets of objects). The class Thing represents the set
containing all individuals; because of this, all classes are subclasses of Thing. The
proposed Disease ontology has the tree structure shown in Figure 1.
Figure 1: Class Hierarchy of the Disease Ontology
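The tree structure described above can be sketched as a mapping from each class to its direct superclasses, with a helper that walks upward to collect every ancestor. This is a minimal sketch in plain Python rather than OWL. The six top-level class names come from the text and HeartDisease from the implementation section; storing a set of parents per class is an assumption made here so that multiple inheritance can be represented.

```python
from collections import deque

# Direct superclasses of each class; a set allows multiple inheritance.
SUBCLASS_OF = {
    "Disease": {"Thing"},
    "DiseaseArea": {"Thing"},
    "DiseaseSymptoms": {"Thing"},
    "DiseasePrevention": {"Thing"},
    "DiseaseStructure": {"Thing"},
    "GeneticMaterial": {"Thing"},
    "HeartDisease": {"Disease"},
}


def superclasses(cls, subclass_of=SUBCLASS_OF):
    """Return every direct and indirect superclass of cls (breadth-first walk)."""
    seen = set()
    queue = deque([cls])
    while queue:
        current = queue.popleft()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


heart_disease_ancestors = superclasses("HeartDisease")
```

Every class in the mapping reaches Thing, which mirrors the statement that all classes are subclasses of Thing.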
This section discusses the classes, properties, domains and ranges of the Disease ontology.
The proposed Disease ontology defines domains and ranges for its properties. For example, the
domain and range of the hasSymptoms property are the Disease and DiseaseSymptoms classes,
respectively. The domain and range of isSymptomsOf are those of hasSymptoms swapped over.
Although the domains and ranges of the hasSymptoms and isSymptomsOf properties are specified,
it is not advisable to do the same for other properties of the Disease ontology without further
studying those properties and the classes they cover. The reason is that domain and range
conditions do not behave as constraints; they are used as axioms in reasoning. They can
therefore cause 'unexpected' classification results, which lead to problems and unexpected
side effects.
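The point that domain and range act as axioms rather than constraints can be shown with a small sketch. It is a simplified stand-in for what a reasoner does, written in plain Python: instead of rejecting a statement whose subject is not a Disease, it infers that the subject is one. The individual patient_001 and the Patient type are hypothetical.

```python
# Domain and range of hasSymptoms as stated in the text.
DOMAIN = {"hasSymptoms": "Disease"}
RANGE = {"hasSymptoms": "DiseaseSymptoms"}


def infer_types(assertions, declared_types):
    """Apply domain/range axioms: add inferred types, never reject a statement."""
    inferred = {name: set(types) for name, types in declared_types.items()}
    for subject, prop, obj in assertions:
        if prop in DOMAIN:
            inferred.setdefault(subject, set()).add(DOMAIN[prop])
        if prop in RANGE:
            inferred.setdefault(obj, set()).add(RANGE[prop])
    return inferred


# Modelling slip: hasSymptoms is asserted on a patient instead of a disease.
declared = {"patient_001": {"Patient"}}  # hypothetical individual
assertions = [("patient_001", "hasSymptoms", "Fever")]
result = infer_types(assertions, declared)
# patient_001 is now classified as both a Patient and a Disease.
```

No error is raised for the slip; the patient is silently classified as a Disease, which is the kind of unexpected classification result the paragraph warns about.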
The proposed Disease ontology also has restrictions. If a disease exists, at least one symptom
should be present to indicate it. Here an 'existential restriction' is used to describe the
individuals of the Disease class that participate in at least one relationship along the
hasSymptoms property (some) with individuals that are members of the DiseaseSymptoms class.
These restrictions are applied to the properties depicted by the dotted arrows in Figure 2.
Figure 2: Class Hierarchy with Properties of the Disease Ontology
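The existential restriction (hasSymptoms some DiseaseSymptoms) can be sketched as a check over asserted data. This is a closed-world simplification in plain Python: an OWL reasoner works under the open-world assumption and would not report a missing symptom as an error, but would instead infer that some unnamed symptom exists. The individuals below are hypothetical.

```python
def satisfies_some(individual, prop, filler_class, triples, types):
    """True if individual has at least one prop value that is a member of filler_class."""
    return any(
        subject == individual and predicate == prop and filler_class in types.get(obj, set())
        for subject, predicate, obj in triples
    )


# Hypothetical asserted data.
TYPES = {
    "Malaria": {"Disease"},
    "UnknownDisease": {"Disease"},
    "Fever": {"DiseaseSymptoms"},
}
TRIPLES = [("Malaria", "hasSymptoms", "Fever")]

# Diseases with no asserted symptom in the data.
missing_symptoms = [
    name
    for name, classes in sorted(TYPES.items())
    if "Disease" in classes
    and not satisfies_some(name, "hasSymptoms", "DiseaseSymptoms", TRIPLES, TYPES)
]
```

Such a check is useful as a data-quality report on what has been asserted; it does not replace classification by a reasoner.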
4. ONTOLOGY IMPLEMENTATION
The ontology was implemented using Protégé, a knowledge framework and ontology editor (Arasu, 2017).
The ontology was encoded in the Web Ontology Language, version 2 (OWL 2), which provides classes,
properties, individuals and data type values. OWL 2 ontologies are stored as documents under the
Semantic Web standards.
4.1 RDF Description and OWL code representation of the Graph
The RDF graph gives an overall picture of the whole ontology. For this reason, we drew the RDF
graph and then converted it into an OWL code representation.
Figure 3: RDF graph Representation of the project
The OWL code corresponding to the graph is written as follows:
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
# Placeholder namespace (an assumption; the project's own namespace is not given)
@prefix :     <http://example.org/disease-ontology#> .

: a owl:Ontology .

# Major classes and subclasses
foaf:Person a owl:Class .
:Patient a owl:Class ; rdfs:subClassOf foaf:Person .

:Disease a owl:Class .
:HeartDisease  a owl:Class ; rdfs:subClassOf :Disease .
:KidneyDisease a owl:Class ; rdfs:subClassOf :Disease .
:LungDisease   a owl:Class ; rdfs:subClassOf :Disease .
:CancerDisease a owl:Class ; rdfs:subClassOf :Disease .

:Symptoms a owl:Class .
:SymptomsList a owl:Class ; rdfs:subClassOf :Symptoms .

:Diagnosing a owl:Class .
:DiagnosingTechnique a owl:Class ; rdfs:subClassOf :Diagnosing .

:Treatment a owl:Class .
:TreatmentList a owl:Class ; rdfs:subClassOf :Treatment .

# Properties and their descriptions
:patientID rdf:type owl:DatatypeProperty ;
    rdfs:domain foaf:Person ;
    rdfs:range rdfs:Literal .

foaf:firstName rdf:type owl:DatatypeProperty ;
    rdfs:domain foaf:Person ;
    rdfs:range xsd:string .

foaf:lastName rdf:type owl:DatatypeProperty ;
    rdfs:domain foaf:Person ;
    rdfs:range xsd:string .

foaf:age rdf:type owl:DatatypeProperty ;
    rdfs:domain foaf:Person ;
    rdfs:range rdfs:Literal .

# The range of sex is restricted to the two literal values "Male" and "Female"
:sex rdf:type owl:DatatypeProperty ;
    rdfs:domain :Patient ;
    rdfs:range [ rdf:type rdfs:Datatype ; owl:oneOf ( "Male" "Female" ) ] .
In our project, the dataset was downloaded as a file and then uploaded to Apache Jena Fuseki. The
procedure is as follows:
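Independently of how the upload is performed in practice, the step can also be scripted. The following is a minimal sketch, not the exact procedure used in this project: it builds an HTTP POST request that sends Turtle data to the default graph of a Fuseki dataset through the SPARQL Graph Store protocol. The server address http://localhost:3030, the dataset name disease and the example.org namespace are assumptions made for illustration.

```python
import urllib.request


def build_upload_request(turtle_text, base_url="http://localhost:3030", dataset="disease"):
    """Build a POST request that adds Turtle triples to the dataset's default graph.

    base_url and dataset are assumed values; adjust them to the local Fuseki setup.
    """
    url = f"{base_url}/{dataset}/data?default"
    return urllib.request.Request(
        url,
        data=turtle_text.encode("utf-8"),
        headers={"Content-Type": "text/turtle; charset=utf-8"},
        method="POST",
    )


# Tiny Turtle document in a placeholder namespace, used only to exercise the function.
SAMPLE_TURTLE = (
    "@prefix owl: <http://www.w3.org/2002/07/owl#> .\n"
    "@prefix : <http://example.org/disease-ontology#> .\n"
    ":Disease a owl:Class .\n"
)

upload_request = build_upload_request(SAMPLE_TURTLE)
# With a Fuseki server running, the request would be sent with:
# urllib.request.urlopen(upload_request)
```

Building the request separately from sending it makes it possible to inspect the URL and headers before anything is written to the server.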
The file has now been uploaded to Apache Jena Fuseki. Next, we query it as follows:
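A query of this kind can also be sent programmatically. The sketch below is illustrative and is not the authors' original query: it builds a SPARQL protocol request that asks for the subclasses of a Disease class and includes a helper that reads values out of a SPARQL JSON results document. The server address, the dataset name disease, the example.org namespace and the class IRI :Disease are all assumptions and must be adjusted to match the uploaded file.

```python
import json
import urllib.request

# Hypothetical query; the namespace and class name are placeholders.
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX : <http://example.org/disease-ontology#>
SELECT ?subclass WHERE { ?subclass rdfs:subClassOf :Disease }
"""


def build_query_request(query, base_url="http://localhost:3030", dataset="disease"):
    """Build a SPARQL protocol POST request that asks for JSON results."""
    return urllib.request.Request(
        f"{base_url}/{dataset}/sparql",
        data=query.encode("utf-8"),
        headers={
            "Content-Type": "application/sparql-query",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


def binding_values(results_json, variable):
    """Extract the values bound to one variable from a SPARQL JSON results document."""
    document = json.loads(results_json)
    return [
        row[variable]["value"]
        for row in document["results"]["bindings"]
        if variable in row
    ]


query_request = build_query_request(QUERY)
# With a Fuseki server running:
# with urllib.request.urlopen(query_request) as response:
#     subclasses = binding_values(response.read().decode("utf-8"), "subclass")
```

Requesting JSON results keeps the parsing step within the standard library.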
The output of the above query will be:
We also implemented our dataset using the OWL language:
The output of the above OWL query will be:
CONCLUSION
Modeling knowledge directed to the lifelong healthcare of people allows interaction between
systems and people, whether to prevent or to control diseases. The main theme of this
discussion is what is expected from the designed disease ontology and how it can evolve into a
form of intelligence that paves the way to disease intelligence. This ontology-based system has
many benefits for health care. It provides extracted information about rapidly spreading and
changing diseases. In addition, this information makes information extraction and other natural
language processing tools key enablers for the acquisition and use of this semantic
information. Once it is available on the web, the designed Disease ontology should be further
developed by the community by adding new concepts, refining existing concepts and adding data
and information to the ontology.
References
Abeysiriwardana, P. C., & Kodituwakku, S. R. (2012). Ontology based information extraction
for disease intelligence. International Journal of Research in Computer Science, 2(6),
7–19. https://doi.org/10.7815/ijorcs.26.2012.051
Arasu, G. T. (2017). Development and validation of ontology based knowledge representation
for brain tumour. 5(2), 23–28.
Larentis, A. V., Neto, E. G. de A., Barbosa, J. L. V., Barbosa, D. N. F., Leithardt, V. R. Q., &
Correia, S. D. (2021). Ontology-based reasoning for educational assistance in
noncommunicable chronic diseases. Computers, 10(10), 1–23.
https://doi.org/10.3390/computers10100128
Mohammed, O., Benlamri, R., & Fong, S. (2012). Building a diseases symptoms ontology for
medical diagnosis: An integrative approach. 1st International Conference on Future
Generation Communication Technologies, FGCT 2012, December, 104–108.
https://doi.org/10.1109/FGCT.2012.6476567
Pattabiraman, V., Sivakumar, R., Thirugnanam, M., & Ramaiah, M. (2012). Ontology based
disease information system. Procedia Engineering, 38, 3235–3241.
https://doi.org/10.1016/j.proeng.2012.06.375