2 - Clinical Data Lecture

Uploaded by

Moatez Hassan Khalil

Clinical Data: Sources, Preparation, and Feature Operations

Textbook: Artificial Intelligence in Medicine: A Practical Guide for Clinicians,
Chapter 3: An Overview of Data Collection, Preprocessing, and Feature Extraction in AI
Index:
1. Clinical Data meaning.
2. Clinical data sources.
3. Clinical Data Preprocessing.
4. Clinical Data Handling.
5. Feature extraction and selection.
Introduction:
• Data can be seen as a treasure for AI systems.
• Data is the foundation for training and developing AI models to
assist healthcare professionals in various aspects of patient care.
• Each piece of data holds valuable insights, stories, and facts that
contribute to the AI system’s understanding of healthcare.
• Data forms: structured data, such as organized rows and columns in
spreadsheets, and unstructured data, such as text documents, images,
and audio recordings.
• For example, AI systems can analyze medical images to detect
diseases, text data can be processed to extract clinical insights,
and patient records can be analyzed to predict health outcomes.
Introduction:
• Data is processed by computers using algorithms and logical
operations to produce new data or meaningful output based
on input data. (data -> information -> knowledge).
• In a healthcare context, data refers to information related to
health conditions, including clinical metrics and clinical
outcomes, as well as environmental, socioeconomic, and
behavioral information pertinent to health and wellness. AI
algorithms use this data to learn patterns, make predictions, and
provide valuable insights to support clinical decision-making
(for example, in vaccination campaigns).
• By analyzing data related to patient flow, staffing, scheduling,
and supply chain management, AI systems can help
hospitals optimize their operations and improve the quality of
care.
Machine Learning Steps:

ACQUIRE -> PREPARE -> ANALYZE -> REPORT -> ACT

Goal: Data collection is the first step, where scientists gather all the
necessary medical information from different sources before it is
processed in an AI system.
Acquire Data:
• Identify data sources
• Collect data
• Integrate data
Data Sources:
1. Electronic Health Records (EHRs): EHRs contain comprehensive patient health information,
including medical history, diagnoses, treatments, laboratory results, and medications. These
records are collected and stored by healthcare providers and hospitals during patient visits.
Data Sources:
2. Medical imaging:
Medical imaging data such as X-rays, MRIs, CT scans, and ultrasounds
provide visual representations of the patient’s anatomy and can help diagnose
and monitor diseases. Images are captured using specialized equipment and
stored in digital formats.
3. Wearable Devices and Internet of Things (IoT) Devices:
With the increasing popularity of wearable devices, such as fitness trackers
and smartwatches, physiological data like heart rate, activity levels, and sleep
patterns can be collected continuously. IoT devices, such as remote
monitoring systems and sensors, also contribute to the collection of
patient-generated data outside of healthcare facilities.
Data Sources:
4. Clinical trials and research studies:
Research studies and clinical trials collect data from participants to investigate the
effectiveness and safety of new treatments, interventions, or medical devices. These studies
generate valuable data that can be used for AI analysis and to improve patient care.
5. Health apps and patient portals:
Mobile health applications and patient portals allow patients to record and track their
health information, such as symptoms, vital signs, medication adherence, and lifestyle habits.
These apps enable individuals to actively participate in managing their health and contribute to
the collection of personal health data.
6. Social media and online communities:
Social media platforms and online communities provide a wealth of health-related
information and patient experiences. Analyzing these unstructured data sources can uncover
insights and trends that contribute to AI-driven healthcare improvements.
ACQUIRE -> PREPARE -> ANALYZE -> REPORT -> ACT

Step 2-A: Explore

Step 2-B: Pre-process

Why Explore?

Goal: Understand your data

• Describe your data
• Visualize your data: histogram, heat map, line plot, scatter plot
• Identify outliers, general trends, and correlations
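
The exploration goals above (describing the data, spotting outliers, and checking correlations) can be sketched in a few lines of standard-library Python. This is only an illustrative sketch: the function names, the sample heart-rate and blood-pressure values, and the choice of the 1.5 × IQR rule for outliers are assumptions, not part of the lecture.

```python
import statistics

def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }

def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (an assumed, common rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical values, invented for illustration only.
heart_rate = [72, 75, 71, 78, 74, 200, 73, 76]
age = [30, 40, 50, 60]
systolic = [115, 121, 128, 135]

print(describe(heart_rate))
print("outliers:", iqr_outliers(heart_rate))
print("age vs systolic correlation:", pearson(age, systolic))
```

In practice a plotting library would be used for the histogram, heat map, line plot, and scatter plot; the numeric summaries here are a starting point before visualizing.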
Time Series Def.
• A time series is a sequential set of data points, typically measured
over successive times.
• A time series containing a single variable’s values is termed
univariate. If values of more than one variable are considered, it
is termed multivariate.
• A time series can be continuous or discrete.
• Continuous time series: observations are measured at every instant
of time. Examples include temperature readings, river flow, and rate
of illness spread.
• Discrete time series: observations are measured at discrete points in
time. Examples include city population, company production,
exchange rates, number of patients, and number of required beds.
Time Series Components:
• Trend component: overall, persistent, long-term movement, up or down,
linear or non-linear.
• Seasonal component: regular periodic fluctuations; short-term, regular
wave-like patterns.
• Cyclical component: repeating swings or movements; cycle length (years,
months, days) measured peak to peak.
• Irregular / random component: erratic or residual fluctuations.
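
As a rough illustration of how the components listed above can be separated, the sketch below performs a simple additive decomposition: a centered moving average estimates the trend, per-phase averages of the detrended values estimate the seasonal component, and whatever remains is treated as the irregular component. The function names, the restriction to an odd seasonal period, and the synthetic series are assumptions made for this example; the cyclical component is not separated out here.

```python
def moving_average_trend(series, window):
    """Centered moving average; None where the window does not fit.

    Assumes an odd window so the average is centered on a data point.
    """
    if window % 2 == 0:
        raise ValueError("this sketch only supports an odd window")
    half = window // 2
    trend = [None] * len(series)
    for i in range(half, len(series) - half):
        trend[i] = sum(series[i - half:i + half + 1]) / window
    return trend

def decompose_additive(series, period):
    """Split a series into trend, seasonal, and residual parts (additive model)."""
    trend = moving_average_trend(series, period)
    detrended = [None if t is None else x - t for x, t in zip(series, trend)]
    # Average the detrended values at each position within the period.
    seasonal_means = []
    for phase in range(period):
        vals = [d for i, d in enumerate(detrended)
                if d is not None and i % period == phase]
        seasonal_means.append(sum(vals) / len(vals))
    seasonal = [seasonal_means[i % period] for i in range(len(series))]
    residual = [None if t is None else x - t - s
                for x, t, s in zip(series, trend, seasonal)]
    return trend, seasonal, residual

# Synthetic series: a linear trend plus a repeating pattern of length 3.
pattern = [1, 0, -1]
series = [i + pattern[i % 3] for i in range(9)]
trend, seasonal, residual = decompose_additive(series, period=3)
print(trend)
print(seasonal)
print(residual)
```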

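Data Trend:

• Mann-Kendall trend test: the sign of tau (+/-) gives the direction of the
trend, and the p-value gives its significance.
• Pettitt test: identifies the point of change.
• Spearman’s Rho test: measures the correlation between the time order and
the data values.
• Theil-Sen’s slope: determines the magnitude of the trend (rate of
change) of the climatic variable, expressed as a percentage of the mean:

\[
\frac{\text{slope} \times \text{number of years} \times 100}{\text{mean}}
\]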
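
To make the trend measures above concrete, here is a minimal sketch of the Mann-Kendall statistic and the Theil-Sen slope in plain Python. It assumes equally spaced observations (one per year), no tied values, and a normal approximation for the p-value, which is rough for short series. The function names and the yearly admission counts are hypothetical.

```python
import math
from statistics import mean, median

def mann_kendall(series):
    """Return (tau, two-sided p-value) without any correction for ties."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of the pairwise difference
    tau = s / (n * (n - 1) / 2)
    var_s = n * (n - 1) * (2 * n + 5) / 18  # variance of S, assuming no ties
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2))  # normal approximation
    return tau, p_value

def theil_sen_slope(series):
    """Median of all pairwise slopes, assuming unit spacing in time."""
    n = len(series)
    slopes = [(series[j] - series[i]) / (j - i)
              for i in range(n - 1) for j in range(i + 1, n)]
    return median(slopes)

def percent_change(slope, n_years, mean_value):
    """Trend magnitude over the period as a percentage of the mean."""
    return slope * n_years * 100 / mean_value

# Hypothetical yearly admission counts, invented for illustration.
admissions = [120, 132, 128, 141, 150, 149, 163, 170]
tau, p = mann_kendall(admissions)
slope = theil_sen_slope(admissions)
print(tau, p, slope)
print(percent_change(slope, len(admissions), mean(admissions)))
```

A positive tau with a small p-value suggests an increasing trend; the Pettitt and Spearman tests are not covered by this sketch.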
Clinical Data Handling: Handling missing data

• Deletion: If the amount of missing data is relatively small, the rows or
columns containing missing values may be removed.
• Mean/mode/median imputation: Missing values are replaced with the mean,
mode, or median value of the corresponding feature.
• Forward/backward fill: Also known as “last observation carried forward” or
“next observation carried backward,” this method fills missing values with
the previous or subsequent non-missing value in the dataset. It is commonly
used in time series data, where missing values are expected to resemble
neighboring observations.
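
The three strategies above can be sketched on a single feature as follows. Missing values are represented as `None`, and the function names and temperature readings are assumptions for illustration only.

```python
from statistics import mean

def drop_missing(values):
    """Deletion: keep only the observed values."""
    return [v for v in values if v is not None]

def mean_impute(values):
    """Replace each missing value with the mean of the observed values."""
    m = mean(drop_missing(values))
    return [m if v is None else v for v in values]

def forward_fill(values):
    """Carry the last observed value forward; leading gaps stay None."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

def backward_fill(values):
    """Carry the next observed value backward; trailing gaps stay None."""
    return forward_fill(values[::-1])[::-1]

# Hypothetical body-temperature readings with gaps.
temps = [36.6, None, 37.0, None, None, 38.2]
print(drop_missing(temps))
print(mean_impute(temps))
print(forward_fill(temps))
print(backward_fill(temps))
```

Median or mode imputation follows the same pattern as `mean_impute`, with `statistics.median` or `statistics.mode` in place of `mean`.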
Clinical Data Handling: Handling missing data

• Interpolation: Interpolation methods estimate missing values based on the
values of neighboring data points. Common interpolation techniques include
linear interpolation, polynomial interpolation, and spline interpolation.
• Multiple imputation: Multiple imputation generates several plausible
values for each missing data point based on the distribution of the observed
data. The missing values are replaced with these imputed values, and the
analysis is performed multiple times, once on each imputed dataset. This
method accounts for uncertainty in imputation and produces more robust
results.
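
The sketch below illustrates both ideas under simplifying assumptions. `linear_interpolate` fills interior gaps by drawing a straight line between the nearest observed neighbors. `multiple_imputation_mean` is a deliberately simplified stand-in for multiple imputation: it draws each missing value from a normal distribution fitted to the observed data, repeats this `m` times, and averages the resulting estimates of the mean. Real multiple-imputation procedures usually use a predictive model and also pool the variance; all names and data here are hypothetical.

```python
import random
from statistics import mean, stdev

def linear_interpolate(values):
    """Fill interior None gaps linearly by index; edge gaps are left as None."""
    result = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            result[i] = values[left] + step * (i - left)
    return result

def multiple_imputation_mean(values, m=5, seed=0):
    """Simplified multiple imputation for estimating a mean.

    Each missing value is drawn from Normal(mean, sd) of the observed data
    (an assumption of this sketch). The analysis (here, the mean) is run on
    each of the m completed datasets and the results are averaged.
    """
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    mu, sigma = mean(observed), stdev(observed)
    estimates = []
    for _ in range(m):
        completed = [rng.gauss(mu, sigma) if v is None else v for v in values]
        estimates.append(mean(completed))
    return mean(estimates), estimates

# Hypothetical lab values with gaps.
print(linear_interpolate([10, None, None, 16, None, 20]))
pooled, estimates = multiple_imputation_mean([10, None, 14, 16, None, 20])
print(pooled, estimates)
```

The spread of `estimates` across the imputed datasets reflects the uncertainty introduced by the missing values, which single imputation hides.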
Clinical Data Handling: Handling imbalanced data
• In certain applications, datasets may be imbalanced, meaning that one class
or category is significantly more prevalent than others. A common way to
address imbalanced data is random resampling.
• Oversampling: generate new samples for the under-represented class.
• Undersampling: remove samples from the over-represented class.
• Oversampling or undersampling balances the representation of different
classes and helps prevent bias in model training and evaluation.
• For example, if a dataset contains only a few records of patients with a
rare disease, oversampling can be used to generate additional samples for
that class.
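
A minimal sketch of random resampling is shown below. Note that random oversampling, as written here, duplicates existing minority-class records rather than synthesizing genuinely new ones; synthetic approaches exist but are outside this sketch. The function names, labels, and record IDs are hypothetical.

```python
import random
from collections import Counter

def random_oversample(records, labels, seed=0):
    """Duplicate randomly chosen minority records until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_records, out_labels = list(records), list(labels)
    for cls, count in counts.items():
        pool = [r for r, l in zip(records, labels) if l == cls]
        for _ in range(target - count):
            out_records.append(rng.choice(pool))
            out_labels.append(cls)
    return out_records, out_labels

def random_undersample(records, labels, seed=0):
    """Randomly keep only as many records per class as the smallest class has."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = min(counts.values())
    out_records, out_labels = [], []
    for cls in counts:
        pool = [r for r, l in zip(records, labels) if l == cls]
        for r in rng.sample(pool, target):
            out_records.append(r)
            out_labels.append(cls)
    return out_records, out_labels

# Hypothetical dataset: 8 healthy patients and 2 with a rare disease.
records = list(range(10))
labels = ["healthy"] * 8 + ["rare"] * 2

_, over_labels = random_oversample(records, labels)
_, under_labels = random_undersample(records, labels)
print(Counter(over_labels))
print(Counter(under_labels))
```

Resampling is normally applied to the training split only, so that evaluation still reflects the real class distribution.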
