
Skevas et al. BMC Ophthalmology (2024) 24:51
https://doi.org/10.1186/s12886-024-03306-y

RESEARCH  Open Access

Implementing and evaluating a fully functional AI-enabled model for chronic eye disease screening in a real clinical environment

Christos Skevas1, Nicolás Pérez de Olaguer2, Albert Lleó2, David Thiwa3, Ulrike Schroeter1, Inês Valente Lopes1*, Luca Mautone1, Stephan J. Linke4, Martin Stephan Spitzer1, Daniel Yap5 and Di Xiao6

Abstract
Background Artificial intelligence (AI) has the potential to increase the affordability and accessibility of eye disease
screening, especially with the recent approval of AI-based diabetic retinopathy (DR) screening programs in several
countries.
Methods This study investigated the performance, feasibility, and user experience of a seamless hardware and
software solution for screening chronic eye diseases in a real-world clinical environment in Germany. The solution
integrated AI grading for DR, age-related macular degeneration (AMD), and glaucoma, along with specialist auditing
and patient referral decision. The study comprised several components: (1) evaluating the entire system solution
from recruitment to eye image capture and AI grading for DR, AMD, and glaucoma; (2) comparing specialist’s grading
results with AI grading results; (3) gathering user feedback on the solution.
Results A total of 231 patients were recruited, and their consent forms were obtained. The sensitivity, specificity,
and area under the curve for DR grading were 100.00%, 80.10%, and 90.00%, respectively. For AMD grading, the
values were 90.91%, 78.79%, and 85.00%, and for glaucoma grading, the values were 93.26%, 76.76%, and 85.00%.
The analysis of all false positive cases across the three diseases and their comparison with the final referral decisions
revealed that only 17 patients were falsely referred among the 231 patients. The efficacy analysis of the system
demonstrated the effectiveness of the AI grading process in the study’s testing environment. Clinical staff involved
in using the system provided positive feedback on the disease screening process, particularly praising the seamless
workflow from patient registration to image transmission and obtaining the final result. Results from a questionnaire
completed by 12 participants indicated that most found the system easy, quick, and highly satisfactory. The study also
revealed room for improvement in the AMD model, suggesting the need to enhance its training data. Furthermore,
the performance of the glaucoma model grading could be improved by incorporating additional measures such as
intraocular pressure.

*Correspondence:
Inês Valente Lopes
[email protected]
Full list of author information is available at the end of the article

© The Author(s) 2024. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use,
sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and
the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this
article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included
in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will
need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The
Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available
in this article, unless otherwise stated in a credit line to the data.

Conclusions The implementation of the AI-based approach for screening three chronic eye diseases proved effective
in real-world settings, earning positive feedback on the usability of the integrated platform from both the screening
staff and auditors. The auditing function has proven valuable for obtaining efficient second opinions from experts,
pointing to its potential for enhancing remote screening capabilities.
Trial registration Institutional Review Board of the Hamburg Medical Chamber (Ethik-Kommission der Ärztekammer
Hamburg): 2021-10574-BO-ff.
Keywords Artificial intelligence, Telemedicine, AMD, Glaucoma, Diabetic retinopathy, Screening, Digital color fundus
imaging

Background
As the global prevalence of chronic eye diseases such as diabetic retinopathy (DR), glaucoma, and age-related macular degeneration (AMD) continues to rise, early detection and management of these conditions are increasingly critical. Traditionally, the screening and diagnosis of these diseases have relied heavily on manual inspection and interpretation of retinal images by ophthalmologists. However, this process is not only time-consuming and resource-intensive but also susceptible to inter-observer variability and human error.

In recent years, artificial intelligence (AI), and more specifically deep learning, has emerged as a powerful tool to revolutionize the field of ophthalmology. AI algorithms have demonstrated high performance in the automated grading of the severity of DR, AMD, and glaucoma using retinal images. These advancements have shown the potential not only to enhance diagnostic accuracy and efficiency but also to reduce the burden on healthcare systems and improve patient outcomes.

Recent advances in AI have revolutionized the field of DR grading using retinal images. Early publications in 2016 by Gulshan et al. showed the effectiveness of a deep learning algorithm for detecting referable DR from colour fundus photographs with high sensitivity and specificity. Subsequent studies by Abràmoff et al. (2018), Ting et al. (2017), and Gargeya and Leng (2017) demonstrated similar performance [1–4]. In a recent study, Li et al. (2022) developed a deep ensemble algorithm capable of detecting both diabetic retinopathy (DR) and diabetic macular edema (DME) [5], which exhibited performance comparable to, or even surpassing, that of ophthalmologists. Several recent review papers by Sebastian (2023), Tsiknakis (2021), and Dubey (2023) offer comprehensive insights into this rapidly evolving area and can be referenced for further exploration [6–8].

AI has also shown promising results in detecting and classifying AMD severity from retinal images. Burlina et al. (2018), Ting et al. (2017), and Peng et al. (2019) showed that deep-learning models could achieve higher accuracy in the automated classification of patient-based AMD severity using bilateral colour fundus photographs and outperformed retinal specialists [3, 9, 10]. Several recent review papers in this area can be referenced, including those by Leng et al. (2023), Paul et al. (2022), and Dong et al. (2021) on AI for AMD screening using colour fundus images or OCT images [11–13].

For glaucoma detection, the early AI models were primarily focused on analysing various features such as the optic disc and cup-to-disc ratio [14, 15]. Further studies have explored the use of retinal vessel segmentation and texture analysis to improve the performance of AI-based glaucoma detection systems [16, 17]. In a systematic review and meta-analysis conducted by Buisson et al. (2021), deep learning models demonstrated similar performance to ophthalmologists in diagnosing glaucoma from fundus examinations [18]. Reviews by Atalie (2020) and Yousefi (2023) discuss AI's potential to improve diagnostic capabilities but also point out its challenges and the need for careful validation in clinical practice [19, 20].

Despite the promising research and developments, translation into real-world clinical practice remains a challenging endeavour. To date, there has been a gradual deployment of AI models in software systems for DR screening over the past six years. Several AI-based screening systems, such as IDx-DR, Thirona Retina, Retmarker, EyeArt, iGradingM, Eyetelligence Assure, Retinalyze, TeleEye MD, Airdoc-AIFUNDUS, and SELENA+, have been published and deployed in various clinical settings.

IDx-DR was validated on 900 patients in primary care sites in the USA, achieving a sensitivity of 87.2% and a specificity of 90.7% on the 819 participants that were analysable [1]. Based on these results, it gained FDA clearance for use by healthcare providers for automatically detecting more than mild diabetic retinopathy (mtmDR). IDx-DR was also validated in the Hoorn Diabetes Care System in the Netherlands, achieving a sensitivity of 91% and a specificity of 84% for detecting referable DR [21]. In a pilot study, IDx-DR showed a higher percentage agreement with human ophthalmologists in both DR-positive and DR-negative cases, suggesting it may be more reliable for autonomous screening [22].

EyeArt was validated in more than 30,000 patients in the English Diabetic Eye Screening Programme in

the UK, achieving a sensitivity of 95.7% and a specificity of 54.0% for referable retinopathy [23]. In another validation in the USA, it achieved similar sensitivities and specificities for detecting both mtmDR (95.5% sensitivity and 85.3% specificity) and vision-threatening diabetic retinopathy (VTDR; 95.2% sensitivity and 89.5% specificity) on 893 patients [24]. SELENA+ was validated on 1574 patients in a mobile screening program in Zambia, achieving a sensitivity of 92.25% and a specificity of 89.04% for referable DR, a sensitivity of 99.42% for VTDR, and a sensitivity of 97.19% for DME [25]. In a small-cohort pilot study on tele-ophthalmology, it demonstrated 100% referral accuracy for 9 known diabetic retinopathy patients among 69 validation patients [26]. An offline AI-based DR grading system was validated on populations (236 participants) from two endocrinology outpatient clinics and three Aboriginal Medical Service clinics in Australia, achieving a sensitivity of 96.9% and a specificity of 87.7% for detecting referable DR [27]. The VoxelCloud Retina was validated on 15,805 patients at 155 diabetes centres in China, achieving an 83.3% sensitivity and a 92.5% specificity for detecting referable DR [28]. In a recent study, RAIDS, an AI system that can detect seven eye conditions, including common abnormalities like DR, ARMD, and glaucoma, was validated in real-world clinical settings [29]. The system achieved sensitivities and specificities for DR, ARMD, and referable glaucoma of 83.7% and 88.1%, 81.3% and 98.6%, and 97.6% and 95.0%, respectively.

While the abovementioned systems have been validated and approved for clinical use, there is limited information available regarding their real-world performance and acceptance among end-users, indicating a need for further research in this area. Only the study of the offline AI-based DR grading system [27] investigated end-user experience and acceptance, making it the first AI system for which both an accuracy analysis and an end-user experience analysis were completed. Discussion of the influence of socio-environmental factors on deep learning model performance has been relatively scarce. A noteworthy contribution to this discourse comes from Beede et al. at Google Health, who conducted a human-centered observational study on a deep learning system in clinical care in Thailand [30]. Their research highlighted the impact of end-users and environmental factors, including lighting conditions, patient expenses, and model threshold settings. It underscores the urgency of developing methodologies for designing and evaluating deep learning systems in clinical settings, emphasizing collaboration with the Human-Computer Interaction (HCI) community [31, 32].

To address these gaps, this study aims to implement and evaluate a fully-integrated hardware and software solution and an automatic workflow in a real clinical environment. By evaluating the system's performance, diagnostic performance, and user experience, we hope to shed light on the feasibility, acceptability, and accuracy of AI-assisted chronic eye disease screening. Furthermore, we aim to understand the challenges and opportunities associated with the real-world deployment of AI in ophthalmic disease screening and management. To the best of our knowledge, this study represents one of the pioneering attempts to incorporate screening for the three chronic diseases within an integrated retinal imaging and AI-grading system supported by a cloud solution.

Methods
The study uses a cloud-based tele-ophthalmological platform (TeleEye MD) combined with a retinal camera (DRS Plus, Icare Finland Oy, Finland) and a data transmission device. The study focused on the following points:

1) Workflow for chronic eye disease screening, from patient retinal image capture to report generation;
2) AI-assisted grading for DR, AMD and glaucoma;
3) Human grader's audit based on the AI grading results;
4) Feedback from patients and health professionals;
5) System efficacy.

Participants
The patients included in this prospective study were recruited at the Ophthalmology Outpatient Department of medical retina and glaucoma of the University Medical Center Hamburg-Eppendorf, Germany. The medical staff approached the patients and explained the goals of the study and the examinations that had to be performed. After informed consent was given by all patients, a trained study nurse (US), who was hired to support the study, performed patient examinations using the screening system. All patients recruited in the study completed the necessary examinations. The study patients shared the same premises and the same hardware as all other patients. Recruitment occurred from December 2021 to October 2022.

Ethics and inclusion criteria
This prospective study was registered and approved by the Ethics Review Board of the Medical Association of Hamburg (process number: 2021-10574-BO-ff) and follows the recommendations of the Declaration of Helsinki. The patients could withdraw from the

study at any time by informing the supervisors. Inclusion criteria were an age of at least 18 years and eyes in which clear media allowed a sharp fundus photo.

Screening workflow
Patients underwent an ocular examination utilizing a digital colour fundus imaging device. The process was facilitated by a study nurse using a patient registration and data transmission tool - the bridging device. To maintain confidentiality, unique project IDs were assigned to each patient's data, which were then transferred to the cloud-based system. This system, powered by an integrated AI, graded the patients' colour fundus images for DR, AMD and glaucoma. These AI-graded images were subsequently managed by the cloud system and subjected to audit by the study's specialist (CS).

The patients performed the following examinations: objective refraction, non-contact eye pressure measurement, best corrected visual acuity (BCVA) and, after pupillary dilation, fundus photography of the retina (optic disc, macula and retinal periphery) and anterior chamber photos focused on the lens.

Figure 1 illustrates the hardware and its configuration used in the workflow.

Fig. 1 Hardware configuration for the screening service

Fig. 2 Screening workflow and software configuration

Figure 2 depicts the workflow of the screening process. It consists of the following steps:

1. The study nurse inputs the recruited patient's information, including the patient's project ID, ethnicity, age and gender, on the bridging device and transfers it to the cloud system.
2. The study nurse guides the patient into position for eye image capture using the DRS Plus camera. Once the patient is properly positioned, the study nurse activates the camera and captures an image of the patient's retina. One macula-centered image (45-degree field of view) per eye is captured. The image is instantly transmitted to the bridging device. The bridging device uses its cloud service to analyze the image and provides a quality assessment (QA) score within seconds. If the QA score is "inadequate", the bridging device notifies the study nurse on the screen, prompting them to recapture the image.
3. After the QA process, the study nurse selects one or two macula-centered images from each eye on the bridging device and submits them to the patient's cloud account for DR, AMD, and glaucoma grading by the AI.
4. An auditor with an auditor account (i.e., an ophthalmologist; here CS) can log into the cloud system, view the patient's colour fundus images from both eyes at their raw resolutions, and select human grading options for DR, AMD, and glaucoma. Then, a final report can be generated.
5. The study nurse has access to the cloud web portal to check patients' report readiness. Once the final report is ready, the study nurse can download the report from the platform.

Disease grading protocol
This study's disease grading protocols were established on the foundations of the International Clinical Diabetic Retinopathy Disease Severity Scale and the International Clinical Diabetic Macular Edema Disease Severity Scale, used for DR and DME respectively [33]. DR grading ranged from 'No Apparent Retinopathy' through 'Mild Non-Proliferative DR (NPDR)', 'Moderate NPDR' and 'Severe NPDR' to 'Proliferative DR'. DME grading was categorized as 'Diabetic Macular Edema Absent', progressing through 'Mild', 'Moderate', and 'Severe' DME. In the case of AMD, the grading commenced from 'No Apparent AMD', escalating to 'Early', 'Intermediate', and 'Advanced' AMD. Glaucoma grading was simplified to 'Referable' (suspect glaucoma) or 'Non-Referable' based solely on image analysis of the optic nerve head; no further analysis, e.g. IOP (intraocular pressure) measurement or nerve fibre layer analysis, was performed. Lens opacity was evaluated as 'Normal', 'Non-Significant Media Opacity', and 'Significant Media Opacity'. During the auditing process, the auditor could select the appropriate grading options for DR, AMD, and glaucoma levels, as well as lens opacity status, and the grading results would be reflected in the final report.

In contrast, the AI grading process only provided referable or non-referable results for DR, AMD, and glaucoma, where "referable DR" indicated more than 'Mild NPDR' and "referable AMD" indicated any condition more severe than 'Early AMD'. If the configuration of the optic nerve head was suspect, referral to an ophthalmology specialist was advised. If an image was judged ungradable by the AI, the patient was considered referred.

Automatic AI grading algorithms and platform settings
The DR grading method was described in an earlier paper, with subsequent improvements made afterwards [34]. The AMD grading model was based on an EfficientNet deep learning backbone with customized classification layers; it was trained and validated on 4,218 images and achieved a specificity of 95.23% and a sensitivity of 98.14%. The glaucoma grading algorithm was likewise based on an EfficientNet backbone with customized classification layers; 32,828 images were used for the model's training and validation, and it achieved a sensitivity of 91.48% and a specificity of 92.94%.

The three AI models were integrated into the cloud system using the Lambda service approach provided by Amazon AWS. Besides AI grading and human auditing, the cloud system also provides clinic management and patient health data management functions.

System testing and staff training
Prior to the patient recruitment process, the engineering team meticulously tested both the hardware configuration and the comprehensive screening workflow. Two demonstrations were conducted by the engineering team for the clinical staff in the study. The screening organization account and the user accounts for screening study nurses, managers and auditors were created, and the hardware and software user manuals and training materials were provided. The training for the clinical staff was provided by two trainers (NO & AL). The training covered, for the imaging study nurse, how to use the bridging device for patient registration, how to use the DRS Plus camera for image capture, and how to check image quality and submit exams. The auditor (CS) was trained on how to log in and use the auditing page in the cloud system. Grading protocols were discussed and agreed upon. Following the training, the staff's use of the system was overseen until they demonstrated independent capability in its operation.

The study nurse performing the examination was also trained to explain the project information sheet and patient consent form to recruited patients.
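The capture, QA, and submission loop in steps 2 and 3 of the screening workflow can be sketched in a few lines of Python. This is a minimal illustration only: the classes and method names (run_capture_workflow, FakeCamera, FakeBridge) are hypothetical stand-ins, not the real DRS Plus or bridging-device API.

```python
# Sketch of the capture -> QA -> submit loop from steps 2-3 of the workflow.
# All class and method names are hypothetical illustrations, not the real
# TeleEye MD / bridging-device API.

def run_capture_workflow(camera, bridge, max_retries=3):
    """Capture one macula-centered image per eye, retrying while QA is inadequate."""
    accepted = {}
    for eye in ("right", "left"):
        accepted[eye] = None
        for _ in range(max_retries):
            image = camera.capture(eye)              # step 2: camera capture
            if bridge.assess_quality(image) != "inadequate":
                accepted[eye] = image                # QA passed
                break
            # otherwise the nurse is prompted on-screen to recapture
    gradable = {eye: img for eye, img in accepted.items() if img is not None}
    return bridge.submit_for_grading(gradable)       # step 3: request AI grading


class FakeCamera:
    """Stand-in camera: the first capture of the right eye is blurry."""
    def __init__(self):
        self.shots = 0

    def capture(self, eye):
        self.shots += 1
        return {"eye": eye, "blurry": eye == "right" and self.shots == 1}


class FakeBridge:
    """Stand-in bridging device with a trivial QA rule."""
    def assess_quality(self, image):
        return "inadequate" if image["blurry"] else "adequate"

    def submit_for_grading(self, images):
        return {eye: "submitted" for eye in images}
```

In the real system the QA verdict comes back from the cloud service within seconds, and the nurse, not a retry counter, decides when to recapture.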

Patient questionnaire forms were provided to patients to collect feedback on the AI screening service.

Data analysis
System efficacy analysis
The efficacy of the AI-based eye disease screening system was analyzed by addressing several critical aspects. Firstly, real-time QA was evaluated during the image acquisition process. Secondly, the AI grading procedure was assessed: the images were analyzed using cloud computing to produce grades for the three diseases, which were then shown on the bridging device, and the promptness with which these results were generated and made available to the clinicians was analyzed. The final aspect evaluated was the time required for a human auditor to review a single case.

Disease grading analysis
All de-identified data were exported from the AWS cloud server and imported into an Excel spreadsheet for further analysis.

The exported data include three main parts: (1) general patient information; (2) AI grading results: AI-based gradings for left/right eye identification, image QA, and finally the three diseases AMD, DR and glaucoma; (3) auditing results: the auditor's gradings for DR level, DME level, AMD level, glaucoma normal/suspect, image ungradable/gradable, and lens opacity, plus the auditor's comment, follow-up screening period, and ophthalmologist referral period.

Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and AUC were calculated to evaluate the grading accuracy for the three diseases.

Participant feedback analysis
Following the acquisition of images and the initial grading by the AI, patients were invited to participate in a comprehensive survey aimed at capturing their experiences and attitudes towards the AI-based eye disease screening system. To gather this valuable feedback, a questionnaire consisting of four questions was administered (Table 1).

Table 1 Questions in the questionnaire form for participants
Question 1: How satisfied were you with your experience with the AI-based eye disease screening system? Please explain the reason you gave the score.
  Selections*: (1) Very dissatisfied; (2) Dissatisfied; (3) Neutral; (4) Satisfied; (5) Very satisfied
Question 2: Were you satisfied with the time consumption for completing your eye disease screening? Please explain the reason you gave the score.
  Selections*: (1) Very dissatisfied; (2) Dissatisfied; (3) Neutral; (4) Satisfied; (5) Very satisfied
Question 3: Did you use the web portal's mobile app for your registration? If "Yes", was it simple to use the app? Please explain the reason you gave the score. (Note: this question is not applicable for the study.)
  Selections*: (1) So difficult; (2) Difficult; (3) Neutral; (4) Easy; (5) So easy
Question 4: Considering your complete experience with our medical facility, how likely would you be to recommend us to a friend or colleague? Please explain the reason you gave the score.
  Selections*: (1) Not at all; (2) Not recommendable; (3) Neutral; (4) Recommendable; (5) Very recommendable
Free comments and suggestions.
* For each question, patients were instructed to select only one option from the five provided, with each option assigned a score ranging from 1 (lowest rating) to 5 (highest rating) based on the corresponding item number.

Results
A total of 231 patients were recruited to the study, with 125 being male and the remaining 106 being female, a gender ratio of approximately 1.18 males to every female.

The ages in the dataset had a mean of approximately 63 years, with a standard deviation of 16.9, indicating that the ages are fairly spread out. The youngest individual in the dataset is 19 years old, while the oldest is 95. The median age is 66, and the interquartile range (IQR) is from 54.5 to 76.0 years. It is worth noting that all 231 patients in the dataset self-identified as Caucasian.

Gradable patients and images
Out of the 231 patients initially assessed, four cases had both eyes ungradable according to the auditor, a finding with which the AI agreed. Interestingly, in one patient the AI determined both eyes to be ungradable, but this assessment was not shared by the auditor. Among the remaining patients, eight had a single ungradable eye according to the auditor, and of these, the AI agreed with six of the assessments. On the other hand, the AI determined that 11 patients had a single ungradable eye, but the grader disagreed with this assessment. These findings suggest that there may be discrepancies between the AI and human graders in assessing ungradable eyes. The ungradable images include situations such as "image half blurry", "image half dark" and "significant media opacity" (Fig. 3).

Disease distribution
In the auditing process, our online auditing page enabled the auditor to choose the 'ungradable' option for each of the three diseases independently.

Among the 231 patients assessed by the auditor, 27 patients were identified as referable DR patients, 33 patients were identified as referable AMD patients, and 89 patients were classified as referable (suspect) glaucoma cases.

In terms of ophthalmologist referrals, as determined by the auditor, 65 patients only needed to undergo regular screening without referrals.
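The patient-level referral rule described in the Methods, where any referable AI grade, or an image judged ungradable, counts as a referral, can be sketched as follows. The function and field names are illustrative, not the platform's actual data model.

```python
# Minimal sketch of the patient-level referral rule from the Methods:
# the AI returns referable / non-referable per disease, and an ungradable
# image is treated as a referral. Field names are illustrative, not the
# platform's actual data model.

DISEASES = ("DR", "AMD", "glaucoma")

def needs_referral(ai_result):
    """True if one patient's AI output warrants an ophthalmologist referral."""
    if ai_result.get("ungradable"):
        return True                      # ungradable image -> considered referred
    return any(ai_result.get(d) == "referable" for d in DISEASES)
```

Because the three diseases are combined with a logical OR, a false positive on any single disease is enough to refer a patient, which is why the study analyses false positives across all three models jointly.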

Fig. 3 Examples of ungradable images

However, 149 patients needed to be referred to ophthalmologists within 3 months, 12 patients required appointments within 4 weeks, and 5 patients needed urgent referrals within 1 week.

AI grading performance
The performance of the three models for grading DR, AMD, and glaucoma was compared with the auditor's assessment results. In the context of disease screening and patient referral, the evaluation of the three models was conducted at the patient level. The initial step in the workflow involved left/right eye identification based on colour fundus images. Given that these images are macula-centred and readily distinguishable, the left/right eye identification model achieved 100% accuracy. When calculating sensitivity and specificity, any eyes deemed 'ungradable' by the auditor for a specific disease grading were categorized as 'referable' cases for that disease. Conversely, if an image or eye was assessed as 'ungradable' by the QA model but was, in fact, considered gradable and normal by a human, it was counted as a false positive case. It is important to note that the PPV and NPV can be influenced by the disease prevalence value used. Considering that the sensitivity values for all three disease gradings are above 90%, with only a few misclassified cases, the results are presented in Tables 2, 3 and 4. Table 5 presents the false positive classifications under each disease and their "ophthalmologist referral" decisions from the auditor's assessment based on other abnormal findings. The data contributing to these misclassifications and their final referral decisions will be discussed in the Discussion section.

Table 2 DR AI grading accuracy summary
Statistic                    Value      95% CI
Sensitivity                  100.00%    87.23–100.00%
Specificity                  80.10%     73.98–85.32%
Positive Predictive Value*   37.33%     25.90–49.91%
Negative Predictive Value*   100.00%    97.79–100.00%
AUC                          0.90       0.87–0.93
* These values depend on the German disease prevalence of 10.60%

Table 3 AMD AI grading accuracy summary
Statistic                    Value      95% CI
Sensitivity                  90.91%     75.67–98.08%
Specificity                  78.79%     72.43–84.26%
Positive Predictive Value*   29.26%     19.12–41.16%
Negative Predictive Value*   98.90%     95.78–99.89%
AUC                          0.85       0.79–0.90
* These values depend on the German disease prevalence of 8.80%

Table 4 Glaucoma AI grading accuracy summary
Statistic                    Value      95% CI
Sensitivity                  93.26%     85.90–97.49%
Specificity                  76.76%     68.94–83.43%
Positive Predictive Value*   7.50%      3.43–13.89%
Negative Predictive Value*   99.82%     96.49–100.00%
AUC                          0.85       0.81–0.89
* These values depend on the German disease prevalence of 1.98%

Table 5 Ophthalmologist referral decided by the auditor for the patients under the false positive cases (number of patients per cell)
Period            False Positive DR    False Positive AMD    False Positive Glaucoma
Within 1 week     0                    1                     1
Within 4 weeks    4                    3                     3
Within 3 months   33                   33                    17
Not Required      4                    5                     12
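The starred PPV and NPV rows in Tables 2, 3 and 4 follow from sensitivity, specificity, and the stated German prevalence via Bayes' rule. The short re-derivation below is ours, not the authors' code, and reproduces the AMD values of Table 3.

```python
# Prevalence-adjusted PPV/NPV via Bayes' rule, as used for the starred
# rows in Tables 2-4. Illustrative re-derivation, not the authors' code.

def ppv_npv(sensitivity, specificity, prevalence):
    """Predictive values per screened patient at a given disease prevalence."""
    tp = sensitivity * prevalence              # true positive fraction
    fp = (1 - specificity) * (1 - prevalence)  # false positive fraction
    tn = specificity * (1 - prevalence)        # true negative fraction
    fn = (1 - sensitivity) * prevalence        # false negative fraction
    return tp / (tp + fp), tn / (tn + fn)

# AMD model (Table 3): sensitivity 90.91%, specificity 78.79%, prevalence 8.80%
ppv, npv = ppv_npv(0.9091, 0.7879, 0.088)
# ppv ≈ 0.2926 (29.26%) and npv ≈ 0.9890 (98.90%), matching Table 3
```

The same formula explains the low 7.50% PPV for glaucoma: at a prevalence of only 1.98%, even a specificity of 76.76% yields far more false positives than true positives.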

System efficacy
Table 6 presents the performance metrics of the four independent AWS Lambda services used in this study. The QA Lambda service is employed during the image quality assessment step, while the remaining three Lambda services are concurrently triggered during the image grading step, enabling simultaneous execution. The average process time, measured after the services' warm-up periods, represents the duration required for each Lambda service to complete its execution. Additionally, the memory allocation for each service is determined based on the structural characteristics of the deep learning models employed.

Table 6 Processing time of the deep learning models in the AWS cloud
AWS Lambda Service    Process Time (ms)*    Memory allocation (MB)
QA                    1000                  1536
DR grading            2000                  1536
AMD grading           1500                  1024
Glaucoma grading      4000                  4096
* Average process time of the warmed-up Lambda services

Based on user observations and experience, several other important parameters also reflect the efficacy of the system:

•  Registering a patient takes an average of 45 s.
•  After capturing an image, the QA result is typically available in 6 s.
•  When submitting two images to obtain results, the process takes an average of 46 s for uploading and grading; grading alone takes an average of 30 s.
•  Auditing one case takes the auditor less than 5 min.

Participant feedback
After completing the examinations for the study, patients were kindly asked to give us feedback by completing a

the five comments received in response to this question all emphasized the speed of the process, with phrases such as "Quick" or "Fast." All 12 respondents rated their "likelihood of recommending the facility" (question 4) as 4 or higher, resulting in an average rating of 4.33. Only two participants provided comments regarding their impressions of the service. Both expressed a positive sentiment and referred to the intervention as "innovative." The low number of patients who filled out the questionnaire can be attributed to fatigue after spending a number of hours in the clinic.

Discussion
In Germany, the scope of patient screening for chronic eye disease within the general health insurance system is currently limited to screening for DR in patients with confirmed diabetes mellitus. This process requires a formal referral from a general practitioner to an ophthalmologist, who then carries out the examination. Despite these measures, it is noteworthy that only an average of 50% of diabetic patients adhere to the recommended ophthalmologist visits [35]. The screening for diabetic retinopathy is a solitary process, with no other healthcare professionals involved. The role of fundus photography grading, whether by professional graders or AI, is yet to be recognized in the German healthcare system [36, 37].

This study established a clinical setting wherein a healthcare provider could perform chronic eye disease screening using a combined hardware and software solution. Additionally, the study explored methods for enabling remote screening audits involving specialists. To the best of our knowledge, this study represents one of the pioneering attempts to incorporate screening for DR, AMD, and glaucoma within a single system.

In terms of the measured data, the DR model demonstrated remarkable performance by accurately detecting all patients with DR disease, achieving a sensitivity of
questionnaire. A total of 12 questionnaire forms were 100%. Out of the 29 gradable AMD patients, the AMD
collected from the participants to assess their feedback model successfully identified 26 patients. The remain-
on the screening system. Regarding the question on ing three patients were not detected, including one with
“satisfaction with the screening system,” the average rat- “RPE Defects of the macula,” one with “choroidal nevus
ing score was 4.00 on the rating scale from 1 (lowest) located superior to the optic nerve” and one patient with
to 5 (highest). Out of the 12 patients, 4 selected “neu- AMD in one eye. As for glaucoma grading, excluding the
tral,” 4 selected “satisfied,” and 4 selected “very satisfied.” ungradable patients, a total of 85 patients were classified
Among the seven comments received for this question, as “suspect glaucoma” and required referral according to
five patients expressed that the system was “Easy and the auditor’s assessment. The glaucoma model accurately
quick” or simply “Easy.” One patient mentioned discom- identified 78 out of these 85 patients.
fort, stating that the device was “too tight in the mouth The specificities of DR, AMD, and glaucoma gradings
plus nose area,” which was related to the camera usage. appear to be relatively low compared to their sensitivities.
Another patient mentioned having “no knowledge about However, upon closer investigation, several phenomena
the results.” For the question regarding “satisfaction with and facts emerge to explain these observations.
the time consumed for completing disease screening,” the Regarding DR, among the false positive gradings, it
average rating was 4.08. Three patients chose “neutral,” was observed that 90% of patients required “ophthal-
five chose “satisfied,” and four chose “very satisfied.” The mologist referral” according to the auditor. This suggests
that the DR model detected certain fundus images with abnormalities resembling DR lesions or similar patterns; for instance, ten patients were identified with "referable AMD". The others presented with abnormalities such as "peripheral bleeding" and "laser scars". Collectively, these various abnormalities contributed to a final false referral rate of only 10% among the false positive DR patients.

In the case of AMD, the false positive gradings exhibited a similar pattern to that of the false positive DR gradings. Among these, 88% of the patients required an "ophthalmologist referral". This indicates that the AMD model identified certain eye images displaying abnormalities that resembled AMD lesions or patterns. Interestingly, seven patients were classified as "referable" DR patients, suggesting that the presence of DR features influenced the AMD model's grading. Additionally, other abnormalities such as "retinal scars" and "epiretinal membrane" were observed. These various abnormalities contributed to a final false referral rate of only 12%.

In the context of glaucoma, even among the false positive gradings, a substantial number (63.65%) necessitated an "ophthalmologist referral". In the remaining cases (12 patients), no additional abnormalities were detected, except for one patient who exhibited "minor RPE defects".

Furthermore, when combining all false positives (84 patients), we found that only 17 patients (20%) needed to follow the annual screening. All the remaining 67 patients needed to be referred to ophthalmologists within a timeframe ranging from one week to three months. This analysis suggests that the AI system did not significantly increase the rate of incorrect referrals in the clinical trial for the three-disease screening.

Based on the analysis conducted, it is evident that the misgrading of DR and AMD, leading to false positives, was primarily due to the presence of other abnormalities that the models might not have been trained to differentiate as distinct diseases. Additionally, we observed that certain images displaying AMD or DR features influenced the accurate classification of the two diseases. In the case of glaucoma, as the model relies solely on image data around the disc region without considering measures such as intraocular pressure (IOP) and visual field, its sensitivity and specificity are relatively low.

Table 7 presents a comparative analysis of the grading performance of the systems introduced in the background section of this paper. It is worth mentioning that RAIDS not only serves the same function as our system in grading three diseases but also extends its capabilities to identify other eye abnormalities. In contrast, the trials conducted for the other systems were exclusively centred on grading DR.

As previously mentioned, factors related to human-computer interaction can influence the performance of AI applications in real-world clinical settings [27]. In our study, we conducted preliminary investigations into these factors. We observed that only 5 patients had both eyes deemed ungradable. One contributing factor to this limited ungradability could be the well-established imaging room setup within a hospital environment and the generally high image quality produced by the fundus camera. Regarding the user experience of clinical staff, the feedback from the study nurse indicated that the patient registration device and the fundus camera were user-friendly and easy to operate for patient information input and image capture. Considering the seamless operation of the camera and the bridging device, we did not observe any substantial influence on disease grading performance from other factors such as data transmission speed or the image acquisition and analysis workflow. The system received positive feedback from the medical staff involved in the auditing process. All these underscore the significance of a well-designed human-computer interaction in clinical AI
Table 7 Comparison of the performance of referable DR grading from different systems in clinic settings

IDx-DR (DR): validated on 900 patients (SE: 87.2%, SP: 90.7%); IDx-DR 2.0 validated on 1415 patients in the Hoorn Diabetes Care System (SE: 91%, SP: 84%, using EURODIAB criteria)
EyeArt (DR): EyeArt v2.1 validated on 30,405 patients in the English Diabetic Eye Screening Programme in the UK (SE: 95.7%, SP: 54.0%); validated on 893 patients (SE: 95.5%, SP: 85.3%)
SELENA+ (DR): validated on 1574 patients in a mobile screening program in Zambia (SE: 92.3%, SP: 89.0%); validated on 69 patients in Australia (SE: 96.9%, SP: 87.7%)
Offline AI System* (DR): validated on 236 participants in Australia (SE: 96.9%, SP: 87.7%)
VoxelCloud Retina (DR): validated on 15,805 patients at 155 diabetes centres in China (SE: 83.3%, SP: 92.5%)
RAIDS (DR, AMD, and glaucoma): validated on 110,784 participants from 65 healthcare centers in China; DR: SE 83.7%, SP 98.6%; AMD: SE 88.1%, SP 97.6% (referral possible); glaucoma: SE 81.3%, SP 95.0%
TeleEye MD (DR, AMD, and glaucoma): validated on 231 patients in this study; DR: SE 100.0%, SP 80.1%; AMD: SE 90.9%, SP 78.8% (referral possible); glaucoma: SE 93.2%, SP 76.8%

* From the Centre for Eye Research, Australia
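The sensitivity and specificity figures compared above follow the convention stated earlier in the results: eyes deemed "ungradable" by the auditor count as "referable" for that disease, and a model-side "ungradable" output on an eye a human finds gradable and normal counts as a false positive. As a minimal illustrative sketch only (the function names and toy labels below are ours, not part of the study's software), that bookkeeping can be expressed as:

```python
# Sketch of the evaluation convention described in the text:
# "ungradable" is mapped to the referable (positive) class on both the
# auditor (ground-truth) side and the model (prediction) side.

def to_binary(label):
    """Map a grading label to 1 (referable) or 0 (non-referable).
    Per the convention in the text, 'ungradable' is treated as referable."""
    return 1 if label in ("referable", "ungradable") else 0

def sensitivity_specificity(auditor_labels, model_labels):
    truth = [to_binary(l) for l in auditor_labels]
    pred = [to_binary(l) for l in model_labels]
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec

# Toy example: the auditor-ungradable eye counts as referable, so the
# model's "referable" call on it is a true positive; the model's extra
# "referable" call on a normal eye is a false positive.
auditor = ["referable", "ungradable", "normal", "normal"]
model = ["referable", "referable", "normal", "referable"]
sens, spec = sensitivity_specificity(auditor, model)
print(sens, spec)  # 1.0 0.5
```

Treating ungradable eyes as referable is the conservative choice for screening: an image the system cannot grade still reaches a human reviewer, at the cost of some measured specificity.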
applications, as it can enhance both user experience and diagnostic accuracy.

However, we observed several limitations that must be taken into account when interpreting our results. Firstly, the sample size of our study was small, which may impact the accuracy of the three AI models we tested. Additionally, the ethnic background of our participants was limited to Caucasians only, and we used only one camera in our study. Furthermore, we collected feedback from a limited number of participants, with only 12 forms returned. The grading and severity of the diseases we assessed were based solely on colour fundus images; other imaging modalities or measurements, such as OCT or IOP, were not utilized to determine the ground truth. Looking forward, the development and implementation of operator-independent methods show promise. This includes the further exploration of smartphone-based visual field measurement tools and the utilization of virtual reality headsets, both of which hold potential for enhancing glaucoma screening in primary care settings. We hope that future studies can build upon our findings to address these limitations and further advance this field.

Conclusion
The implementation of the AI-based approach for screening three chronic eye diseases has demonstrated its effectiveness in real-world settings, especially when comparing the individual diseases' referral decisions and their combined referral rate. Both the screening staff and the auditor have expressed positive feedback regarding the ease of use of the hardware and software platform. The incorporation of an auditing function has proven valuable for obtaining timely second opinions from experts and is potentially applicable for facilitating remote screening. The detection of multiple eye diseases carries significant importance due to their potential to cause visual impairment, coupled with their increasing prevalence. Considering the ongoing global advancements in technology and the evolving demographic landscape in Germany, the integration of AI into disease screening approaches emerges as a logical and necessary progression. Continued research and development in this field will further refine the accuracy and effectiveness of AI systems, ultimately benefiting individuals affected by these chronic eye diseases.

Abbreviations
AI      Artificial intelligence
AMD     Age-related macular degeneration
AUC     Area under the receiver operating characteristic curve
DME     Diabetic macular edema
DR      Diabetic retinopathy
IOP     Intraocular pressure
IQR     Interquartile range
mtmDR   More than mild diabetic retinopathy
NPDR    Non-proliferative diabetic retinopathy
NPV     Negative predictive value
OCT     Optical coherence tomography
PDR     Proliferative diabetic retinopathy
PPV     Positive predictive value
QA      Quality assessment
VTDR    Vision-threatening diabetic retinopathy

Acknowledgements
Not applicable.

Author contributions
CS: Writing – review & editing, design of research; NO: Writing – review & editing, data collection and management; AL: Writing – review & editing, data collection and management; DT: Data collection, writing – review & editing; US: Data collection; IL: Data collection, writing – review & editing; LM: Writing – review & editing; SL: Writing – review & editing; MSS: Writing – review & editing; DY: Hardware configuration, design of research; DX: Manuscript writing, design of research, supervision of research.

Funding
This project was supported by the Hamburgische Investitions- und Förderbank (IFB), project number 5116082. Open Access funding enabled and organized by Projekt DEAL.

Data availability
The datasets generated and/or analysed during the current study are not publicly available due to patient data protection but are available from the corresponding author on reasonable request.

Declarations

Ethics approval
All procedures performed in studies involving human participants were in accordance with the ethical standards of the Institutional Review Board of the Hamburg Medical Chamber (Ethik-Kommission der Ärztekammer Hamburg: 2021-10574-BO-ff) and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. Informed consent was obtained from all individual participants included in the study.

Consent for publication
Not applicable, since no individual person's data in any form are published in this manuscript.

Competing interests
Nicolás Pérez de Olaguer and Albert Lleó are employed by TeleMedC GmbH, Germany. Daniel Yap is employed by TeleMedC Pte Ltd, Singapore. Para Segaram and Di Xiao are employed by TeleMedC Pty Ltd, Australia. Christos Skevas, David Thiwa, Ulrike Schroeter, Inês Valente Lopes, Luca Mautone, Stephan Linke, Martin Stephan Spitzer: none declared.

Author details
1 Department of Ophthalmology, University Medical Center Hamburg-Eppendorf, Martinistr. 52, 20249 Hamburg, Germany
2 TeleMedC GmbH, Raboisen 32, 20095 Hamburg, Germany
3 Department of Otorhinolaryngology, University Medical Center Hamburg-Eppendorf, Martinistr. 52, 20249 Hamburg, Germany
4 Zentrum Sehestaerke, Martinistraße 64, 20251 Hamburg, Germany
5 TeleMedC Pty Ltd, 61 Ubi Avenue 1, #06-11 UBPoint, Singapore 40894, Singapore
6 TeleMedC Pty Ltd, Brisbane Technology Park, Level 2, 1 Westlink Court, Darra QLD 4076, Australia

Received: 2 July 2023 / Accepted: 16 January 2024

References
1. Abràmoff MD, Lavin PT, Birch M, Shah N, Folk JC. Pivotal trial of an autonomous AI-based diagnostic system for detection of diabetic retinopathy in
primary care offices. NPJ Digit Med. 2018;1:39.
2. Gulshan V, Peng L, Coram M, Stumpe MC, Wu D, Narayanaswamy A, et al. Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs. JAMA. 2016;316(22):2402–10.
3. Ting DSW, Cheung CY-L, Lim G, Tan GSW, Quang ND, Gan A, et al. Development and Validation of a Deep Learning System for Diabetic Retinopathy and Related Eye Diseases Using Retinal Images From Multiethnic Populations With Diabetes. JAMA. 2017;318(22):2211–23.
4. Gargeya R, Leng T. Automated Identification of Diabetic Retinopathy Using Deep Learning. Ophthalmology. 2017;124(7):962–9.
5. Li F, Wang Y, Xu T, Dong L, Yan L, Jiang M, et al. Deep learning-based automated detection for diabetic retinopathy and diabetic macular oedema in retinal fundus photographs. Eye (Lond). 2022;36(7):1433–41.
6. Sebastian A, Elharrouss O, Al-Maadeed S, Almaadeed N. A Survey on Deep-Learning-Based Diabetic Retinopathy Classification. Diagnostics (Basel). 2023;13(3).
7. Tsiknakis N, Theodoropoulos D, Manikis G, Ktistakis E, Boutsora O, Berto A, et al. Deep learning for diabetic retinopathy detection and classification based on fundus images: a review. Comput Biol Med. 2021;135:104599.
8. Dubey S, Dixit M. Recent developments on computer aided systems for diagnosis of diabetic retinopathy: a review. Multimed Tools Appl. 2023;82(10):14471–525.
9. Burlina PM, Joshi N, Pekala M, Pacheco KD, Freund DE, Bressler NM. Automated Grading of Age-Related Macular Degeneration From Color Fundus Images Using Deep Convolutional Neural Networks. JAMA Ophthalmol. 2017;135(11):1170–6.
10. Peng Y, Dharssi S, Chen Q, Keenan TD, Agrón E, Wong WT, et al. DeepSeeNet: A Deep Learning Model for Automated Classification of Patient-based Age-related Macular Degeneration Severity from Color Fundus Photographs. Ophthalmology. 2019;126(4):565–75.
11. Leng X, Shi R, Wu Y, Zhu S, Cai X, Lu X, et al. Deep learning for detection of age-related macular degeneration: a systematic review and meta-analysis of diagnostic test accuracy studies. PLoS One. 2023;18(4):e0284060.
12. Paul SK, Pan I, Sobol WM. A systematic review of deep learning applications for optical coherence tomography in age-related macular degeneration. Retina. 2022;42(8):1417–24.
13. Dong L, Yang Q, Zhang RH, Wei WB. Artificial intelligence for the detection of age-related macular degeneration in color fundus photographs: a systematic review and meta-analysis. EClinicalMedicine. 2021;35:100875.
14. Li Z, He Y, Keel S, Meng W, Chang RT, He M. Efficacy of a Deep Learning System for Detecting Glaucomatous Optic Neuropathy Based on Color Fundus Photographs. Ophthalmology. 2018;125(8):1199–206.
15. Yu S, Xiao D, Frost S, Kanagasingam Y. Robust optic disc and cup segmentation with deep learning for glaucoma detection. Comput Med Imaging Graph. 2019;74:61–71.
16. Almazroa AA, Alodhayb S, Osman E, Ramadan E, Hummadi M, Dlaim M, et al. Retinal fundus images for glaucoma analysis: the RIGA dataset. In: Zhang J, Chen P-H, editors. Medical Imaging 2018: Imaging Informatics for Healthcare, Research, and Applications. SPIE; 2018. https://doi.org/10.1117/12.2293584.
17. Fu H, Cheng J, Xu Y, Zhang C, Wong DWK, Liu J, et al. Disc-Aware Ensemble Network for Glaucoma Screening From Fundus Image. IEEE Trans Med Imaging. 2018;37(11):2493–501.
18. Buisson M, Navel V, Labbé A, Watson SL, Baker JS, Murtagh P, et al. Deep learning versus ophthalmologists for screening for glaucoma on fundus examination: a systematic review and meta-analysis. Clin Experiment Ophthalmol. 2021;49(9):1027–38.
19. Thompson AC, Jammal AA, Medeiros FA. A Review of Deep Learning for Screening, Diagnosis, and Detection of Glaucoma Progression. Transl Vis Sci Technol. 2020;9(2):42.
20. Yousefi S. Clinical Applications of Artificial Intelligence in Glaucoma. J Ophthalmic Vis Res. 2023;18(1):97–112.
21. van der Heijden AA, Abramoff MD, Verbraak F, van Hecke MV, Liem A, Nijpels G. Validation of automated screening for referable diabetic retinopathy with the IDx-DR device in the Hoorn Diabetes Care System. Acta Ophthalmol. 2018;96(1):63–8.
22. Grzybowski A, Brona P. Analysis and Comparison of Two Artificial Intelligence Diabetic Retinopathy Screening Algorithms in a Pilot Study: IDx-DR and Retinalyze. J Clin Med. 2021;10(11):2352. https://doi.org/10.3390/jcm10112352.
23. Heydon P, Egan C, Bolter L, Chambers R, Anderson J, Aldington S, et al. Prospective evaluation of an artificial intelligence-enabled algorithm for automated diabetic retinopathy screening of 30 000 patients. Br J Ophthalmol. 2021;105(5):723–8.
24. Ipp E, Liljenquist D, Bode B, Shah VN, Silverstein S, Regillo CD, et al. Pivotal Evaluation of an Artificial Intelligence System for Autonomous Detection of Referrable and Vision-Threatening Diabetic Retinopathy. JAMA Netw Open. 2021;4(11):e2134254.
25. Bellemo V, Lim ZW, Lim G, Nguyen QD, Xie Y, Yip MYT, et al. Artificial intelligence using deep learning to screen for referable and vision-threatening diabetic retinopathy in Africa: a clinical validation study. Lancet Digit Health. 2019;1(1):e35–44.
26. Zhang I, Zhou B, Crane AB, Ye C, Patton A, Habiel M, Szirth B, Khouri AS. Vision Threatening Disease Triage Using Tele-Ophthalmology during COVID-19 in the Emergency Department: A Pilot Study. Invest Ophthalmol Vis Sci. 2021;62(8):1893.
27. Scheetz J, Koca D, McGuinness M, Holloway E, Tan Z, Zhu Z, et al. Real-world artificial intelligence-based opportunistic screening for diabetic retinopathy in endocrinology and indigenous healthcare settings in Australia. Sci Rep. 2021;11(1):15808.
28. Zhang Y, Shi J, Peng Y, Zhao Z, Zheng Q, Wang Z, et al. Artificial intelligence-enabled screening for diabetic retinopathy: a real-world, multicenter and prospective study. BMJ Open Diabetes Res Care. 2020;8(1).
29. Dong L, He W, Zhang R, Ge Z, Wang YX, Zhou J, et al. Artificial Intelligence for Screening of Multiple Retinal and Optic Nerve Diseases. JAMA Netw Open. 2022;5(5):e229960.
30. Beede E, Baylor E, Hersch F, Iurchenko A, Wilcox L, Ruamviboonsuk P, Vardoulakis L. A Human-Centered Evaluation of a Deep Learning System Deployed in Clinics for the Detection of Diabetic Retinopathy. In: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (CHI '20). New York: Association for Computing Machinery; 2020. p. 1–12. https://doi.org/10.1145/3313831.3376718.
31. van Berkel N, Sarsenbayeva Z, Goncalves J. The methodology of studying fairness perceptions in Artificial Intelligence: Contrasting CHI and FAccT. Int J Hum Comput Stud. 2023;170:102954. https://doi.org/10.1016/j.ijhcs.2022.102954.
32. McDermott MBA, Nestor B, Szolovits P. Clinical Artificial Intelligence: Design Principles and Fallacies. Clin Lab Med. 2023;43(1):29–46.
33. Wilkinson CP, Ferris FL, Klein RE, Lee PP, Agardh CD, Davis M, et al. Proposed international clinical diabetic retinopathy and diabetic macular edema disease severity scales. Ophthalmology. 2003;110(9):1677–82.
34. Kanagasingam Y, Xiao D, Vignarajan J, Preetham A, Tay-Kearney M-L, Mehrotra A. Evaluation of Artificial Intelligence-Based Grading of Diabetic Retinopathy in Primary Care. JAMA Netw Open. 2018;1(5):e182665.
35. Kreft D, McGuinness MB, Doblhammer G, Finger RP. Diabetic retinopathy screening in incident diabetes mellitus type 2 in Germany between 2004 and 2013: a prospective cohort study based on health claims data. PLoS One. 2018;13(4):e0195426.
36. Trautner C, Haastert B, Richter B, Berger M, Giani G. Incidence of blindness in southern Germany due to glaucoma and degenerative conditions. Invest Ophthalmol Vis Sci. 2003;44(3):1031–4.
37. Michelson G, Wärntges S, Hornegger J, Lausen B. The papilla as screening parameter for early diagnosis of glaucoma. Dtsch Arztebl Int. 2008;105(34–35):583–9.

Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.