DIABETES DETECTION USING

MACHINE LEARNING
An internship project submitted in partial fulfillment of the requirements
for the award of the degree of

BACHELOR OF TECHNOLOGY
IN

COMPUTER SCIENCE & ENGINEERING

Submitted by
G. VENKATA RANGA REDDY Y21CSE279023
D.VENUGOPAL REDDY Y21CSE279018
C.SAI KUMAR REDDY Y21CSE279013

Under The Esteemed Guidance of

Dr. M. RAGHAVA NAIDU

Assistant Professor

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

KRISHNA UNIVERSITY COLLEGE OF ENGINEERING AND TECHNOLOGY
(Approved by AICTE, New Delhi)

MACHILIPATNAM-521-004

2023-2024
KRISHNA UNIVERSITY
COLLEGE OF ENGINEERING AND TECHNOLOGY
MACHILIPATNAM
DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

CERTIFICATE
This is to certify that the internship project work entitled "DIABETES DETECTION USING
MACHINE LEARNING" is a bonafide work done by C.SAI KUMAR REDDY,
Regd. No: Y21CSE279023, submitted in partial fulfillment of the requirements for the award
of the Degree of "BACHELOR OF TECHNOLOGY IN COMPUTER SCIENCE &
ENGINEERING" during the academic year 2023-2024.

Project Guide Head of the Department/Co-Ordinator

ACKNOWLEDGEMENT
I am deeply grateful to my project platform, IIDT BLACKBUCKS, for their invaluable guidance
and support throughout this project. Their insights and expertise have been instrumental in
shaping the development process. I would also like to thank my peers for their constructive
feedback and collaboration. Special thanks to my family for their encouragement and
understanding during this time.
The satisfaction that accompanies the successful completion of any task would be incomplete
without mentioning the people who made it possible and whose constant guidance and
encouragement crowned all the efforts with success. This acknowledgement goes beyond
formality: I would like to express my deep gratitude and respect to all those people
behind the scenes who guided, inspired and helped me to complete my project work.
I am extremely grateful to my esteemed teacher and guide, Dr. M. RAGHAVA NAIDU,
Assistant Professor, Department of Computer Science & Engineering, Krishna
University College of Engineering and Technology, for his motivation and valuable
advice during this period.
I express my profound thanks to Dr. R. Vijaya Kumari (i/c), Principal, Krishna University
College of Engineering and Technology, for providing me with the opportunity and facilities
to pursue my project work.
I express my profound thanks to Prof. R. Srinivasa Rao (i/c), Vice Chancellor,
Prof. K. Sobhan Babu, Registrar, and Prof. M. V. Basaveswara Rao, Rector, Krishna
University, for their valuable motivation and for providing all the facilities required to complete
my mini project.
I express my heartfelt thanks to all the teaching and non-teaching staff.

Sincerely,

C.SAI KUMAR REDDY

Y21CSE279023
DECLARATION

The internship project work reported here, titled "DIABETES DETECTION", is submitted to the
Department of COMPUTER SCIENCE AND ENGINEERING (CSE), Krishna University
College of Engineering and Technology, Machilipatnam, in partial fulfilment of the requirements
for the award of the degree of Bachelor of Technology, and is a bonafide work done by me.

The content of this report is fully/partially owned by me and was prepared under the guidance of
our guide, with training provided by the IIDT-APSCHE partner BLACKBUCKS as part of the AI & ML
internship.

The reported results are based on project work done entirely by me and are not copied from any
other source.

Sincerely,

C.SAI KUMAR REDDY

Y21CSE279023
ABSTRACT

Diabetes is a chronic disease that causes blood sugar levels to rise. If
diabetes is left undiagnosed and untreated, it can lead to complications. The
identification process is time-consuming: the patient must visit a diagnostic centre
and consult a doctor. Predictive analytics in healthcare is a difficult
challenge, but it can eventually assist physicians in making timely, data-based
decisions about a patient's health and condition. Machine learning
methods can help to address this issue.

The aim of this project is to create a model that can reliably predict
diabetes in patients. The dataset is split into three parts before the classification
techniques are applied. The training dataset is the sample used to fit the model.
The validation dataset is the sample used for hyperparameter tuning and for
comparing the accuracy and error rates of the model between
the training dataset and the validation dataset. The testing dataset is the sample
used to test the model's performance (predictive power).

To detect diabetes at an early stage, this project employs the following machine learning
classification algorithms: Logistic Regression, Gaussian Naive Bayes, K
Neighbours, SVM, Decision Tree, Random Forest, Bagging Classifier, AdaBoost
Classifier and Gradient Boosting Classifier. The Pima Indians
Diabetes Database (PIDD) is used in the experiments. The data were provided by the
National Institute of Diabetes and Digestive and Kidney Diseases. The dataset's
purpose is to diagnose whether a patient has diabetes using the diagnostic measures
included in the dataset. Precision, accuracy, specificity and
recall are computed over the classified instances from the confusion matrix.

The accuracy of the algorithms used is compared and discussed. The study's
comparison of the various machine learning techniques shows which algorithm is
better suited for diabetes prediction. Using machine learning methods, this project
aims to assist doctors and physicians in the early detection of diabetes.
INDEX

S.No CONTENTS

1. Introduction
1.1 Introduction
1.2 Objectives
1.3 Motivation
1.4 Overview of the Project
1.5 Chapter wise Summary

2. Data Analysis
2.1 Structure of Data
2.2 Parameters Implemented
2.3 Exploratory Data analysis
2.4 Histogram plot of data
2.5 Box plot of data
2.6 Distribution of classes

3. Implementation
3.1 Splitting of Dataset
3.2 Feature Scaling
3.3 Implementing Machine Learning Algorithms
3.3.1 Correlation Heat Map
3.3.2 Logistic Regression Model
3.3.3 Support Vector Machine Model (Svc)
3.3.4 Decision Tree Model
3.3.5 K-Neighbours Classifier Model

4. Test results
5. Conclusions and Further Scope
1.INTRODUCTION

1.1 INTRODUCTION
Various classification strategies are used in the medical field for classifying data
into different classes. Diabetes is a condition that affects the body's ability to
produce the hormone insulin, which causes carbohydrate metabolism to become
irregular and blood glucose levels to increase. High blood sugar is a common
symptom of diabetes. If diabetes is not treated, it can lead to a variety of
complications. Diabetic ketoacidosis and nonketotic hyperosmolar coma are two
significant complications. Diabetes is considered a severe health problem in which
the amount of sugar in the blood cannot be regulated. Diabetes is influenced by a
variety of factors such as height, weight, genetic factors, and insulin, but the most
important factor to consider is sugar concentration. Identifying the condition early
is key to avoiding these complications. The dataset used here is the Pima Indians
Diabetes Database (PIDD) from the National Institute of Diabetes and Digestive and
Kidney Diseases. Several constraints were placed on the selection of instances from
a larger database.

The dataset is divided into three parts, after which classification techniques are
applied. The training dataset is the sample of the dataset that is used to fit the model.
The validation dataset is the sample used for fine-tuning parameters and for
comparing model accuracy and error rates between the training and validation
datasets. The testing dataset is the sample that is used to assess the model's
performance.

Various machine learning techniques are implemented. A confusion matrix is
obtained for each classification algorithm and the results are compared. This comparison
of the various machine learning techniques shows which algorithm is better suited for
diabetes prediction. The correlation between parameters and the best accuracy score
across the supervised machine learning algorithms are also obtained.

1.2 OBJECTIVES

• Over the past decade, the number of people diagnosed with diabetes has risen
significantly. The current human lifestyle is the primary cause of this
rise.
• The main objective of this project is to analyze the data and see whether it is
possible to glean any further information from it to determine the correlation
between parameters and diabetes.
• The second objective is to get the best accuracy score using various
supervised machine learning algorithms, and to find out which
algorithm best predicts whether a person has diabetes based
on this dataset.
• The accuracy of the algorithms used is compared and discussed. The
study's comparison of the various machine learning techniques shows which
algorithm is better suited for diabetes prediction. Using machine learning
methods, this project aims to assist doctors and physicians in predicting
whether a person has diabetes or not.

1.3 MOTIVATION
The current human lifestyle is the primary cause of the increase in diabetes. Three
types of errors may occur in today's medical diagnosis methods:

1. The false-negative type, in which a patient is in fact diabetic but the test results
show that he or she does not have diabetes.

2. The false-positive type, in which a patient is in reality not diabetic
but the test reports say that he or she is.

3. The unclassifiable type, in which a system cannot diagnose a
given case. This happens when insufficient knowledge has been extracted from past
data, so that a given patient cannot be assigned to either class.

In reality, however, every patient must be predicted to be either diabetic or non-diabetic.
Such diagnostic errors can result in unnecessary treatments or no treatments at all when they
are needed. To prevent or reduce the impact of such errors, a machine learning
algorithm can be used to build a framework that provides reliable results while reducing
human effort.

1.4 OVERVIEW OF PROJECT

Machine learning has great potential to improve diabetes risk prediction with the help
of advanced computational methods and the availability of large epidemiological and
genetic diabetes risk datasets. Detection of diabetes in its early stages is the key to treatment.
This work describes a machine learning approach to predicting whether a patient has diabetes.
The technique may also help researchers to develop an accurate and effective tool that
clinicians can use to make better decisions about disease status.

1.5 CHAPTERWISE SUMMARY

The first chapter is an introductory chapter, which gives an overview of the project. It includes
five divisions: introduction, objectives, motivation, overview and chapter wise summary. The
second chapter is data analysis, where the dataset is analyzed and studied for further
classification. The third chapter deals with the different machine learning models used. To detect
diabetes at an early stage, this project employs the following machine learning classification
algorithms: Logistic Regression, Gaussian Naive Bayes, K Neighbours, SVM, Decision Tree, Random
Forest, Bagging Classifier, AdaBoost Classifier and Gradient Boosting Classifier.
The last chapter discusses the results of the different models.
The next chapter describes the dataset in more detail.

2.DATA ANALYSIS

2.1 STRUCTURE OF DATA

The dataset was obtained from the Kaggle data repository. The objective of the dataset is to
diagnostically predict whether or not a patient has diabetes, based on certain diagnostic
measurements included in the dataset. Several constraints were placed on the selection of these
instances from a larger database. In particular, all patients here are females at least 21 years old
of Pima Indian heritage. The dataset consists of several medical predictor variables and one
target variable, Outcome. Predictor variables include the number of pregnancies the patient has
had, their BMI, insulin level, age, etc.

Fig:- Importing the libraries used to implement the machine learning classification techniques.

Fig:- Loading the dataset to understand its structure.

Fig:- Shape of the dataset: the total number of rows and columns.
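
The three figures above are screenshots that did not survive extraction. The steps they describe can be sketched roughly as follows. This is a minimal sketch, not the original notebook: the file name "diabetes.csv", the column names, and the helper load_dataset are assumptions based on the usual Kaggle layout of this dataset, and the three inline rows are only illustrative stand-ins so that the example runs without the file.

```python
import io

import pandas as pd

# ASSUMPTION: the Kaggle file is commonly named "diabetes.csv" and has these
# nine columns. The rows below are illustrative stand-ins in the same layout.
SAMPLE_CSV = """Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,DiabetesPedigreeFunction,Age,Outcome
6,148,72,35,0,33.6,0.627,50,1
1,85,66,29,0,26.6,0.351,31,0
8,183,64,0,0,23.3,0.672,32,1
"""


def load_dataset(source):
    """Read the dataset from a file path or file-like object (hypothetical helper)."""
    return pd.read_csv(source)


# With the real file this would be: dt = load_dataset("diabetes.csv")
dt = load_dataset(io.StringIO(SAMPLE_CSV))

print(dt.head())   # first rows, to inspect the structure
print(dt.shape)    # (number of rows, number of columns)
```

The name dt matches the DataFrame variable used in the plotting code later in this report.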

2.2 PARAMETERS IMPLEMENTED

Pregnancies: No. of times pregnant

Glucose: Plasma glucose concentration at 2 hours in an oral glucose tolerance
test.

Blood Pressure: Diastolic blood pressure (mm Hg). It is the bottom number in blood
pressure readings, and is the pressure in the arteries when the heart rests between beats.
A normal diastolic blood pressure is < 80 mm Hg.

Skin Thickness: Triceps skin fold thickness (mm). Studies have reported
associations between greater skin fold thickness and
diabetes.

Insulin: 2-hour serum insulin (µU/mL). Insulin is a hormone made by the
pancreas that allows the body to use sugar (glucose) from carbohydrates in
food for energy, or to store glucose for future use. A high insulin level
is associated with diabetes.

BMI: Body mass index (weight in kg / (height in m)^2). Range of BMI:

BMI < 18.5 - underweight
18.5 <= BMI <= 24.9 - ideal weight
25 <= BMI <= 29.9 - overweight
BMI >= 30 - obese

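
The BMI formula and ranges above can be illustrated with a short sketch. The function names bmi and bmi_category are hypothetical, and the sketch assumes half-open intervals with cut-offs at 18.5, 25 and 30 so that every value falls into exactly one category.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight in kg divided by the square of height in m."""
    return weight_kg / height_m ** 2


def bmi_category(value):
    """Map a BMI value to the categories listed above (assumed cut-offs)."""
    if value < 18.5:
        return "underweight"
    if value < 25:
        return "ideal weight"
    if value < 30:
        return "overweight"
    return "obese"


example = bmi(70, 1.75)                # 70 kg, 1.75 m
print(round(example, 1), bmi_category(example))
```

For a person weighing 70 kg at 1.75 m, the sketch gives a BMI of about 22.9, which falls in the ideal weight range.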
Diabetes Pedigree Function: A synthesis of the diabetes mellitus history in
relatives and the genetic relationship of those relatives to the subject.

In this dataset, patients with a higher pedigree function tended to test positive and
those with a lower pedigree function tended to test negative.

Age: Age of the patient in years

Outcome: The target column to be predicted. 1 - diabetic, 0 - non-diabetic

2.3 EXPLORATORY DATA ANALYSIS

Fig 2.3.1:- Exploratory data analysis: analyzing the dataset and checking for missing values.

Fig:- Dataset information.

Fig:- Calculating the count, mean, standard deviation, minimum and maximum of each column.
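
The checks named in these captions correspond to standard pandas calls. The sketch below is an assumption about what the missing screenshots showed, not a copy of them. It runs on a small synthetic DataFrame (a stand-in for the real data, with one missing value injected on purpose) so that it is self-contained.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the dataset; the values are random, not real patients.
rng = np.random.default_rng(0)
dt = pd.DataFrame({
    "Glucose": rng.integers(70, 200, size=20),
    "BMI": rng.uniform(18, 45, size=20).round(1),
    "Outcome": rng.integers(0, 2, size=20),
})
dt.loc[3, "BMI"] = np.nan  # inject one missing value for the demonstration

# Missing values per column (isnull() detects NaN only, not placeholder zeros)
missing_per_column = dt.isnull().sum()
print(missing_per_column)

# Column types and non-null counts
dt.info()

# Count, mean, standard deviation, min and max of each column
summary = dt.describe()
print(summary.loc[["count", "mean", "std", "min", "max"]])
```

Because isnull() only detects NaN, a column in which missing measurements were recorded as 0 would need a separate check such as (dt == 0).sum().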

2.4 HISTOGRAM PLOT OF DATA

The histogram plots below give a high-level view of the distribution of each
dataset parameter. At first glance, most of them appear to be positively skewed,
with Glucose and Blood Pressure closest to a normal
distribution. Outcome is a binary variable, so its histogram shows only two bars
(0 and 1), as expected.
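
A sketch of how such histograms could be produced, and how the visual impression of skew can be checked numerically, is given below. The data are synthetic stand-ins chosen so that one column is roughly symmetric and one has a long right tail. The column names mirror the dataset, but the values are not real.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch also runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in data (assumption): not the real measurements.
rng = np.random.default_rng(0)
dt = pd.DataFrame({
    "Glucose": rng.normal(120, 30, 500),    # roughly symmetric
    "Insulin": rng.exponential(80, 500),    # long right tail
    "Outcome": rng.integers(0, 2, 500),     # binary target
})

# One histogram per column
dt.hist(bins=20, figsize=(10, 8))
plt.tight_layout()
# plt.show()  # uncomment in an interactive session

# Sample skewness: values well above 0 indicate positive (right) skew
skewness = dt.drop(columns="Outcome").skew()
print(skewness)
```

A skewness near 0 is consistent with a roughly normal shape, while clearly positive values correspond to the positively skewed histograms described above.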

2.5 BOX PLOT OF DATA
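import matplotlib.pyplot as plt

# One box plot per predictor column (every column except the target, Outcome)
plt.figure(figsize=(12, 12))
for i, col in enumerate(dt.columns[:-1], start=1):
    plt.subplot(4, 4, i)
    dt[[col]].boxplot()
plt.show()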

2.6 DISTRIBUTION OF CLASSES USING PIE CHART & BAR CHART

Pie Chart: A pie chart illustrates the relative proportions or percentages of the
different classes within a dataset. For diabetes prediction, a pie chart
shows the distribution of outcomes, that is, the percentage of patients classified as
diabetic versus non-diabetic. Each segment of the pie represents a class (diabetic or
non-diabetic), and the size of each segment corresponds to its share of the whole dataset. This
makes it easy to see at a glance whether the classes are balanced or imbalanced.
Bar Chart: A bar chart compares quantities across categories. For
diabetes prediction, a bar chart displays the absolute count of each
class, that is, the number of individuals labelled diabetic and non-diabetic. Each bar
represents a class, and the height of the bar indicates the number of instances in that
class. This allows a straightforward comparison of the class distribution and highlights any
disparity between the classes.
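
The two charts can be sketched as follows. The ten labels below are made-up illustrative values, not the real class counts, and the label text is an assumption based on the Outcome encoding (1 - diabetic, 0 - non-diabetic).

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative Outcome labels (assumption): 0 = non-diabetic, 1 = diabetic
outcome = pd.Series([0, 0, 0, 1, 0, 1, 0, 0, 1, 0], name="Outcome")

# Absolute count of each class, ordered as 0 then 1
counts = outcome.value_counts().sort_index()
labels = ["non-diabetic", "diabetic"]

fig, (ax_pie, ax_bar) = plt.subplots(1, 2, figsize=(10, 4))
ax_pie.pie(counts.values, labels=labels, autopct="%1.1f%%")  # relative proportions
ax_pie.set_title("Class proportions")
ax_bar.bar(labels, counts.values)                            # absolute counts
ax_bar.set_ylabel("Number of patients")
ax_bar.set_title("Class counts")
# plt.show()  # uncomment in an interactive session
```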

3. IMPLEMENTATION

3.1 SPLITTING OF DATASET (TRAINING/VALIDATION/TESTING)

The dataset is split into three parts for training, validation and testing.
Training Dataset: The sample that is used to fit the model.
Validation Dataset: The sample that is used for hyperparameter tuning and for
comparing the accuracy and error rates of the model between
the training dataset and the validation dataset.
Testing Dataset: The sample that is used to test the model's performance
(predictive power).
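
One common way to obtain three such parts is to call scikit-learn's train_test_split twice. The report does not state the proportions used, so the 60/20/20 split, the stratification and the random seed below are assumptions, and the arrays are synthetic stand-ins for the real features and labels.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 100 samples, 8 features, 35 positive labels.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 8))
y = np.array([0] * 65 + [1] * 35)

# Step 1: hold out 20% of the data as the test set.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Step 2: split the remaining 80% into training (60% of total)
# and validation (20% of total); 0.25 * 80% = 20%.
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.25, stratify=y_trainval, random_state=42
)

print(len(X_train), len(X_val), len(X_test))  # 60 20 20
```

Passing stratify keeps the share of diabetic and non-diabetic cases similar in all three parts, which matters when the classes are imbalanced.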

3.2 Feature Scaling

Here StandardScaler() is used to perform feature scaling. The scaler is fitted on
the training set only, so it stores the mean and standard deviation of the training
data; the same statistics are then reused to transform both X_train and X_test.
Standardizing the data after splitting, rather than before, prevents information
from the test dataset leaking into the training dataset.
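
The fit-on-train, transform-on-both pattern described above can be sketched as follows. The arrays are synthetic stand-ins, and only the names X_train and X_test come from the text.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic stand-ins for the split feature matrices.
rng = np.random.default_rng(0)
X_train = rng.normal(loc=120, scale=30, size=(80, 3))
X_test = rng.normal(loc=120, scale=30, size=(20, 3))

scaler = StandardScaler()

# fit_transform learns the per-column mean and standard deviation
# from the training data only, then scales the training data.
X_train_scaled = scaler.fit_transform(X_train)

# transform reuses the training statistics; the test data never
# influences the mean or standard deviation.
X_test_scaled = scaler.transform(X_test)

print(scaler.mean_)    # training means, one per column
print(scaler.scale_)   # training standard deviations, one per column
```

After scaling, the training columns have mean 0 and standard deviation 1, while the test columns will only be close to those values because they were scaled with the training statistics.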

3.3 Implementing Machine Learning Algorithms

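Different machine learning algorithms are tried in order to classify the Pima Indians
diabetes dataset. First, a confusion matrix function is defined. With TP, TN, FP and FN
denoting true positives, true negatives, false positives and false negatives:
Accuracy: (TP + TN) / (TP + TN + FP + FN)
Recall: TP / (TP + FN)
Precision: TP / (TP + FP)
Specificity: TN / (TN + FP)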
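
The four formulas can be computed directly from scikit-learn's confusion matrix. The function name classification_metrics and the example labels are hypothetical, and the report's own confusion matrix function is not reproduced in the extracted text. The sketch assumes both classes occur in the labels and predictions, so no denominator is zero.

```python
from sklearn.metrics import confusion_matrix


def classification_metrics(y_true, y_pred):
    """Accuracy, recall, precision and specificity for binary labels 0/1."""
    # For labels=[0, 1], ravel() yields the counts in the order TN, FP, FN, TP.
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    total = tn + fp + fn + tp
    return {
        "accuracy": (tp + tn) / total,
        "recall": tp / (tp + fn),
        "precision": tp / (tp + fp),
        "specificity": tn / (tn + fp),
    }


# Illustrative labels: TP = 3, FN = 1, FP = 1, TN = 5
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In this example the accuracy is 8/10, recall and precision are both 3/4, and specificity is 5/6.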

3.3.1 CORRELATION HEATMAP

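import matplotlib.pyplot as plt
import seaborn as sns

# Pairwise correlation between all columns of dt, including Outcome
plt.figure(figsize=(10, 8))
correlation_matrix = dt.corr()
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt=".2f", linewidths=0.5)
plt.title('Correlation Heatmap')
plt.show()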

3.3.2 Logistic Regression Model

3.3.3 Support Vector Machine Model

3.3.4 Decision Tree Model

3.3.5 K Neighbours Classifier Model
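
The code for sections 3.3.2 to 3.3.5 appeared as screenshots that were lost in extraction. The sketch below shows one plausible way to fit and score the four named models with scikit-learn. It is not the original code: the data are generated with make_classification as a stand-in for the diabetes dataset, and the hyperparameters (max_iter, n_neighbors, random seeds) are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in with 8 features, like the 8 predictor columns.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scale-sensitive models are wrapped in a pipeline so the scaler is
# fitted on the training data only; the tree does not need scaling.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Support Vector Machine (SVC)": make_pipeline(StandardScaler(), SVC()),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "K Neighbours": make_pipeline(
        StandardScaler(), KNeighborsClassifier(n_neighbors=5)
    ),
}

test_accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_accuracy[name] = model.score(X_test, y_test)  # mean accuracy
    print(f"{name}: {test_accuracy[name]:.3f}")
```

The accuracies printed here describe the synthetic data only and say nothing about the results reported in chapter 4.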

4 TEST RESULTS ANALYSIS
Finally, the models were trained and their metrics summarized in a table.
The objectives were:
1) To see whether any further information can be gleaned from the
data to determine the correlation between parameters and diabetes.
2) To get the best accuracy score using various supervised
machine learning algorithms.
For the first objective, based on the hypothesis test, glucose levels
are positively correlated with a person having diabetes, but causality cannot be
confirmed. For the second objective, based on the comparison between the
various algorithms used, Random Forest appears to produce the best results.

The aim of this project was to create a model that can reliably predict
diabetes in patients. The main aim, to design and implement
diabetes prediction using machine learning methods and to analyze the performance
of those methods, has been achieved.

The proposed approach uses various classification and ensemble learning methods,
namely SVM, KNN, Random Forest, Decision Tree, Logistic Regression and
Gradient Boosting classifiers. Such machine learning algorithms are needed
to build a framework that provides reliable results while reducing human effort.

The test accuracy of the various models is generally within the same range, from
approximately 73% to 81%. Based on the accuracy and recall scores, the
Random Forest Classifier produced the best results overall.

5 CONCLUSIONS AND FUTURE SCOPE

• Machine learning has great potential to improve diabetes
prediction with the help of advanced computational methods.

• Detection of diabetes in its early stage is the key to treatment.

• The technique may also help researchers to develop an accurate and
efficient tool that clinicians can use to make
better decisions about the disease.

• More parameters and factors could be included in future work on this
project.

• Accuracy may improve further as more relevant parameters are included.

REFERENCES

1. Kaggle
2. GitHub
3. World Health Organization (WHO)
4. American Diabetes Association (ADA)
