In [1]: import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
In [2]: data = pd.read_csv("D:\\Course\\Python\\Datasets\\pima-indians-diabetes.csv")
In [3]: data
...
In [5]: # Divide Data into X and Y
array = data.values
X = array[:,0:8]
Y = array[:,8]
Hold-Out Validation
Split the data once into a training set and a test set: the model is fit on the training portion and scored on the held-out portion.
In [6]: from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
In [7]: # Split the data; random_state=7 is an assumed seed (the original value was truncated)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.33, random_state=7)
In [8]: model = LogisticRegression()
In [9]: model.fit(X_train, Y_train)
result = model.score(X_test, Y_test)
C:\Users\rgandyala\Anaconda3\lib\site-packages\sklearn\linear_model\_logistic.py:763: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
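The warning points to its own fix: raise max_iter or standardize the features before fitting. A minimal sketch of the scaling approach using a scikit-learn Pipeline (the pipeline and scaler are additions, not part of the original session):

from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Standardize features to zero mean / unit variance before the fit;
# lbfgs converges far more reliably on scaled inputs.
pipe = make_pipeline(StandardScaler(), LogisticRegression())
pipe.fit(X_train, Y_train)
print(pipe.score(X_test, Y_test))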
In [10]: # Predicting the Test set results
y_pred = model.predict(X_test)
In [11]: # Checking accuracy score
from sklearn.metrics import accuracy_score
accuracy_score(Y_test, y_pred)
Out[11]: 0.7716535433070866
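Accuracy alone can hide class-wise behaviour on a dataset like this. As an optional follow-up (not part of the original session), the confusion matrix and per-class metrics give a fuller picture:

from sklearn.metrics import confusion_matrix, classification_report

# Rows of the confusion matrix are true classes, columns are predictions.
print(confusion_matrix(Y_test, y_pred))
print(classification_report(Y_test, y_pred))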
K-Fold Cross-Validation
Partition the data into k folds; each fold serves once as the test set while the remaining folds train the model, and the k scores are averaged.
In [12]: from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
In [14]: # Initialize parameters
num_folds = 10
kfold = KFold(n_splits=num_folds)
model1 = LogisticRegression()
In [15]: # Fitting the model and Extracting the results
results1 = cross_val_score(model1, X, Y, cv=kfold)
...
In [16]: results1
...
In [17]: print(results1.mean()*100.0, results1.std()*100.0)
...
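By default KFold keeps rows in their original order, so the folds are fixed by the file layout; shuffling before splitting is often preferable. A small variant (the shuffle arguments are an addition to the original session):

# Shuffle rows before splitting so folds are independent of row order;
# random_state pins the shuffle for reproducibility.
kfold_shuffled = KFold(n_splits=num_folds, shuffle=True, random_state=7)
results_shuffled = cross_val_score(model1, X, Y, cv=kfold_shuffled)
print(results_shuffled.mean()*100.0, results_shuffled.std()*100.0)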
Leave-One-Out Cross-Validation
The extreme case of k-fold where k equals the number of samples: each observation is held out once as a single-row test set, so one model is trained per row.
In [18]: from sklearn.model_selection import LeaveOneOut
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
In [19]: # Initialize parameters
loocv = LeaveOneOut()
model2 = LogisticRegression()
In [20]: # Fitting the model and Extracting the results
results2 = cross_val_score(model2, X, Y, cv=loocv)
...
In [21]: print(results2.mean()*100.0, results2.std()*100.0)
77.05345501955672 42.04890690023727
The large standard deviation is expected here: each LOOCV fold scores a single held-out sample, so every per-fold score is exactly 0 or 1 and the spread across folds is necessarily wide even when the mean accuracy is stable.
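A quick check (again an addition, not part of the original session) makes this concrete: the per-fold scores returned by cross_val_score contain nothing but zeros and ones.

# Each fold scores exactly one held-out sample, so every entry
# of results2 is either 0.0 (misclassified) or 1.0 (correct).
print(np.unique(results2))  # expected: [0. 1.]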
Repeated K-Fold Cross-Validation
Runs k-fold several times with a fresh shuffle on each repeat, yielding more score samples and a steadier estimate of the mean accuracy.
In [22]: from sklearn.model_selection import RepeatedKFold
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
In [23]: # Initialize parameters
n_splits = 10
kfold3 = RepeatedKFold(n_splits=n_splits, n_repeats=2)
model3 = LogisticRegression()
In [24]: # Fitting the model and Extracting the results
results3 = cross_val_score(model3, X, Y, cv=kfold3)
...
In [26]: # Check the Accuracy
print("Accuracy: ", results3*100.0)
Accuracy:  [75.32467532 72.72727273 72.72727273 76.62337662 80.51948052 79.22077922
 75.32467532 86.84210526 67.10526316 80.26315789 81.81818182 77.92207792
 80.51948052 76.62337662 76.62337662 80.51948052 75.32467532 67.10526316
 85.52631579 77.63157895]
In [27]: print(results3.mean()*100.0, results3.std()*100.0)
77.31459330143541 4.929298671034974
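The diabetes labels are imbalanced (roughly two negatives for every positive), so a stratified variant, which preserves the class ratio in every fold, is a common refinement. A sketch, once more as an addition to the original session:

from sklearn.model_selection import RepeatedStratifiedKFold

# Stratification keeps the 0/1 class proportions identical in every fold.
rskf = RepeatedStratifiedKFold(n_splits=10, n_repeats=2, random_state=7)
results4 = cross_val_score(LogisticRegression(), X, Y, cv=rskf)
print(results4.mean()*100.0, results4.std()*100.0)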