Iris Flower Dataset Analysis

This repository contains two complementary approaches to analyzing the famous Iris dataset:

Machine Learning Classification (Python) - Predicting species using scikit-learn.
Statistical Analysis (R) - Hypothesis testing and exploratory data analysis (EDA).

1. Machine Learning Classification (Python)

Overview

A step-by-step workflow to classify Iris flower species (setosa, versicolor, virginica) using scikit-learn's K-Nearest Neighbors (KNN) algorithm.

Key Steps

Data Exploration: Visualizing feature distributions and pairwise relationships.
Model Training: Implementing KNN classification.
Evaluation: Scoring model accuracy and analyzing confusion matrices.
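
The training and evaluation steps above can be sketched roughly as follows. This is a minimal illustration, not the notebooks' actual code: the 75/25 stratified split and `random_state=0` are assumptions chosen for reproducibility, and the accuracy it prints depends on that split, so it need not match the figure reported under Results.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load the 150-sample Iris dataset bundled with scikit-learn.
iris = load_iris()

# Hold out a test set (assumed 25%, stratified so each species is represented).
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=0, stratify=iris.target
)

# Fit a KNN classifier with k=3, the value quoted in this README.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# Score the held-out samples and tabulate per-species errors.
y_pred = knn.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)

print(f"Test accuracy: {accuracy:.2f}")
print("Confusion matrix (rows = true species, columns = predicted):")
print(cm)
```

Each row of the confusion matrix corresponds to a true species in the order given by `iris.target_names`, so off-diagonal entries show which species get mistaken for each other.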

Results

Achieved 96% accuracy in species prediction using KNN (k=3).

2. Statistical Analysis (R)

Overview

A rigorous statistical examination of the Iris dataset, including:

ANOVA to test for significant differences in sepal/petal measurements across species.
Tukey HSD post-hoc tests to identify which species pairs differ.
Data Visualization: Boxplots, confidence intervals, and pairwise comparison plots.
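
For readers working only in Python, the one-way ANOVA step can be approximated as below. The repository performs this analysis in R; this sketch swaps in `scipy.stats.f_oneway` and scikit-learn's bundled copy of the dataset as stand-ins, and it covers only the ANOVA on sepal length, not the Tukey HSD follow-up or the plots.

```python
from scipy.stats import f_oneway
from sklearn.datasets import load_iris

# Load the dataset; column 0 of iris.data is sepal length in cm.
iris = load_iris()
sepal_length = iris.data[:, 0]

# Split the measurements into one group per species (targets 0, 1, 2).
groups = [sepal_length[iris.target == label] for label in range(3)]

# One-way ANOVA: tests whether the species means are all equal.
f_stat, p_value = f_oneway(*groups)

print(f"F statistic: {f_stat:.2f}")
print(f"p-value: {p_value:.3g}")
```

A small p-value here only indicates that at least one species mean differs; a post-hoc test such as Tukey HSD is still needed to say which pairs differ.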

Key Findings

Sepal Length: Highly significant differences among species (ANOVA p < 0.001).
Petal Width: Virginica petals are significantly wider than those of versicolor and setosa (Tukey p < 0.01).

Why Both Approaches?

Machine Learning (Python): Focuses on predictive accuracy for species classification.
Statistical Analysis (R): Helps explain why the classifier works by quantifying how strongly the measurements differ between species.

Dependencies

Python: scikit-learn, pandas, matplotlib, Jupyter.
R: ggplot2, rstatix, dplyr.

Folders and files

.gitattributes
IrisExploration.R
IrisExploration_explained.R
Iris_ML_intro.ipynb
Iris_kneighbors_classifier.ipynb
Iris_logistic_regression.ipynb
README.md
book_Kneighbors_iris.ipynb

pablo-ferro/ML_iris_flower
