Exploratory Data Analysis (EDA) Project

This project demonstrates a simple workflow for performing exploratory data analysis (EDA) on an open dataset. We’ll download the Palmer Penguins dataset, explore it, create visualizations, and finally generate an HTML report.

📥 1. Download the Dataset

mkdir -p data notebooks reports
wget https://raw.githubusercontent.com/allisonhorst/palmerpenguins/master/inst/extdata/penguins.csv -O data/penguins.csv
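
If wget is not available (for example on a stock Windows or macOS install), the same download can be sketched with Python's standard library. The function name download_dataset and the PENGUINS_URL constant are illustrative names, not part of this project; the URL and the destination path are the ones from the command above.

```python
import shutil
from pathlib import Path
from urllib.request import urlopen

# Same URL as in the wget command above.
PENGUINS_URL = (
    "https://raw.githubusercontent.com/allisonhorst/palmerpenguins/"
    "master/inst/extdata/penguins.csv"
)


def download_dataset(url: str = PENGUINS_URL, dest: str = "data/penguins.csv") -> Path:
    """Download `url` to `dest`, creating the parent directory if needed.

    Hypothetical helper for illustration; it mirrors `mkdir -p` plus `wget -O`.
    """
    target = Path(dest)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Stream the response to disk instead of holding it all in memory.
    with urlopen(url) as response, target.open("wb") as out:
        shutil.copyfileobj(response, out)
    return target
```

Calling download_dataset() from the project root should leave the file at data/penguins.csv, the same place the wget command writes it.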

⚙️ 2. Set Up Environment

python -m venv .venv
source .venv/bin/activate   # On Windows: .venv\Scripts\activate
pip install pandas matplotlib seaborn jupyter jupyterlab ydata-profiling
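
To confirm the install worked before opening a notebook, a quick check along these lines can help. It is a minimal sketch: the REQUIRED list simply repeats the package names from the pip command, and installed_versions is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as passed to `pip install` above.
REQUIRED = ["pandas", "matplotlib", "seaborn", "jupyterlab", "ydata-profiling"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Print one line per package so a missing install is easy to spot.
for name, ver in installed_versions(REQUIRED).items():
    print(f"{name:<16} {ver or 'MISSING'}")
```

Run it with the virtual environment activated; any line showing MISSING means the corresponding pip install did not succeed in that environment.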

📊 3. First Exploration

Start Jupyter Lab:

jupyter lab

In a notebook (notebooks/01_exploration.ipynb):

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Load data
df = pd.read_csv("../data/penguins.csv")

# Quick overview
print(df.head())
df.info()  # prints its own summary and returns None, so no print() is needed
print(df.describe())

# Simple visualization
sns.pairplot(df.dropna(), hue="species")
plt.show()
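
The pairplot above drops every row with a missing value, so it is worth checking how much data that removes. The sketch below shows one way to do that, plus a per-species summary. missing_report and group_means are illustrative helper names, and the default grouping column assumes the CSV has a species column, as the hue argument above implies.

```python
import pandas as pd


def missing_report(df: pd.DataFrame) -> pd.DataFrame:
    """Count and percentage of missing values per column, worst first."""
    counts = df.isna().sum()
    report = pd.DataFrame(
        {
            "missing": counts,
            "percent": (counts / len(df) * 100).round(1),
        }
    )
    return report.sort_values("missing", ascending=False)


def group_means(df: pd.DataFrame, by: str = "species") -> pd.DataFrame:
    """Mean of every numeric column within each group."""
    return df.groupby(by).mean(numeric_only=True)
```

In the notebook, missing_report(df) and group_means(df) can be run on the DataFrame loaded earlier; comparing len(df) with len(df.dropna()) shows how many rows the pairplot leaves out.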

📑 4. Generate an HTML Report

Use ydata-profiling (formerly pandas-profiling) to automatically create an EDA report:

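from pathlib import Path
from ydata_profiling import ProfileReport

# Paths are relative to the notebook's folder (notebooks/), as with the CSV above
Path("../reports").mkdir(parents=True, exist_ok=True)

profile = ProfileReport(df, title="Penguins EDA Report", explorative=True)
profile.to_file("../reports/penguins_report.html")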

The report will be saved at:

reports/penguins_report.html
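
A relative path passed to to_file is resolved against the directory the notebook runs in, so the report can land in different places depending on where Jupyter was started. One possible way to pin it to the project root is sketched below. find_project_root and report_path are hypothetical helpers, and using the data folder as the root marker is an assumption based on the layout in step 1.

```python
from pathlib import Path


def find_project_root(start: Path, marker: str = "data") -> Path:
    """Walk upward from `start` until a directory containing `marker` is found."""
    start = start.resolve()
    for candidate in [start, *start.parents]:
        if (candidate / marker).is_dir():
            return candidate
    raise FileNotFoundError(f"No parent of {start} contains a '{marker}' directory")


def report_path(start=None, name: str = "penguins_report.html", marker: str = "data") -> Path:
    """Return <project root>/reports/<name>, creating reports/ if needed."""
    root = find_project_root(Path.cwd() if start is None else Path(start), marker)
    reports = root / "reports"
    reports.mkdir(exist_ok=True)
    return reports / name
```

With this in place, profile.to_file(report_path()) should write to the reports folder at the project root whether the notebook is run from notebooks/ or from the root itself.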
