kamiwaza-exp/eda_demo
Exploratory Data Analysis (EDA) Project

This project demonstrates a simple workflow for performing exploratory data analysis (EDA) on an open dataset. We’ll download the Palmer Penguins dataset, explore it, create visualizations, and finally generate an HTML report.


📥 1. Download the Dataset

mkdir -p data
wget https://raw.githubusercontent.com/allisonhorst/palmerpenguins/master/inst/extdata/penguins.csv -O data/penguins.csv
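If wget is not available (for example on a stock Windows install), the same download can be sketched with the Python standard library. This is a minimal sketch, not part of the original project: the function name download_dataset and the constant PENGUINS_URL are hypothetical, and the URL and target path are copied from the command above.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Same URL as the wget command above.
PENGUINS_URL = (
    "https://raw.githubusercontent.com/allisonhorst/palmerpenguins/"
    "master/inst/extdata/penguins.csv"
)


def download_dataset(url: str = PENGUINS_URL, dest: str = "data/penguins.csv") -> Path:
    """Download `url` to `dest`, creating the parent folder if needed.

    Hypothetical helper: skips the download when the file already exists.
    """
    target = Path(dest)
    target.parent.mkdir(parents=True, exist_ok=True)  # equivalent of `mkdir -p data`
    if not target.exists():
        urlretrieve(url, target)
    return target
```

Calling download_dataset() from the project root should leave the file at data/penguins.csv, the path the notebook later reads.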

βš™οΈ 2. Setup Environment

python -m venv .venv
source .venv/bin/activate   # On Windows: .venv\Scripts\activate
pip install pandas matplotlib seaborn jupyter jupyterlab ydata-profiling

📊 3. First Exploration

Start Jupyter Lab:

jupyter lab

In a notebook (notebooks/01_exploration.ipynb):

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Load data
df = pd.read_csv("../data/penguins.csv")

# Quick overview
print(df.head())
df.info()  # info() prints its summary itself and returns None
print(df.describe())

# Simple visualization
sns.pairplot(df.dropna(), hue="species")
plt.show()
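The pairplot above calls dropna(), which silently discards every row with a missing value, so it can help to see how much data that removes. The sketch below is illustrative only: missing_summary is a hypothetical helper, and the small example frame uses made-up values rather than the real dataset.

```python
import pandas as pd


def missing_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Count and percentage of missing values per column, worst first."""
    counts = df.isna().sum()
    summary = pd.DataFrame(
        {
            "missing": counts,
            "percent": (counts / len(df) * 100).round(1),
        }
    )
    # Keep only the columns that actually have gaps
    return summary[summary["missing"] > 0].sort_values("missing", ascending=False)


# Tiny made-up frame for illustration; in the notebook, pass the loaded df instead
example = pd.DataFrame(
    {
        "species": ["Adelie", "Gentoo", "Chinstrap", "Adelie"],
        "bill_length_mm": [39.1, None, 46.5, None],
        "sex": ["male", None, "female", "female"],
    }
)
print(missing_summary(example))
```

In the notebook, comparing len(df) with len(df.dropna()) gives the number of rows the plot leaves out.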

📑 4. Generate an HTML Report

Use ydata-profiling (formerly pandas-profiling) to automatically create an EDA report:

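from pathlib import Path
from ydata_profiling import ProfileReport

# Assumes the notebook runs from notebooks/, as with the data path above
Path("../reports").mkdir(exist_ok=True)

profile = ProfileReport(df, title="Penguins EDA Report", explorative=True)
profile.to_file("../reports/penguins_report.html")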

The report will be saved at:

reports/penguins_report.html
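To confirm the report was written and view it, one option is to resolve the path and open it in the default browser. This is a small sketch using only the standard library: report_uri is a hypothetical helper, and the default path assumes the code runs from the project root.

```python
from pathlib import Path
import webbrowser


def report_uri(report_path: str = "reports/penguins_report.html") -> str:
    """Return a file:// URI for the report, or raise if it was not generated."""
    path = Path(report_path).resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Report not found: {path}")
    return path.as_uri()


# Uncomment to open the report in the default browser:
# webbrowser.open(report_uri())
```

From a notebook inside notebooks/, pass "../reports/penguins_report.html" instead of relying on the default.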
