innichang/wine-quality-data-analysis

Wine Quality Prediction

A machine learning project that predicts wine quality from physicochemical properties. It uses the Wine Quality Dataset from Kaggle: https://www.kaggle.com/datasets/yasserh/wine-quality-dataset

Project Structure

wine-quality-prediction/
├── data/                   # Dataset directory
│   └── wine_quality.csv   # Wine dataset (to be added)
├── models/                 # Trained models
├── src/                    # Source code
│   ├── data_loader.py     # Data loading and preprocessing
│   ├── model_trainer.py   # Model training utilities
│   └── predictor.py       # Prediction utilities
├── notebooks/              # Jupyter notebooks
│   └── wine_analysis.ipynb # Data analysis notebook
├── train_model.py         # Main training script
└── requirements.txt       # Python dependencies

Setup

  1. Install dependencies:

    pip install -r requirements.txt

  2. Add your dataset:

    • Place your wine quality dataset at data/wine_quality.csv
    • Expected format: CSV with one column per wine attribute plus a 'quality' column
    • Expected attribute columns: fixed_acidity, volatile_acidity, citric_acid, residual_sugar, chlorides, free_sulfur_dioxide, total_sulfur_dioxide, density, pH, sulphates, alcohol
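
Before training, it can help to confirm that the CSV header matches the expected columns. The sketch below is not part of the project: `EXPECTED_COLUMNS`, `normalize`, and `missing_columns` are hypothetical names. It assumes a downloaded copy may use spaces in its column names (for example `fixed acidity`), so names are normalized to underscores before comparison.

```python
import csv
import io

# Column names taken from the list of expected attributes above.
EXPECTED_COLUMNS = [
    "fixed_acidity", "volatile_acidity", "citric_acid", "residual_sugar",
    "chlorides", "free_sulfur_dioxide", "total_sulfur_dioxide", "density",
    "pH", "sulphates", "alcohol", "quality",
]


def normalize(name: str) -> str:
    """Trim whitespace and replace inner runs of spaces with underscores."""
    return "_".join(name.strip().split())


def missing_columns(csv_file) -> list[str]:
    """Return the expected columns that are absent from the CSV header row."""
    header = next(csv.reader(csv_file), [])
    present = {normalize(col) for col in header}
    return [col for col in EXPECTED_COLUMNS if col not in present]


# Usage with an in-memory sample; for a real file, pass
# open("data/wine_quality.csv", newline="") instead.
sample = io.StringIO("fixed acidity,volatile acidity,pH,quality\n7.4,0.7,3.51,5\n")
print("Missing columns:", missing_columns(sample))
```

An empty list means every expected column is present. Extra columns, such as an ID column, are ignored by this check.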

Usage

Training Models

Run the main training script:

python train_model.py

This will:

  • Load and preprocess the data
  • Train multiple ML models (Random Forest, Gradient Boosting, Linear Regression, SVR)
  • Compare model performance
  • Save the best model to models/
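
The comparison step could look roughly like the following. This is a minimal sketch using scikit-learn, not the contents of train_model.py: the function name `compare_models`, the hyperparameters, the RMSE metric, and the synthetic data are all assumptions made for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def compare_models(X, y, cv=5):
    """Return {model name: mean cross-validated RMSE}; lower is better."""
    candidates = {
        "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
        "gradient_boosting": GradientBoostingRegressor(random_state=0),
        "linear_regression": LinearRegression(),
        # SVR is sensitive to feature scale, so standardize inputs first.
        "svr": make_pipeline(StandardScaler(), SVR()),
    }
    scores = {}
    for name, model in candidates.items():
        neg_rmse = cross_val_score(
            model, X, y, cv=cv, scoring="neg_root_mean_squared_error"
        )
        scores[name] = -neg_rmse.mean()
    return scores


# Synthetic stand-in for the 11 wine attributes; replace with the real data.
X, y = make_regression(n_samples=200, n_features=11, noise=10.0, random_state=0)
scores = compare_models(X, y)
best = min(scores, key=scores.get)
for name, rmse in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: RMSE={rmse:.3f}")
print("best:", best)
```

The winning model could then be refit on the full training set and persisted, for instance with joblib.dump, so that the predictor can load it later.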

Making Predictions

from src.predictor import WineQualityPredictor

# Initialize predictor with trained model
predictor = WineQualityPredictor('models/wine_quality_model.joblib')

# Predict single wine quality
wine_features = {
    'fixed_acidity': 7.4,
    'volatile_acidity': 0.7,
    'citric_acid': 0.0,
    'residual_sugar': 1.9,
    'chlorides': 0.076,
    'free_sulfur_dioxide': 11.0,
    'total_sulfur_dioxide': 34.0,
    'density': 0.9978,
    'pH': 3.51,
    'sulphates': 0.56,
    'alcohol': 9.4
}

predicted_quality = predictor.predict_single_wine(wine_features)
print(f"Predicted quality: {predicted_quality:.2f}")
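
How the predictor turns the dictionary into model input is not shown here. One plausible approach, sketched below with the hypothetical names `FEATURE_ORDER` and `to_feature_row`, is to validate the keys and emit the values in a fixed column order before calling the model.

```python
# Assumed column order; it must match the order used when the model was trained.
FEATURE_ORDER = [
    "fixed_acidity", "volatile_acidity", "citric_acid", "residual_sugar",
    "chlorides", "free_sulfur_dioxide", "total_sulfur_dioxide", "density",
    "pH", "sulphates", "alcohol",
]


def to_feature_row(features: dict) -> list[float]:
    """Validate the feature names and return the values in FEATURE_ORDER."""
    missing = [name for name in FEATURE_ORDER if name not in features]
    unexpected = [name for name in features if name not in FEATURE_ORDER]
    if missing or unexpected:
        raise ValueError(f"missing={missing}, unexpected={unexpected}")
    return [float(features[name]) for name in FEATURE_ORDER]


example = {
    "alcohol": 9.4, "sulphates": 0.56, "pH": 3.51, "density": 0.9978,
    "total_sulfur_dioxide": 34.0, "free_sulfur_dioxide": 11.0,
    "chlorides": 0.076, "residual_sugar": 1.9, "citric_acid": 0.0,
    "volatile_acidity": 0.7, "fixed_acidity": 7.4,
}
# Key order in the dict does not matter; the row follows FEATURE_ORDER.
print(to_feature_row(example))
```

Failing early on a missing or misspelled key avoids silently feeding the model values in the wrong columns.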

Data Analysis

Open the Jupyter notebook for interactive analysis:

jupyter notebook notebooks/wine_analysis.ipynb

Models

The project includes several ML algorithms:

  • Random Forest: Ensemble method, good baseline
  • Gradient Boosting: Often performs well on tabular data
  • Linear Regression: Simple interpretable model
  • SVR: Support Vector Regression

Features

  • Data Loading: Flexible CSV loading with preprocessing
  • Model Training: Multiple algorithms with cross-validation
  • Model Comparison: Automatic comparison of different models
  • Feature Importance: Analysis of which features matter most
  • Prediction Confidence: Uncertainty estimation for ensemble models
  • Batch Prediction: Process multiple wines at once
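
For ensemble models, one common way to estimate uncertainty is the spread of the individual trees' predictions. The sketch below illustrates that idea for a random forest on synthetic data. It is an assumption about how such a feature might work, not the project's implementation, and `predict_with_spread` is a hypothetical helper. It also prints scikit-learn's impurity-based feature importances.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor


def predict_with_spread(forest, X):
    """Return (mean, std) of the per-tree predictions for each row of X."""
    per_tree = np.stack([tree.predict(X) for tree in forest.estimators_])
    return per_tree.mean(axis=0), per_tree.std(axis=0)


# Synthetic stand-in for the 11 wine attributes; replace with the real data.
X, y = make_regression(n_samples=200, n_features=11, noise=10.0, random_state=0)
forest = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Batch call: several rows are scored at once.
mean, spread = predict_with_spread(forest, X[:3])
for m, s in zip(mean, spread):
    print(f"prediction={m:.2f} +/- {s:.2f}")

# Impurity-based importances: one value per input column, summing to 1.
ranked = np.argsort(forest.feature_importances_)[::-1]
print("columns ranked by importance:", ranked.tolist())
```

A large spread means the trees disagree about that sample. It is a rough indicator, not a calibrated confidence interval.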

Dataset

Expected wine attributes. Units are those documented for the UCI Wine Quality data; verify them against your own file:

  • fixed_acidity: Fixed acidity (tartaric acid, g/dm³)
  • volatile_acidity: Volatile acidity (acetic acid, g/dm³)
  • citric_acid: Citric acid (g/dm³)
  • residual_sugar: Residual sugar (g/dm³)
  • chlorides: Chlorides (sodium chloride, g/dm³)
  • free_sulfur_dioxide: Free sulfur dioxide (mg/dm³)
  • total_sulfur_dioxide: Total sulfur dioxide (mg/dm³)
  • density: Density (g/cm³)
  • pH: pH
  • sulphates: Sulphates (potassium sulphate, g/dm³)
  • alcohol: Alcohol (% by volume)
  • quality: Target variable, the wine quality score (0 to 10 scale)
