The Machine Learning Sandbox App helps users process CSV data and run machine learning tasks. Users can upload a CSV file, select a target variable, choose features from the remaining columns, configure preprocessing parameters, and evaluate models.
Co-Author: Malik Zekri
- Create a virtual environment:
conda create -n ml_app python=3.9
conda activate ml_app
- Install Requirements:
pip install -r requirements.txt
- Run Application:
python3 app.py
- Upload a Dataset: Click the "Upload CSV File" button and select a CSV file from your computer.
- Select Target Variable: Choose the target variable from the dropdown menu. The app displays whether the target variable is discrete or continuous.
- Select Features: Select the feature columns using the checkboxes. The "Select All" and "Deselect All" buttons select or deselect every feature at once.
- Configure Preprocessing Parameters: Adjust the preprocessing parameters: imputation, removal of invariant features, outlier handling, removal of linearly dependent features (set the VIF threshold), and encoding method.
- Submit: Click the "Submit" button to preprocess the data, train machine learning models, run inference (classification or regression, depending on the target variable type), and evaluate the results. The results are displayed, and the processed data is saved to the output directory.
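The target-type check and the invariant-feature removal described in the steps above can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the app's actual code: the helper names (`read_columns`, `infer_target_type`, `invariant_columns`), the `max_classes` cutoff, and the sample data are all hypothetical, and the app may use a different rule to decide between discrete and continuous.

```python
import csv
import io


def read_columns(csv_text):
    """Parse CSV text into a dict mapping column name -> list of raw strings."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name in columns:
            columns[name].append(row[name])
    return columns


def is_number(value):
    """Return True if the raw string can be parsed as a float."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def infer_target_type(values, max_classes=10):
    """Assumed rule: non-numeric values, or at most `max_classes` distinct
    values, mean the target is discrete; otherwise it is continuous."""
    present = [v for v in values if v not in ("", None)]
    if not all(is_number(v) for v in present):
        return "discrete"
    return "discrete" if len(set(present)) <= max_classes else "continuous"


def invariant_columns(columns):
    """Return the names of columns that hold a single distinct value."""
    return [name for name, vals in columns.items() if len(set(vals)) <= 1]


# Made-up sample data, for illustration only.
sample = """age,income,constant,attrition
34,52000.5,1,Yes
41,61000.0,1,No
29,48000.25,1,No
50,75000.75,1,Yes
38,58000.1,1,No
"""

cols = read_columns(sample)
print(infer_target_type(cols["attrition"]))            # text labels
print(infer_target_type(cols["income"], max_classes=3))  # many numeric values
print(invariant_columns(cols))                          # columns to drop
```

A discrete target would lead to classification and a continuous one to regression, matching the Submit step. The cutoff is a judgment call: a numeric column with only a handful of distinct values (for example a 0/1 flag) is usually better treated as a class label.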
This repository includes some open-source CSV files located in the data folder. These files can be used to test the application.
Included Files:
- data/Employee-Attrition.csv: Pulled from IBM HR Analytics Employee Attrition & Performance
- data/loan_approval_dataset.csv: Pulled from Loan Approval Prediction Dataset
- data/student_performance_data.csv: Pulled from 📚 Student Performance Dataset 📚
Usage of Data Files: You can use these data files by selecting them when uploading a CSV file in the application. They are provided for testing and demonstration purposes.
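Before uploading one of these files, it can help to see which columns it contains, so you know what is available as a target or feature. The snippet below is a small sketch using only the standard library; `list_columns` is a hypothetical helper and not part of the app, and it assumes the file has a header row and that you run it from the repository root.

```python
import csv
import os


def list_columns(path):
    """Return the header row of a CSV file as a list of column names."""
    # utf-8-sig strips a byte-order mark if the file starts with one.
    with open(path, newline="", encoding="utf-8-sig") as handle:
        return next(csv.reader(handle), [])


if __name__ == "__main__":
    for name in (
        "data/Employee-Attrition.csv",
        "data/loan_approval_dataset.csv",
        "data/student_performance_data.csv",
    ):
        # Skip files that are missing from the current working directory.
        if os.path.exists(name):
            print(name, list_columns(name))
```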
This project is licensed under the MIT License.