EDA Helper is a Python package designed to streamline your Exploratory Data Analysis (EDA) process. It provides a collection of helper functions to quickly analyze, visualize, and summarize datasets. Whether you're working with numeric, categorical, or datetime data, this package has you covered!
This package is inspired by the brilliant work of @MisbahullahSheriff, who created the original EDA helper functions. I (@shemanto27) have organized them for easier use and added further functions and improvements.
You can install the package via pip:
pip install my_eda_helper

For Google Colab users, install it directly in your notebook:
!pip install my_eda_helper

Import the package:
import my_eda_helper as eda

Find Missing Values:
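The snippets in this README assume a pandas DataFrame named `df`. As a minimal sketch, the block below builds a small frame with the Titanic-style column names used here (the rows are made up for illustration, not data shipped with the package) and counts missing values per column with plain pandas. This only approximates the kind of information a missing-value report contains; the package's own output may be laid out differently.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows; only the column names come from this README.
df = pd.DataFrame(
    {
        "Age": [22.0, 38.0, np.nan, 35.0, np.nan],
        "Fare": [7.25, 71.28, 7.92, 53.10, 8.05],
        "Sex": ["male", "female", "female", "female", "male"],
        "Embarked": ["S", "C", "S", None, "S"],
        "Survived": [0, 1, 1, 1, 0],
    }
)

# Count and percentage of missing values per column, plain pandas only.
missing_count = df.isna().sum()
missing_pct = (df.isna().mean() * 100).round(1)
summary = pd.DataFrame({"count": missing_count, "percent": missing_pct})

# Keep only the columns that actually have missing values.
print(summary[summary["count"] > 0])
```

With a DataFrame like this bound to `df`, the package call is: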
missing_data = eda.missing_info(df)
print(missing_data)

Plot Missing Data:
eda.plot_missing_info(df)

Numeric Features (Pearson/Spearman):
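Pearson correlation measures linear association, while Spearman works on ranks and so captures any monotonic relationship. As a small illustration in plain pandas (not the package's implementation; the frame `demo` and its columns are made up), `DataFrame.corr` can compute both:

```python
import pandas as pd

# Hypothetical data: y grows monotonically with x, but not linearly.
demo = pd.DataFrame({"x": [1, 2, 3, 4, 5]})
demo["y"] = demo["x"] ** 3

# Restrict to numeric columns before computing correlations.
numeric = demo.select_dtypes(include="number")

pearson = numeric.corr(method="pearson")
spearman = numeric.corr(method="spearman")

print(pearson.loc["x", "y"])   # below 1: the relationship is not linear
print(spearman.loc["x", "y"])  # 1 up to rounding: the ranks agree exactly
```

The package call for the heatmap is: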
eda.correlation_heatmap(df)

Categorical Features (Cramér's V):
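Cramér's V summarizes the association between two categorical columns as a value between 0 (no association) and 1 (perfect association), derived from the chi-squared statistic of their contingency table. A minimal pure-Python sketch of the uncorrected formula follows; the function name `cramers_v` is hypothetical, and the package may compute the statistic differently (for example, with a bias correction).

```python
import math
from collections import Counter


def cramers_v(x, y):
    """Uncorrected Cramér's V for two equal-length sequences of labels."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")

    observed = Counter(zip(x, y))  # joint counts per (x label, y label)
    row_totals = Counter(x)
    col_totals = Counter(y)

    # Chi-squared statistic: sum over cells of (observed - expected)^2 / expected.
    chi2 = 0.0
    for r, r_count in row_totals.items():
        for c, c_count in col_totals.items():
            expected = r_count * c_count / n
            chi2 += (observed[(r, c)] - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association (this sketch's convention)
    return math.sqrt(chi2 / (n * k))


sex = ["male", "male", "female", "female"]
print(cramers_v(sex, ["no", "no", "yes", "yes"]))  # perfectly associated: 1.0
print(cramers_v(sex, ["no", "yes", "no", "yes"]))  # independent: 0.0
```

The package call that plots this statistic for `df` as a heatmap: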
eda.cramersV_heatmap(df)

Pair Plots:
eda.pair_plots(df)

Numeric Summary:
eda.num_summary(df, "Age")

Numeric Univariate Plots:
eda.num_univar_plots(df, "Fare")

Numeric Bivariate Plots:
eda.num_bivar_plots(df, "Age", "Fare")

Categorical Summary:
eda.cat_summary(df, "Sex")

Categorical Univariate Plots:
eda.cat_univar_plots(df, "Embarked")

Categorical Bivariate Plots (Numeric vs Categorical):
eda.num_cat_bivar_plots(df, "Fare", "Sex")

Hypothesis Testing (Numeric vs Numeric):
eda.num_num_hyp_testing(df, "Age", "Fare")

Hypothesis Testing (Numeric vs Categorical):
eda.num_cat_hyp_testing(df, "Fare", "Sex")

Hypothesis Testing (Categorical vs Categorical):
eda.hyp_cat_cat(df, "Sex", "Survived")

Contributions are welcome! If you have ideas for new features, improvements, or bug fixes, please feel free to:
- Fork the repository.
- Create a new branch:
git checkout -b feature/YourFeatureName
- Commit your changes:
git commit -m 'Add some feature'
- Push to the branch:
git push origin feature/YourFeatureName
- Open a pull request.
Please ensure your code follows the project's style and includes appropriate tests.
This project is licensed under the MIT License. See the LICENSE file for details.
If you have any questions, suggestions, or issues, please open an issue on the GitHub repository.
Happy EDA! 🎉