nunesdev/data-science


📊 Data Science & AI Portfolio



👨‍💻 About Me

Hi, I’m Bruno Reis, a software developer with 15+ years of experience, currently pursuing a Bachelor’s Degree in Data Science & Artificial Intelligence at Instituto Infnet, Brazil.

This repository showcases my journey in bridging my software engineering background with data-driven problem solving, focusing on statistical analysis, data engineering, and machine learning.


📂 Featured Projects

Description: Introduction to supervised learning through classification tasks.

  • Key Techniques: KNN algorithm (Instance-based learning), Binary vs. Multiclass classification, and performance evaluation using Accuracy and Confusion Matrices.
  • Goal: Identifying species based on biological measurements.
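A minimal sketch of the workflow described above, assuming scikit-learn's bundled Iris dataset as the species data. The dataset choice, the 70/30 split, and `n_neighbors=5` are illustrative assumptions, not values taken from the project.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: Iris stands in for the project's species measurements.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# k=5 is an illustrative choice, not a tuned value.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)  # instance-based: fitting mostly stores the samples
y_pred = knn.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)  # rows: true class, columns: predicted

print(f"accuracy = {accuracy:.3f}")
print(cm)
```

The diagonal of the confusion matrix counts correct predictions per species, so accuracy equals the trace divided by the total number of test samples.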

Description: A deep dive into the mathematical preparation of data for Machine Learning.

  • Key Techniques: Data discretization (Binning), Power Transformations (Yeo-Johnson), Scikit-Learn FunctionTransformer pipelines, and L2-Norm (Ridge) regularization.
  • Goal: Transforming raw data into optimized feature vectors.
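The techniques listed above could be combined roughly as follows. This is a sketch on synthetic right-skewed data. The bin count, the `np.log1p` step, and the use of `Normalizer` are assumptions: the L2 norm is read here as row-wise scaling to unit length, while the L2 penalty of Ridge regression would be applied later, at the modelling step.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import (
    FunctionTransformer,
    KBinsDiscretizer,
    Normalizer,
    PowerTransformer,
)

# Synthetic, right-skewed features (assumption: no project data is used here).
rng = np.random.default_rng(0)
X = rng.lognormal(mean=0.0, sigma=1.0, size=(200, 2))

# Binning: map each feature to one of 4 quantile-based ordinal bins.
binner = KBinsDiscretizer(n_bins=4, encode="ordinal", strategy="quantile")
X_binned = binner.fit_transform(X)

# Pipeline: custom function -> Yeo-Johnson -> row-wise L2 normalization.
pipeline = make_pipeline(
    FunctionTransformer(np.log1p),            # wraps a plain function as a step
    PowerTransformer(method="yeo-johnson"),   # reduces skew, then standardizes
    Normalizer(norm="l2"),                    # scales each row to unit length
)
X_out = pipeline.fit_transform(X)

print(np.unique(X_binned))
print(np.linalg.norm(X_out, axis=1)[:5])
```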

Description: Comprehensive data quality diagnosis of the kc_house_data.csv dataset.

  • Key Techniques: Outlier detection (IQR, MAD, Z-Score), referential integrity verification, and logical consistency validation.
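The three outlier rules named above can be compared on a small sample. The values below are invented for illustration and do not come from kc_house_data.csv. The thresholds (1.5 × IQR, |z| > 3, modified z > 3.5) are common conventions, assumed here rather than taken from the project.

```python
import numpy as np

# Hypothetical sample with one obvious outlier (95.0).
values = np.array([10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 10.0, 14.0, 13.0, 95.0])

# IQR rule: flag points beyond 1.5 * IQR outside the quartiles.
q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1
iqr_mask = (values < q1 - 1.5 * iqr) | (values > q3 + 1.5 * iqr)

# Z-score rule: flag points more than 3 standard deviations from the mean.
z = (values - values.mean()) / values.std()
z_mask = np.abs(z) > 3

# MAD rule: modified z-score built from the median absolute deviation.
median = np.median(values)
mad = np.median(np.abs(values - median))
modified_z = 0.6745 * (values - median) / mad
mad_mask = np.abs(modified_z) > 3.5

print("IQR:    ", values[iqr_mask])
print("Z-score:", values[z_mask])
print("MAD:    ", values[mad_mask])
```

On this sample the IQR and MAD rules both flag 95.0, while the Z-score rule flags nothing: the outlier inflates the mean and standard deviation enough to hide itself. This masking effect is one reason to check median-based rules alongside the Z-score.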

Description: Sales and profitability analysis using SQL queries (PostgreSQL).

  • Highlights: Time-based trends with DATE_TRUNC(), regional performance rankings, and customer growth tracking with window functions (LAG).
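The project itself uses PostgreSQL. As a rough pandas analogue of the same idea, truncating dates to the month plays the role of `DATE_TRUNC('month', ...)` and `shift(1)` plays the role of `LAG(...) OVER (ORDER BY month)`. The table and column names (`orders`, `order_date`, `sales`, `customer_id`) are hypothetical.

```python
import pandas as pd

# Hypothetical orders table; the values are invented for illustration.
orders = pd.DataFrame(
    {
        "order_date": pd.to_datetime(
            ["2024-01-05", "2024-01-20", "2024-02-03", "2024-02-25", "2024-03-10"]
        ),
        "customer_id": [1, 2, 1, 3, 4],
        "sales": [100.0, 50.0, 80.0, 120.0, 90.0],
    }
)

# Analogue of DATE_TRUNC('month', order_date): first day of each month.
orders["month"] = orders["order_date"].dt.to_period("M").dt.to_timestamp()

monthly = (
    orders.groupby("month", as_index=False)
    .agg(total_sales=("sales", "sum"), customers=("customer_id", "nunique"))
    .sort_values("month")
)

# Analogue of LAG(total_sales) OVER (ORDER BY month).
monthly["prev_sales"] = monthly["total_sales"].shift(1)
monthly["growth_pct"] = (monthly["total_sales"] / monthly["prev_sales"] - 1) * 100

print(monthly)
```

The first month has no predecessor, so its `prev_sales` and `growth_pct` are missing, just as `LAG` returns NULL for the first row of a window.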

Description: Exploratory Data Analysis (EDA) of music popularity patterns.

  • Highlights: Correlation analysis and genre distribution visualization using Seaborn and Matplotlib.

🛠 Tech Stack

| Area | Tools & Technologies |
| --- | --- |
| Programming | Python (Pandas, NumPy, Scikit-Learn), SQL |
| Data Engineering | Feature Transformation, Binning, Data Cleaning, Regularization |
| Mathematics/ML | Supervised Learning, Statistical Distributions, Linear Algebra foundations |
| Visualization | Matplotlib, Seaborn |
| Background | 15+ years in software development |

🚀 Next Steps

  • Expand with projects in Unsupervised Learning (Clustering).
  • Deepen studies in Neural Networks and Deep Learning.
  • Implement Advanced Feature Selection techniques.
  • Build interactive dashboards with Streamlit.

✍️ Bruno Nunes Reis

About

My data science projects / models
