I'm a Data Scientist with a strong focus on AI-driven solutions and a background in Data Analysis. I build end-to-end systems that combine data engineering, machine learning, and Generative AI to improve process automation, model predictions, and business decisions.
Location: Bellevue, Washington
This repository highlights selected projects that demonstrate my skills in AI-focused data science.
- Machine Learning & Deep Learning
- Generative AI & LLM applications
- Data Engineering with Spark & Databricks
- End-to-end AI systems (data → model → deployment)
- Languages: Python, SQL, R, DAX
- ML / AI: Scikit-learn, TensorFlow, PyTorch, LLMs, Prophet
- Big Data: Spark, Databricks, Delta Lake
- MLOps & Deployment: FastAPI, Docker, GitHub Actions, MLflow
- Visualization: Matplotlib, Seaborn, Streamlit
Below are selected projects designed to mirror real-world AI data science work, from data ingestion to model diagnostics and AI-powered insights.
Tech: Python, FastAPI, Pandas, LLMs
- Built an end-to-end ML analysis pipeline with automated performance summaries
- Combined traditional metrics with AI-generated insights for decision support
- Emphasized model evaluation, explainability, and stakeholder-ready outputs
Repo: automated-business-insights
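
As a rough illustration of combining traditional metrics with AI-generated insights, the sketch below computes standard classification metrics and formats them into a prompt that could be passed to an LLM. It is not the repository's code: `compute_metrics`, `build_insight_prompt`, the model name `churn_classifier`, and the toy labels are all hypothetical, and the LLM call itself is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class ClassificationMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def compute_metrics(y_true, y_pred):
    """Compute binary classification metrics from 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return ClassificationMetrics(accuracy, precision, recall, f1)


def build_insight_prompt(model_name, metrics):
    """Format metrics into a prompt; sending it to an LLM is left to the caller."""
    return (
        "You are assisting a business stakeholder.\n"
        f"Model: {model_name}\n"
        f"Accuracy: {metrics.accuracy:.2f}\n"
        f"Precision: {metrics.precision:.2f}\n"
        f"Recall: {metrics.recall:.2f}\n"
        f"F1: {metrics.f1:.2f}\n"
        "Summarize the model's strengths and weaknesses in plain language."
    )


# Toy labels for illustration only (hypothetical data).
metrics = compute_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
prompt = build_insight_prompt("churn_classifier", metrics)
print(prompt)
```

Keeping the metric computation separate from the prompt text means the numbers shown to stakeholders come from deterministic code rather than from the LLM.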
Tech: Python, Statsmodels, MLflow
- Implemented multiple forecasting models and compared performance
- Added residual diagnostics, model fit summaries, and error analysis
- Tracked experiments and metrics using MLflow
Repo: time-series-forecasting-project
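
The comparison workflow described above can be sketched in a few lines. This example uses only the Python standard library and two deliberately simple baselines (a naive forecast and a moving average) on made-up data. The project itself lists Statsmodels and MLflow, so treat the function names and numbers here as illustrative assumptions, not the repository's implementation.

```python
import math
from statistics import mean


def naive_forecast(train, horizon):
    """Repeat the last observed value for every future step."""
    return [train[-1]] * horizon


def moving_average_forecast(train, horizon, window=3):
    """Repeat the mean of the last `window` observations."""
    return [mean(train[-window:])] * horizon


def residuals(actual, predicted):
    """Actual minus predicted; a positive mean indicates under-forecasting."""
    return [a - p for a, p in zip(actual, predicted)]


def mae(actual, predicted):
    return mean(abs(r) for r in residuals(actual, predicted))


def rmse(actual, predicted):
    return math.sqrt(mean(r ** 2 for r in residuals(actual, predicted)))


# Toy series for illustration only (hypothetical data).
series = [10, 12, 13, 12, 15, 16, 18, 17]
train, test = series[:6], series[6:]

candidates = {
    "naive": naive_forecast(train, len(test)),
    "moving_average": moving_average_forecast(train, len(test)),
}

scores = {
    name: {
        "mae": mae(test, forecast),
        "rmse": rmse(test, forecast),
        "mean_residual": mean(residuals(test, forecast)),
    }
    for name, forecast in candidates.items()
}

# Pick the candidate with the lowest RMSE on the held-out window.
best = min(scores, key=lambda name: scores[name]["rmse"])
print(best, scores[best])
```

In a tracked setup, each entry in `scores` would be logged as the metrics of one experiment run so the candidates can be compared side by side.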
Tech: PySpark, Delta Lake, Databricks
- Built scalable data pipelines supporting ML workloads
- Implemented window functions, incremental processing, and Delta time travel
- Designed architecture with production ML systems in mind
Repo: spark-databricks-pipeline
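
To make the incremental-processing and window-function ideas concrete without requiring a Spark cluster, here is a plain-Python sketch. It emulates a watermark-based incremental load followed by a running total per partition, roughly what `SUM(...) OVER (PARTITION BY ... ORDER BY ...)` computes. The helper names, column names, and rows are hypothetical; in the actual pipeline this logic would presumably be written with PySpark DataFrames and Delta tables.

```python
from datetime import datetime


def incremental_batch(rows, last_watermark):
    """Return rows newer than the watermark, plus the advanced watermark."""
    new_rows = [r for r in rows if r["updated_at"] > last_watermark]
    new_watermark = max(
        (r["updated_at"] for r in new_rows), default=last_watermark
    )
    return new_rows, new_watermark


def running_total(rows, key, order_by, value):
    """Emulate SUM(value) OVER (PARTITION BY key ORDER BY order_by)."""
    totals = {}
    out = []
    for r in sorted(rows, key=lambda r: (r[key], r[order_by])):
        totals[r[key]] = totals.get(r[key], 0) + r[value]
        out.append({**r, "running_total": totals[r[key]]})
    return out


# Toy rows for illustration only (hypothetical data).
rows = [
    {"store": "A", "updated_at": datetime(2024, 1, 1), "sales": 100},
    {"store": "A", "updated_at": datetime(2024, 1, 2), "sales": 50},
    {"store": "B", "updated_at": datetime(2024, 1, 2), "sales": 70},
    {"store": "A", "updated_at": datetime(2024, 1, 3), "sales": 30},
]

# Only rows updated after the stored watermark are processed.
batch, watermark = incremental_batch(rows, datetime(2024, 1, 1))
result = running_total(batch, "store", "updated_at", "sales")

# A second run with the advanced watermark finds nothing new.
empty_batch, _ = incremental_batch(rows, watermark)
print(len(batch), watermark, len(empty_batch))
```

Persisting the watermark between runs is what keeps each run limited to new data instead of reprocessing the full table.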
Tech: Python, LangChain, OpenAI API, Streamlit
- Used LLMs to transform raw analytical outputs into human-readable insights, summaries, and reports
Repo: llm-insight-generator
- Applied ML & Generative AI (LLMs, prompt design, evaluation)
- Model training, performance metrics, and diagnostics
- Big data processing with Spark & Delta Lake
- AI-enabled APIs & backend systems
- Clear documentation, reproducibility, and stakeholder communication
⭐ If you find these projects useful, feel free to star the repos!