Data Scientist | NLP Engineer | Machine Learning & MLOps Specialist
📧 [email protected] • LinkedIn • GitHub
📍 Ghent, Belgium • 📞 (+32) 4 707 569 17
Results-driven Data Scientist with deep expertise in Machine Learning, Natural Language Processing, and MLOps. I specialize in developing, fine-tuning, and deploying AI solutions, especially in NLP (LLMs, RAG) and Recommender Systems. My core strength is end-to-end MLOps on Azure and AWS: building reliable CI/CD pipelines, optimizing AI endpoints, and managing scalable vector databases. Passionate about delivering production-grade solutions that drive real-world impact.
- MSc Statistical Data Analysis
  Ghent University, Belgium
- MTech Structural Engineering
  Vellore Institute of Technology, India | CGPA: 8.7
- BE Civil Engineering
  Anna University, India | CGPA: 7.5
- Developed & deployed production-ready NLP models (LLMs) using Azure ML for training, fine-tuning, and scalable endpoint creation.
- Reduced core app latency by 30% via advanced quantization and performance optimization.
- Fine-tuned embedding/re-ranker models, improving recommendation relevance by 14%.
- Built an extreme multi-label classification model for precise skill classification.
- Implemented RLHF for LLMs, further improving recommendation systems.
- Architected a dynamic multi-agent RAG system for a continuously growing database of skills/job activities.
- Led end-to-end MLOps: Azure Cloud, Azure Container Apps, CI/CD (GitHub Actions), Docker, ACR.
- Mentored interns in NLP model development and scalable data pipelines.
- Tackled vaccine demand forecasting using hierarchical time series models.
- Calibrated CNN models with Bayesian hyperparameter tuning to address data drift.
- Identified MinT and WLSS as top reconciliation methods, reducing forecast error.
- Analyzed COVID-19's impact on sales data and model robustness.
- Sequential-Learning-NLP-BERT: Automated writing skill classification and rating using BERT.
- Inception-v3-CNN: Multi-class vehicle orientation classification for autonomous driving.
- MRI-Scan-Segmentation-U-Net: Deep learning model for MRI scan segmentation in radiotherapy planning.
- Forecast Reconciliation for Vaccine Supply Chain Optimization
  Angam, B. et al., arXiv:2305.01455v1
  - Pioneered hierarchical time series modeling at GSK; compared state-of-the-art reconciliation methods; established robust model selection and evaluation.
- SAS Certified Specialist: Base Programmer
- Data Science using Python (IIT Roorkee), Credential ID: L202878A24B
- Certified Associate Data Scientist, DataCamp
| Category | Skills & Tools | Application Context |
|---|---|---|
| Programming | Python, R, SQL | ML/NLP models, APIs, data pipelines, statistical analysis |
| ML & Deep Learning | CNNs, BERT, Transformers, Transfer Learning, Regression, Time Series, Recommender Systems, RLHF | Production LLMs, forecasting, image classification/segmentation, recommendation relevance |
| NLP | LLMs, RAG (Multi-Agent), Embedding Models, Re-rankers, Semantic Matching, Multi-label Classification | Production NLP, skill ontology, semantic search engines |
| Cloud & MLOps | Microsoft Azure, Azure ML, Azure Container Apps, CI/CD (GitHub Actions), Docker, Terraform, ACR, Flask/FastAPI | Model deployment, monitoring, endpoint optimization, CI/CD pipelines |
| Data Engineering | ETL, Tableau, High-Dimensional Data Analysis, A/B Testing, HNSW Indexing, Web Crawling (Python/Selenium) | Vector DBs, data scraping, behavior analysis, predictive analysis, data transformation |
Professional references available upon request.



