Passionate about data integration, API development, and cloud technologies
Constantly learning and exploring new tools to build innovative solutions
Here are some of my top projects:
- Medi Text Summarizer: Automated medical document data pipelines with Python, SQL, and cloud integration, achieving 92% extraction accuracy.
- Financial Risk Analysis: Developed a financial risk analysis model using Python and SQL, performing risk forecasting and data visualization to support investment decisions.
- NBA Data Pipelines: Built real-time ETL pipelines using SQL, Kafka, and PySpark to process 10,000+ records for sports analytics.
- Corporate Training Hub: Built SQL-driven ETL workflows to streamline reporting pipelines, improving data validation by 95%.
- Languages: Python, Java, C++, SQL, Bash
- Databases: MySQL, PostgreSQL, Snowflake
- Big Data & Frameworks: Apache Spark, PySpark, Hadoop, Kafka, Spring Boot, Flask
- Visualization: Tableau, Power BI, Matplotlib, Seaborn, Plotly
- Tools & Platforms: Git, GitHub, Docker, Kubernetes, Airflow, Streamlit
- Cloud: AWS (EC2, S3, Lambda, Glue, Redshift), Microsoft Azure (Data Factory, Synapse, Blob Storage), GCP
- Data Science & ML: MLflow, SageMaker, Natural Language Processing (spaCy, NLTK), Hugging Face, Large Language Models (LLMs), RAG
- OS: Linux, Windows, UNIX
- Advanced Data Virtualization
- Big Data with PySpark and Kafka
- API Development with Java and FastAPI
I once built a data pipeline just to analyze how much coffee I drink during debugging sessions.
Turns out... the pipeline processed more coffee than data.
"The best way to predict the future is to invent it." - Alan Kay