D_K Deepakkasyapa11

👋 Hi, I'm Deepak Kasyapa

AI Engineer | Data Enthusiast | Generative AI, LLMs & Data | Deep Learning -> Artificial Neural Nets

Building production-grade AI systems by combining scalable data engineering, MLOps, and LLMOps practices to transform raw data into reliable, observable, and commercially impactful intelligence.

What I Build

Software Engineer with 2 years of experience architecting production LLM applications, data engineering systems, and enterprise RAG pipelines. Specialized in LLMOps, Generative AI, and scalable data infrastructure.

Core Expertise:

Production Data Engineering (Spark, Kafka, ETL at Scale)
RAG Architectures (Vector DBs, Hybrid Retrieval, Evaluation)
Real-Time AI Pipelines (Streaming, Low-Latency Inference)
LLMOps (Prompt Versioning, Cost Tracking, Observability)

🚀 Featured Projects

🎙️ Doc2VoiceAI - Real-Time Voice RAG

Voice-native document Q&A using Gemini 2.0 Multimodal Live API

Sub-200ms latency with streaming audio responses
ElevenLabs voice synthesis + Firestore vector search
Datadog observability for TTFT & token burn tracking
Stack: Gemini 2.0, Firestore, FastAPI, React, ElevenLabs

MarketPulse - Real-Time Financial Sentiment Pipeline

Enterprise-grade streaming pipeline for stock sentiment analysis

Processes 1000+ news articles/day via Kafka + Spark Streaming
87% sentiment accuracy using FinBERT on financial news
Live dashboard correlating stock prices with sentiment trends
Stack: Kafka, PySpark, HuggingFace, PostgreSQL, Streamlit, AWS

Talent Insight - AI Resume Screening ATS

Automated resume analysis with semantic matching

Skills gap analysis & upskilling recommendations
Semantic similarity scoring using embeddings
PDF parsing + LLM reasoning pipeline
Stack: Gemini 1.5 Flash, Python, PyPDF2, FastAPI
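
The semantic similarity scoring step can be sketched roughly as follows. This is a minimal illustration, assuming the job description and each resume have already been converted to embedding vectors (plain lists of floats here). The function names `cosine_similarity` and `rank_resumes` and the input shape are hypothetical, not taken from the project's code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction, so treat it as "no similarity".
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_resumes(job_vec, resume_vecs):
    """Return (resume_id, score) pairs, best match first.

    resume_vecs is assumed to be a dict of resume_id -> embedding.
    """
    scores = [
        (resume_id, cosine_similarity(job_vec, vec))
        for resume_id, vec in resume_vecs.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

In practice the vectors would come from an embedding model, and the ranked scores would feed the LLM reasoning step rather than serve as the final decision.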

📝 Technical Writing

18 deep-dive blog posts covering production AI systems, including:

[MCP Deep Dive: Universal Connector for LLMs]
[NVIDIA Triton Servers for Production Inference]
[LLM Evaluation: Accuracy, Latency, Performance]
[Fixing LLM Bottlenecks with Custom CUDA Kernels]
[Finetune LLMs 2-5x Faster with Unsloth]
[Training on 1TB Datasets with 3GB RAM]
[Custom Keras Data Generators]

📚 View All 18 Articles →

Tech Stack

AI & LLM: LangChain • CrewAI • Swarm • LangGraph • HuggingFace Transformers • Gemini 2.0 • GPT-4 • FinBERT • FAISS • Pinecone • ChromaDB • Firestore Vector Search

Data Engineering: PySpark • Apache Kafka • Airflow • Great Expectations • PostgreSQL • Redis • DynamoDB • Redshift • AWS (S3, Lambda, Kinesis, EC2) • GCP (Cloud Run, Firestore)

Backend & APIs: Python (FastAPI, Flask) • Node.js • TypeScript • Docker • GitHub Actions • CI/CD

📊 GitHub Stats

📫 Let's Connect

📝 Technical Blog

Open to:

🤝 Collaboration on AI/LLM projects
💼 AI Engineer / Data Engineer roles

⭐ If you find my work interesting, consider starring my repositories!

"Building AI systems that are not just smart, but robust, observable, and cost-efficient."
