GenAI Engineer | Full-Stack Developer | Open Source Contributor
AI Infrastructure — Designing systems for large-scale AI workloads (B200/H200 clusters)
RAG Applications — Multi-tenant document querying with tenant-level data isolation (retrieval sketch after this list)
LLM Deployment — Making model deployment less painful for other developers
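A minimal sketch of the tenant-isolation idea: every chunk is tagged with its owning tenant at ingestion, and every query filters on that tag. Chroma, the collection name, and the sample documents here are illustrative stand-ins, not the actual stack.

```python
# Sketch: tenant-scoped retrieval. Chunks are tagged with tenant_id at
# ingestion, and queries filter on that tag, so one tenant can never
# retrieve another tenant's documents. Chroma is used only as an example;
# any vector store with metadata filtering works the same way.
import chromadb

client = chromadb.Client()
docs = client.get_or_create_collection("docs")

# Ingestion: tag each chunk with its owning tenant (sample data).
docs.add(
    ids=["acme-1", "globex-1"],
    documents=["Acme refund window: 30 days.", "Globex refund window: 14 days."],
    metadatas=[{"tenant_id": "acme"}, {"tenant_id": "globex"}],
)

def retrieve(tenant_id: str, question: str, k: int = 3):
    """Return only chunks belonging to the calling tenant."""
    return docs.query(
        query_texts=[question],
        n_results=k,
        where={"tenant_id": tenant_id},  # isolation enforced at query time
    )

print(retrieve("acme", "What is the refund window?"))
```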
LLM deployment orchestration platform. FastAPI backend, Next.js dashboard, LiteLLM for a unified provider interface. Actually runs in production.
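A short sketch of what the LiteLLM layer provides: one call signature across providers, so routing is just a model-string swap. The provider map and model names below are examples, not the platform's actual configuration.

```python
# Sketch: LiteLLM's unified provider interface. The same completion() call
# works across providers; only the model string changes. The MODELS map and
# model names are examples, not the platform's real routing table.
from litellm import completion

MODELS = {
    "openai": "gpt-4o-mini",
    "anthropic": "anthropic/claude-3-5-sonnet-20240620",
}

def ask(provider: str, prompt: str) -> str:
    response = completion(
        model=MODELS[provider],
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Requires the matching provider API key in the environment.
print(ask("openai", "Summarize HLS in one sentence."))
```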
Text-to-speech library using Google's Gemini API. Async-first, 24 languages, Redis queue for batch processing, multi-speaker dialogue support.
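The Redis batch queue can be sketched roughly as a producer/worker pair. The queue name, job fields, and the synthesize() placeholder are illustrative, not the library's actual internals.

```python
# Sketch: Redis-backed batch queue for TTS jobs (producer + blocking worker).
# Queue name, job shape, and synthesize() are placeholders, not the
# library's real internals.
import json
import redis

r = redis.Redis()

def enqueue(text: str, language: str, voice: str) -> None:
    """Producer: push a TTS job onto the queue."""
    r.lpush("tts:jobs", json.dumps({"text": text, "language": language, "voice": voice}))

def synthesize(job: dict) -> bytes:
    """Placeholder for the actual Gemini TTS call."""
    raise NotImplementedError

def worker() -> None:
    """Consumer: block until a job arrives, synthesize it, store the audio."""
    while True:
        _, raw = r.brpop("tts:jobs")  # blocking pop
        job = json.loads(raw)
        audio = synthesize(job)
        # ...write audio to object storage and notify the caller
```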
Video processing pipeline that transcodes source video into HLS format. RabbitMQ for job queuing, Kubernetes-ready with Helm charts.
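Its shape can be sketched as a RabbitMQ worker that shells out to ffmpeg for the HLS transcode. The queue name, paths, and encoder settings below are illustrative, not the project's actual configuration.

```python
# Sketch: HLS transcode worker. Consume a job from RabbitMQ, run ffmpeg to
# segment the source into an HLS playlist, then ack the message. Queue name,
# paths, and encoder settings are illustrative.
import json
import subprocess
import pika

def transcode_to_hls(src: str, out_dir: str) -> None:
    subprocess.run(
        [
            "ffmpeg", "-i", src,
            "-c:v", "libx264", "-c:a", "aac",
            "-hls_time", "6",                  # 6-second segments
            "-hls_playlist_type", "vod",
            "-hls_segment_filename", f"{out_dir}/seg_%03d.ts",
            f"{out_dir}/index.m3u8",
        ],
        check=True,
    )

def on_message(channel, method, properties, body):
    job = json.loads(body)
    transcode_to_hls(job["src"], job["out_dir"])
    channel.basic_ack(delivery_tag=method.delivery_tag)

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="transcode", durable=True)
channel.basic_consume(queue="transcode", on_message_callback=on_message)
channel.start_consuming()
```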
Docker image for streaming cloud-stored media through Plex. Handles secure mounting and streaming optimization.
Helicone AI Gateway (Rust/Tokio) — Replaced generic error types with specific variants, refactored monolithic functions, eliminated 100+ lines of duplicated code.
Arsky Project — TypeScript migration, ESLint/Prettier setup, GitHub Actions CI/CD.
Spheron CLI — Updated TypeScript definitions, fixed deployment configs.
Gdu (Go) — Fixed nil pointer crashes in file handling.
Languages: JavaScript, TypeScript, Python, C++, some Rust and Go
AI/ML: Hugging Face, LiteLLM, OpenAI, RAG pipelines, vector databases, LoRA fine-tuning
Backend: FastAPI, Express.js, Node.js | Frontend: Next.js, React
Infrastructure: GCP, AWS, Docker, Kubernetes, Helm, Prometheus
Databases: PostgreSQL, MongoDB, Scylla, Redis
Email: [email protected] | GitHub: ShivamB25 | LinkedIn: shivambansal