AI Infra Engineer · LLM/Agentic Systems · Cost Optimization · AKS/Terraform · NVIDIA + OSS AI
I build efficient AI infrastructure, from optimized GPU clusters to fast LLM serving (vLLM, Triton, SGLang), agentic workflows (LangGraph/CrewAI), and cost-aware pipelines.
A 14-day AI Infra portfolio showcasing:
- GPU cost savings (Spot + On-Demand mix, autoscaling, DCGM dashboards)
- LLM serving benchmarks: Triton vs vLLM vs TGI vs SGLang
- Quantization + speculative decoding
- Long-context efficiency (128k–1M tokens)
- RAG cost optimization
- Multi-agent orchestration cost tracing
- CI/CD for AI systems (GitHub Actions → AKS)
rohankataria.com
linkedin.com/in/imrohan
huggingface.co/imrohankataria
instagram.com/byrohankataria



