Viraj Jadhao Viraj281105

👋 About Me

I'm a Computer Engineering student at Savitribai Phule Pune University (CGPA: 9.136/10, 2023–2027) and a Generative AI Engineer who builds production-grade AI systems — from multi-agent pipelines and causal inference engines to full-stack, containerized, deployable products.

Most student AI projects are tutorials with a frontend slapped on. Mine aren't. I've shipped a 5-agent medical billing auditor that generates IRDAI-compliant appeal letters in under 15 seconds, a causal policy simulation engine over 700K+ records, and an NLI-based hallucination detector benchmarked across 5+ LLM variants — the kind of work that holds up under scrutiny from engineers, not just professors.

What I'm building right now:

🤖 Multi-agent RAG systems with LangGraph — document ingestion, regulatory reasoning, QA judges
🧠 Causal representation learning & counterfactual policy simulation under distribution shift
🔍 LLM factuality pipelines — NLI-based hallucination detection with probabilistic calibration
🏗️ Full-stack AI products: Spring Boot + FastAPI backends, Next.js frontends, Dockerized deployments

I'm actively looking for: AI/ML internships, SWE/MLE roles, and open-source collaborations where I can work on problems that matter.

🧠 Technical Stack

I work across the full depth of an AI system — from mathematical foundations to shipped, containerized products.

🤖 AI / ML & Research

Depth: Transformer architectures (self-attention from scratch), LoRA/adapter fine-tuning, DeBERTa/RoBERTa NLI-based entailment & calibration (ECE, precision-recall), counterfactual simulation under do(·) interventions, DAG identifiability (backdoor/frontdoor), multi-agent orchestration with LangGraph, RAG with hybrid sparse-dense retrieval (FAISS + BM25), Stable Diffusion, cGAN (Pix2Pix U-Net + PatchGAN), Neural Style Transfer, NLP-to-SQL, time-series forecasting, Isolation Forest anomaly detection.

⚙️ Backend & Infrastructure

Depth: Modular microservice design (Spring Boot + FastAPI tri-service architectures), async ingestion/query decoupling, SSE streaming for live agent output, JWT auth, hot-swappable model backends, containerized deployments with Docker Compose + Nginx + SSL, scalable inference serving, embedding pipeline architecture.

🗄️ Databases

Depth: Vector similarity search (pgvector + FAISS), hybrid sparse-dense retrieval, relational schema design, query optimization, geospatial indexing with PostGIS, sub-second semantic lookup over 3.5M+ records, Redis Streams for real-time event-driven pipelines.

🎨 Frontend & Visualization

Depth: Next.js 15 App Router with SSE streaming, Zustand state management, D3.js + Recharts + Plotly for multi-layer data visualization, Mapbox geospatial maps, Framer Motion animations, Gradio/Streamlit for rapid AI prototyping interfaces.

💻 Languages

🚀 Projects

Full-stack AI systems built with production discipline — not just notebooks.

🏥 MedGuard — AI Medical Billing Auditor & Insurance Appeal Engine

The Problem: Hospital bills are opaque, often inflated, and nearly impossible for patients to dispute. Insurance appeal letters require deep regulatory knowledge (IRDAI/CGHS) that no patient has and no existing tool provides.

The Approach:

Architected a 5-agent sequential pipeline: Document Auditor → Clinical Reviewer → Regulatory Advisor → Appeal Drafter → QA Judge — each agent with a distinct role, handoff protocol, and failure mode
Parsed hospital bills using LayoutLMv3 + EasyOCR; benchmarked extracted charges against official CGHS rates for line-item anomaly detection
Implemented RAG over IRDAI circulars using FAISS + sentence-transformers for zero-hallucination regulatory citations
Streamed live agent output via SSE (Server-Sent Events) for real-time UX — no polling, no waiting
Generated IRDAI-compliant appeal letters in under 15 seconds end-to-end
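
The sequential handoff listed above can be sketched roughly as follows. This is a minimal plain-Python illustration, not MedGuard's actual LangGraph code: the `AuditState` fields, the agent bodies, and the placeholder citation strings are all hypothetical, and the Clinical Reviewer and Regulatory Advisor are reduced to stubs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class AuditState:
    """Shared state handed from one agent to the next (hypothetical schema)."""
    bill_items: Dict[str, float]        # line item -> amount charged
    reference_rates: Dict[str, float]   # line item -> reference rate
    findings: List[str] = field(default_factory=list)
    citations: List[str] = field(default_factory=list)
    letter: str = ""
    approved: bool = False


def document_auditor(state: AuditState) -> AuditState:
    # Flag every line item charged above its reference rate.
    for item, charged in state.bill_items.items():
        reference = state.reference_rates.get(item)
        if reference is not None and charged > reference:
            state.findings.append(
                f"{item}: charged {charged:.2f}, reference {reference:.2f}"
            )
    return state


def clinical_reviewer(state: AuditState) -> AuditState:
    # Stub: a real reviewer would assess each flagged item clinically.
    return state


def regulatory_advisor(state: AuditState) -> AuditState:
    # Stub for the RAG step: one stand-in citation per finding.
    state.citations = [f"[retrieved passage for: {f}]" for f in state.findings]
    return state


def appeal_drafter(state: AuditState) -> AuditState:
    lines = ["Disputed items:"]
    for finding, citation in zip(state.findings, state.citations):
        lines.append(f"- {finding} {citation}")
    state.letter = "\n".join(lines)
    return state


def qa_judge(state: AuditState) -> AuditState:
    # Approve only if there is something to dispute and every finding
    # made it into the drafted letter.
    state.approved = bool(state.findings) and all(
        finding in state.letter for finding in state.findings
    )
    return state


PIPELINE: List[Callable[[AuditState], AuditState]] = [
    document_auditor,
    clinical_reviewer,
    regulatory_advisor,
    appeal_drafter,
    qa_judge,
]


def run_pipeline(state: AuditState) -> AuditState:
    # Each agent receives the state produced by the previous one.
    for agent in PIPELINE:
        state = agent(state)
    return state
```

Keeping every agent as a function over one shared state object makes the handoff explicit and lets each stage be tested in isolation.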

Architecture:

Hospital Bill (PDF) → LayoutLMv3 + EasyOCR → Document Auditor Agent
                                                        ↓
                                          Clinical Reviewer Agent
                                                        ↓
                              RAG over IRDAI Circulars (FAISS + sentence-transformers)
                                                        ↓
                                         Regulatory Advisor Agent
                                                        ↓
                                           Appeal Drafter Agent
                                                        ↓
                                    QA Judge Agent → SSE Stream → Next.js 15 UI

Results: Full appeal letter generated in <15 seconds; zero hallucination on regulatory citations via evidence-grounded RAG; live streamed agent output via SSE.

Next.js 15 · FastAPI · PostgreSQL · pgvector · FAISS · LangGraph · LayoutLMv3 · EasyOCR · sentence-transformers · Docker

🌀 ClimateX — India Climate Intelligence Platform

The Problem: Climate policy decisions in India are made without rigorous counterfactual modeling — policymakers can't answer "what would Delhi's AQI look like if the Clean Fuel Subsidy hadn't been passed?" Correlation-based ML offers no causal guarantees.

The Approach:

Built a 4-module platform for Indian policymakers: real-time AQI + weather maps (IMD/CPCB data via Mapbox + Recharts), DoWhy-inspired causal engine, RAG pipeline, and live sentiment analysis
Specified structural equation models (SEMs) capturing causal relationships between policy interventions, emissions, economic output, and public health outcomes
Implemented ATE and CATE estimators with do(X = x) intervention semantics; conducted sensitivity analysis (Rosenbaum bounds, E-values) to quantify robustness under latent confounding
Built RAG pipeline over MoEFCC and UN reports using pgvector for state-level evidence-grounded policy recommendations
Implemented live sentiment analysis (BERT/RoBERTa) over Twitter + news streams with state-wise topic clustering
Built async ingestion pipelines over 700,000+ heterogeneous policy and climate records with sub-second semantic retrieval
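
As a rough illustration of the ATE estimation and E-value sensitivity analysis mentioned above, the sketch below estimates an average treatment effect by backdoor adjustment over a single discrete confounder and computes an E-value for a risk ratio. It is a simplified stand-in under stated assumptions, not ClimateX's estimator: the function names and the `(z, t, y)` row layout are hypothetical, and real policy data would need richer adjustment sets.

```python
import math
from collections import defaultdict


def backdoor_ate(rows):
    """Estimate the ATE by stratifying on a discrete confounder.

    rows: iterable of (z, t, y) where z is the confounder value,
    t is a binary treatment (0 or 1), and y is a numeric outcome.
    Computes sum over z of P(z) * (E[Y | T=1, z] - E[Y | T=0, z]).
    """
    strata = defaultdict(lambda: {0: [], 1: []})
    for z, t, y in rows:
        strata[z][t].append(y)

    total = sum(len(g[0]) + len(g[1]) for g in strata.values())
    ate = 0.0
    for z, groups in strata.items():
        if not groups[0] or not groups[1]:
            # Without both arms in a stratum the effect is not identified there.
            raise ValueError(f"no treatment overlap in stratum {z!r}")
        weight = (len(groups[0]) + len(groups[1])) / total
        effect = (sum(groups[1]) / len(groups[1])
                  - sum(groups[0]) / len(groups[0]))
        ate += weight * effect
    return ate


def e_value(risk_ratio):
    """E-value for an observed risk ratio: RR + sqrt(RR * (RR - 1)).

    Risk ratios below 1 are inverted first, so the result is always >= 1.
    """
    if risk_ratio <= 0:
        raise ValueError("risk ratio must be positive")
    rr = risk_ratio if risk_ratio >= 1 else 1 / risk_ratio
    return rr + math.sqrt(rr * (rr - 1))
```

A larger E-value means an unmeasured confounder would need a stronger association with both treatment and outcome to explain away the observed effect.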

Architecture:

IMD/CPCB/Twitter Streams → Async Ingestion → PostgreSQL + pgvector + MongoDB
                                                        ↓
                              SCM / DAG Construction (DoWhy) ←→ RAG over MoEFCC Reports
                                                        ↓
                         Counterfactual Simulation Engine (do-calculus)
                         Sentiment Analysis (BERT/RoBERTa) — state-wise clustering
                                                        ↓
                    FastAPI Backend → React + Mapbox + Recharts + Plotly Dashboard

Results: Reproducible counterfactual policy simulations with calibrated uncertainty; state-level sentiment clustering over live news and social streams; full offline demo fallback.

Python · DoWhy · FastAPI · pgvector · React · Mapbox · Recharts · Plotly · BERT · RoBERTa · PostgreSQL · MongoDB · Docker

🔍 LLM Hallucination Detection — Production-Ready Verification Pipeline

The Problem: LLMs hallucinate confidently. Existing detection methods are either too shallow (keyword matching) or too slow (full re-generation). There's no lightweight, calibrated, production-deployable solution.

The Approach:

Formulated hallucination detection as a conditional inference problem: given retrieved evidence E and generated claim C, estimate P(entailment | E, C)
Fine-tuned DeBERTa-based NLI classifiers on domain-adapted QA corpora; evaluated calibration via precision–recall curves, ECE (Expected Calibration Error), and per-bin analysis of the confidence distribution
Identified systematic degradation under retrieval noise, semantic drift, and distribution shift; exposed failure modes including overconfident contradiction misclassification and hallucination in low-evidence contexts
Designed independent microservices for retrieval, entailment, and confidence scoring — enabling hot-swappable model backends
Explored continual adaptation mechanisms to mitigate model drift as LLMs evolve
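
For readers unfamiliar with ECE, the metric named above can be computed roughly as in the sketch below. It uses the standard equal-width binning definition; the function name, the default of 10 bins, and the input layout are assumptions for illustration, not details taken from this project.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Expected Calibration Error with equal-width confidence bins.

    confidences: predicted probabilities in [0, 1] for the chosen label.
    correct: booleans, True where the prediction was right.
    Returns the weighted average of |accuracy - mean confidence| per bin.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have equal length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 goes into the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))

    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        mean_conf = sum(conf for conf, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        ece += (len(bucket) / n) * abs(accuracy - mean_conf)
    return ece
```

An ECE near zero means the entailment scores can be read as probabilities; a large value signals the over- or under-confidence described in the failure modes above.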

Architecture:

LLM Response → Dense Vector Retrieval (pgvector)
                        ↓
            Evidence Ranking & Context Assembly
                        ↓
         DeBERTa NLI Entailment Scoring (ECE-calibrated)
                        ↓
    Confidence Calibration → SHAP XAI Verdict (Gradio UI)

Results: >25% improvement in factual reliability on benchmark datasets; benchmarked across 5+ LLM variants on factuality, precision, and ECE calibration metrics.

DeBERTa · RoBERTa · HuggingFace · RAG · pgvector · FastAPI · Gradio · SHAP · PostgreSQL · Continual Learning

📦 More Projects

🌊 FloatChat — NL to SQL over Ocean Data

Schema-aware NLP-to-SQL engine over 3.5M+ ARGO float oceanographic records. Integrates geospatial query execution (PostGIS), dynamic parameter binding, and live Plotly visualizations — enabling non-technical researchers to explore marine datasets through natural language alone. Decoupled API layer separates NL interpretation from SQL execution.
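
The separation between NL interpretation and SQL execution, together with parameter binding, might look something like the sketch below. It uses an in-memory-friendly `sqlite3` connection purely as a stand-in for PostgreSQL/PostGIS; the `measurements` table, its columns, and the function names are hypothetical.

```python
import sqlite3

# Columns the interpreter is allowed to select (hypothetical schema).
ALLOWED_COLUMNS = {"temperature", "salinity"}


def build_query(column, min_depth, max_depth):
    """Turn a structured request into SQL text plus bound parameters.

    Identifiers cannot be bound as parameters, so the column name is
    checked against a whitelist; values are passed as placeholders.
    """
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unsupported column: {column!r}")
    sql = (
        f"SELECT AVG({column}) FROM measurements "
        "WHERE depth_m BETWEEN ? AND ?"
    )
    return sql, (min_depth, max_depth)


def run_query(conn, column, min_depth, max_depth):
    """Execute the bound query and return the single aggregate value."""
    sql, params = build_query(column, min_depth, max_depth)
    return conn.execute(sql, params).fetchone()[0]
```

Because user-supplied values only ever travel as bound parameters, and identifiers only through the whitelist, the language model's output never reaches the database as raw SQL text.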

💰 FinGuard AI — Personal Finance Risk & Simulation Engine

Tri-service platform ingesting Indian bank statements from 5 major banks, classifying transactions at 91% accuracy (sentence-transformers + FAISS), and computing a 5-dimension Financial Risk Index (liquidity, debt burden, volatility, savings stability, investment diversification). Scenario engine lets users adjust parameters and see 6–12 month cash flow forecasts.

⚖️ Real-Time Multi-Agent Governance (RL)

Event-driven multi-agent system where financial market agents propose actions, negotiate conflicts, and self-enforce governance rules using PPO/SAC implemented from first principles. Models Nash bargaining, meritocratic voting, and emergent equilibrium stability on real market data.

💓 ECG FPGA Accelerator

Low-power streaming 1D CNN accelerator for real-time ECG arrhythmia detection on FPGA hardware: hardware-accelerated deep learning for medical AI, built for the FPGA Hackathon 2026. Optimized for minimal latency and a tight power budget on embedded inference.

📜 Certifications & Credentials

🤖 AI & Machine Learning

Google x Hack2Skill: 5-Day AI Agents Intensive Course
AICTE Virtual Internship: Google AI/ML Program
AI Adventures (Google): ML, DL & GenAI Specializations
HCL GUVI: AI/ML Certification

☁️ Cloud & Infrastructure

AWS: Cloud Foundations
Prodigy InfoTech: Generative AI Internship

💻 Programming & Development

IIT Bombay: Core Java (Spoken Tutorial)
Udemy: Python, SQL & C++ Bootcamp

🏆 Hackathons & Workshops

Adobe India Hackathon: Participant Certificate
SKNCOE Hackathon: Certificate
COEP: Video Editing Workshop
ARVR Workshop: Certificate

🧭 Leadership & Activities

Hackathon Director (College-Wide) — Sole organizer and SPOC for an institution-wide hackathon spanning 160 teams and 500+ participants; owned all logistics, scheduling, judging coordination, and real-time decision-making end-to-end. Largest technical event run by a single student at the college.
Vice Chair, ACM Student Chapter — Directed 6+ technical workshops on ML systems, LLMs, and competitive programming reaching 1,000+ students; represented the chapter at 10+ external hackathons with consistent top-5/10 finishes; coordinated inter-college hackathon partnerships and peer research initiatives.
Sponsorship Lead, MPulse Technical Fest — Closed 30+ sponsors and raised ₹1 lakh+ through end-to-end acquisition (cold outreach, pitch decks, negotiations); largely funded the college technical fest with 500+ attendees.
AI & Computer Vision Lead, Team Vulcans Robotics — Designed the computer vision pipeline for ABU Robocon 2026 from scratch; deployed real-time object detection and localization systems on embedded hardware; mentored 4 junior contributors on model integration, quantization, deployment workflows, and code review; represented the team at state-level robotics competitions.
Team Lead, SIH · SKNCOE Fusion 2025 · PCCOE IGC Hackathons — Directed multi-disciplinary teams delivering full-stack AI prototypes under tight deadlines; consistent top finishes across national and college-level competitions.

📊 GitHub Activity

🤝 Let's Connect

Open to AI/ML internships, SWE/MLE roles, and research collaborations.

I build at the intersection of causal AI, LLM reliability, and multi-agent systems —
if you're working on something in that space, or just want to talk about a hard problem, reach out.
