Viraj281105/README.md

Viraj Jadhao Banner




👋 About Me

I'm a Computer Engineering student at Savitribai Phule Pune University (CGPA: 9.136/10, 2023–2027) and a Generative AI Engineer who builds production-grade AI systems, from multi-agent pipelines and causal inference engines to full-stack, containerized, deployable products.

Most student AI projects are tutorials with a frontend slapped on. Mine aren't. I've shipped a 5-agent medical billing auditor that generates IRDAI-compliant appeal letters in under 15 seconds, a causal policy simulation engine over 700K+ records, and an NLI-based hallucination detector benchmarked across 5+ LLM variants: the kind of work that holds up under scrutiny from engineers, not just professors.

What I'm building right now:

  • 🤖 Multi-agent RAG systems with LangGraph: document ingestion, regulatory reasoning, QA judges
  • 🧠 Causal representation learning & counterfactual policy simulation under distribution shift
  • 🔍 LLM factuality pipelines: NLI-based hallucination detection with probabilistic calibration
  • 🏗️ Full-stack AI products: Spring Boot + FastAPI backends, Next.js frontends, Dockerized deployments

I'm actively looking for: AI/ML internships, SWE/MLE roles, and open-source collaborations where I can work on problems that matter.



🧠 Technical Stack

I work across the full depth of an AI system, from mathematical foundations to shipped, containerized products.


🤖 AI / ML & Research


Depth: Transformer architectures (self-attention from scratch), LoRA/adapter fine-tuning, DeBERTa/RoBERTa NLI-based entailment & calibration (ECE, precision-recall), counterfactual simulation under do(·) interventions, DAG identifiability (backdoor/frontdoor), multi-agent orchestration with LangGraph, RAG with hybrid sparse-dense retrieval (FAISS + BM25), Stable Diffusion, cGAN (Pix2Pix U-Net + PatchGAN), Neural Style Transfer, NLP-to-SQL, time-series forecasting, Isolation Forest anomaly detection.


⚙️ Backend & Infrastructure


Depth: Modular microservice design (Spring Boot + FastAPI tri-service architectures), async ingestion/query decoupling, SSE streaming for live agent output, JWT auth, hot-swappable model backends, containerized deployments with Docker Compose + Nginx + SSL, scalable inference serving, embedding pipeline architecture.


🗄️ Databases


Depth: Vector similarity search (pgvector + FAISS), hybrid sparse-dense retrieval, relational schema design, query optimization, geospatial indexing with PostGIS, sub-second semantic lookup over 3.5M+ records, Redis Streams for real-time event-driven pipelines.


🎨 Frontend & Visualization


Depth: Next.js 15 App Router with SSE streaming, Zustand state management, D3.js + Recharts + Plotly for multi-layer data visualization, Mapbox geospatial maps, Framer Motion animations, Gradio/Streamlit for rapid AI prototyping interfaces.


💻 Languages



🚀 Projects

Full-stack AI systems built with production discipline, not just notebooks.


🏥 MedGuard: AI Medical Billing Auditor & Insurance Appeal Engine

The Problem: Hospital bills are opaque, overcharged, and nearly impossible for patients to dispute. Insurance appeal letters require deep regulatory knowledge (IRDAI/CGHS) that no patient has and no existing tool provides.

The Approach:

  • Architected a 5-agent sequential pipeline: Document Auditor → Clinical Reviewer → Regulatory Advisor → Appeal Drafter → QA Judge, each agent with a distinct role, handoff protocol, and failure mode
  • Parsed hospital bills using LayoutLMv3 + EasyOCR; benchmarked extracted charges against official CGHS rates for line-item anomaly detection
  • Implemented RAG over IRDAI circulars using FAISS + sentence-transformers for zero-hallucination regulatory citations
  • Streamed live agent output via SSE (Server-Sent Events) for real-time UX: no polling, no waiting
  • Generated IRDAI-compliant appeal letters in under 15 seconds end-to-end
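
The line-item anomaly check in the approach above can be illustrated with a minimal sketch. It assumes charges have already been extracted into (description, amount) pairs and that a reference-rate table is available as a dictionary. The `LineItem` class, the `flag_overcharges` function, the 10% tolerance, and every figure below are hypothetical and are not taken from the MedGuard code or from real CGHS rates.

```python
from dataclasses import dataclass


@dataclass
class LineItem:
    description: str
    billed: float  # amount charged on the hospital bill


def flag_overcharges(items, reference_rates, tolerance=0.10):
    """Return (description, billed, reference, excess_ratio) for overcharged items.

    An item is flagged when billed exceeds reference * (1 + tolerance).
    Items without a reference rate are skipped rather than guessed at.
    """
    flagged = []
    for item in items:
        reference = reference_rates.get(item.description)
        if reference is None or reference <= 0:
            continue
        excess_ratio = (item.billed - reference) / reference
        if excess_ratio > tolerance:
            flagged.append(
                (item.description, item.billed, reference, round(excess_ratio, 2))
            )
    return flagged


# Hypothetical figures for illustration only; not real CGHS rates.
reference_rates = {"consultation": 350.0, "x-ray chest": 230.0}
bill = [
    LineItem("consultation", 360.0),   # within tolerance
    LineItem("x-ray chest", 900.0),    # well above the reference rate
    LineItem("bed charges", 4000.0),   # no reference rate, so skipped
]
flagged = flag_overcharges(bill, reference_rates)
```

Skipping items with no reference rate keeps the check conservative: it only reports an overcharge when there is a published figure to compare against.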

Architecture:

Hospital Bill (PDF) → LayoutLMv3 + EasyOCR → Document Auditor Agent
                                                        ↓
                                          Clinical Reviewer Agent
                                                        ↓
                              RAG over IRDAI Circulars (FAISS + sentence-transformers)
                                                        ↓
                                         Regulatory Advisor Agent
                                                        ↓
                                           Appeal Drafter Agent
                                                        ↓
                                    QA Judge Agent → SSE Stream → Next.js 15 UI
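
The sequential handoff in the diagram, with one streamed event per stage, can be sketched in plain Python. This is a stand-in under stated assumptions, not the LangGraph implementation: the stage bodies are placeholder stubs, and the state keys and the `sse_frame` and `run_pipeline` helpers are hypothetical. Only the frame layout (an `event:` line, a `data:` line, then a blank line) follows the Server-Sent Events wire format.

```python
import json


# Stage names mirror the diagram; the bodies are placeholder stubs.
def document_auditor(state):
    return {**state, "line_items": len(state["bill_text"].splitlines())}


def clinical_reviewer(state):
    return {**state, "clinical_notes": "stub"}


def regulatory_advisor(state):
    return {**state, "citations": []}


def appeal_drafter(state):
    return {**state, "draft": f"Appeal covering {state['line_items']} line items"}


def qa_judge(state):
    return {**state, "approved": bool(state["draft"])}


STAGES = [document_auditor, clinical_reviewer, regulatory_advisor,
          appeal_drafter, qa_judge]


def sse_frame(event, payload):
    """Format one Server-Sent Events frame: event name, JSON data, blank line."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


def run_pipeline(bill_text):
    """Run the stages in order, yielding one SSE frame after each handoff."""
    state = {"bill_text": bill_text}
    for stage in STAGES:
        state = stage(state)
        yield sse_frame(stage.__name__, {"keys": sorted(state)})
    yield sse_frame("done", state)


frames = list(run_pipeline("Consultation 360\nX-ray chest 900"))
```

Because `run_pipeline` is a generator, a web framework could pass it to a streaming response so the UI receives each stage's frame as soon as that stage finishes.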

Results: Full appeal letter generated in <15 seconds; zero hallucination on regulatory citations via evidence-grounded RAG; live streamed agent output via SSE.

Next.js 15 FastAPI PostgreSQL pgvector FAISS LangGraph LayoutLMv3 EasyOCR sentence-transformers Docker

Repo


🌀 ClimateX: India Climate Intelligence Platform

The Problem: Climate policy decisions in India are made without rigorous counterfactual modeling; policymakers can't answer "what would Delhi's AQI look like if the Clean Fuel Subsidy hadn't been passed?" Correlation-based ML offers no causal guarantees.

The Approach:

  • Built a 4-module platform for Indian policymakers: real-time AQI + weather maps (IMD/CPCB data via Mapbox + Recharts), DoWhy-inspired causal engine, RAG pipeline, and live sentiment analysis
  • Specified structural equation models (SEMs) capturing causal relationships between policy interventions, emissions, economic output, and public health outcomes
  • Implemented ATE and CATE estimators with do(X = x) intervention semantics; conducted sensitivity analysis (Rosenbaum bounds, E-values) to quantify robustness under latent confounding
  • Built RAG pipeline over MoEFCC and UN reports using pgvector for state-level evidence-grounded policy recommendations
  • Implemented live sentiment analysis (BERT/RoBERTa) over Twitter + news streams with state-wise topic clustering
  • Built async ingestion pipelines over 700,000+ heterogeneous policy and climate records with sub-second semantic retrieval
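
To make the ATE bullet concrete, here is a minimal sketch of backdoor adjustment for a binary treatment and a single discrete confounder: the effect is estimated within each stratum of the confounder and then averaged by stratum size. It is a toy illustration, not the ClimateX estimator. The function name `backdoor_ate`, the variable names, and the data are hypothetical, and it assumes the confounder is fully observed.

```python
from collections import defaultdict


def backdoor_ate(rows):
    """Estimate the ATE of binary treatment t on outcome y, adjusting for z.

    rows: iterable of (z, t, y) with t in {0, 1}. Strata that lack either
    treatment arm are skipped (a positivity violation), and the weights are
    renormalised over the strata that remain.
    """
    strata = defaultdict(lambda: {0: [], 1: []})
    for z, t, y in rows:
        strata[z][t].append(y)

    usable = {z: arms for z, arms in strata.items() if arms[0] and arms[1]}
    total = sum(len(arms[0]) + len(arms[1]) for arms in usable.values())
    if total == 0:
        raise ValueError("no stratum contains both treated and control units")

    ate = 0.0
    for arms in usable.values():
        weight = (len(arms[0]) + len(arms[1])) / total
        effect = sum(arms[1]) / len(arms[1]) - sum(arms[0]) / len(arms[0])
        ate += weight * effect
    return ate


# Toy data: the policy lowers the outcome by 2 in both regions, but treated
# units are concentrated in the high-outcome region, so the naive contrast
# has the wrong sign.
rows = (
    [("urban", 1, 8.0)] * 3 + [("urban", 0, 10.0)]
    + [("rural", 1, 3.0)] + [("rural", 0, 5.0)] * 3
)
adjusted = backdoor_ate(rows)

treated = [y for _, t, y in rows if t == 1]
control = [y for _, t, y in rows if t == 0]
naive = sum(treated) / len(treated) - sum(control) / len(control)
```

On this toy data the adjusted estimate is -2.0 while the naive difference in means is +0.5, which is the gap between a correlational and an interventional answer that the problem statement describes.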

Architecture:

IMD/CPCB/Twitter Streams → Async Ingestion → PostgreSQL + pgvector + MongoDB
                                                        ↓
                              SCM / DAG Construction (DoWhy) ←→ RAG over MoEFCC Reports
                                                        ↓
                         Counterfactual Simulation Engine (do-calculus)
                         Sentiment Analysis (BERT/RoBERTa), state-wise clustering
                                                        ↓
                    FastAPI Backend → React + Mapbox + Recharts + Plotly Dashboard

Results: Reproducible counterfactual policy simulations with calibrated uncertainty; state-level sentiment clustering over live news and social streams; full offline demo fallback.

Python DoWhy FastAPI pgvector React Mapbox Recharts Plotly BERT RoBERTa PostgreSQL MongoDB Docker

Repo


🔍 LLM Hallucination Detection: Production-Ready Verification Pipeline

The Problem: LLMs hallucinate confidently. Existing detection methods are either too shallow (keyword matching) or too slow (full re-generation). There's no lightweight, calibrated, production-deployable solution.

The Approach:

  • Formulated hallucination detection as a conditional inference problem: given retrieved evidence E and generated claim C, estimate P(entailment | E, C)
  • Fine-tuned DeBERTa-based NLI classifiers on domain-adapted QA corpora; evaluated calibration rigorously via precision–recall curves, ECE (Expected Calibration Error), and confidence distribution analysis across answer confidence bins
  • Identified systematic degradation under retrieval noise, semantic drift, and distribution shift; exposed failure modes including overconfident contradiction misclassification and hallucination in low-evidence contexts
  • Designed independent microservices for retrieval, entailment, and confidence scoring, enabling hot-swappable model backends
  • Explored continual adaptation mechanisms to mitigate model drift as LLMs evolve
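
For readers unfamiliar with ECE, the metric named in the approach above can be sketched in a few lines: predictions are grouped into equal-width confidence bins, and the gap between accuracy and mean confidence in each bin is averaged, weighted by bin size. This is a generic illustration and not the project's evaluation code. The function name and the half-open binning convention (each bin includes its lower edge, with 1.0 placed in the last bin) are assumptions.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE = sum over bins of (bin_size / N) * |accuracy(bin) - mean_conf(bin)|.

    confidences: predicted probabilities in [0, 1] for the chosen label.
    correct: booleans, True where the prediction matched the gold label.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one prediction is required")

    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[index].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece


# Two confident predictions (both right) and two uncertain ones (one right).
ece = expected_calibration_error(
    [0.95, 0.95, 0.55, 0.55],
    [True, True, True, False],
)
```

Each occupied bin here is off by 0.05 and holds half the predictions, so the result is 0.05. A lower value means the model's stated confidence tracks its actual accuracy more closely.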

Architecture:

LLM Response → Dense Vector Retrieval (pgvector)
                        ↓
            Evidence Ranking & Context Assembly
                        ↓
         DeBERTa NLI Entailment Scoring (ECE-calibrated)
                        ↓
    Confidence Calibration → SHAP XAI Verdict (Gradio UI)

Results: >25% improvement in factual reliability on benchmark datasets; benchmarked across 5+ LLM variants on factuality, precision, and ECE calibration metrics.

DeBERTa RoBERTa HuggingFace RAG pgvector FastAPI Gradio SHAP PostgreSQL Continual Learning

Repo


📦 More Projects

🌊 FloatChat: NL to SQL over Ocean Data

Schema-aware NLP-to-SQL engine over 3.5M+ ARGO float oceanographic records. Integrates geospatial query execution (PostGIS), dynamic parameter binding, and live Plotly visualizations, enabling non-technical researchers to explore marine datasets through natural language alone. Decoupled API layer separates NL interpretation from SQL execution.
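
The split between NL interpretation and SQL execution, together with parameter binding, can be sketched as follows. It assumes the interpretation step has already produced a structured intent. The `profiles` table, its columns, the intent keys, and the `build_query` and `execute` helpers are hypothetical, and an in-memory SQLite database stands in for PostgreSQL/PostGIS so the example runs on its own.

```python
import sqlite3

ALLOWED_COLUMNS = {"temperature", "salinity"}  # hypothetical schema whitelist


def build_query(intent):
    """Turn a parsed intent into (sql, params).

    Column names come from a whitelist because identifiers cannot be bound;
    all user-supplied values are passed as bound parameters.
    """
    column = intent["variable"]
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unsupported variable: {column}")
    sql = (
        f"SELECT AVG({column}) FROM profiles "
        "WHERE lat BETWEEN ? AND ? AND depth_m <= ?"
    )
    params = (intent["lat_min"], intent["lat_max"], intent["max_depth_m"])
    return sql, params


def execute(conn, intent):
    """Execution layer: knows nothing about natural language, only intents."""
    sql, params = build_query(intent)
    return conn.execute(sql, params).fetchone()[0]


# Toy data standing in for float profiles; values are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE profiles (lat REAL, depth_m REAL, temperature REAL, salinity REAL)"
)
conn.executemany(
    "INSERT INTO profiles VALUES (?, ?, ?, ?)",
    [(10.0, 50.0, 28.0, 35.0), (12.0, 100.0, 26.0, 35.2), (40.0, 50.0, 15.0, 34.0)],
)

# Hypothetical intent for "average temperature between 5N and 15N above 200 m".
intent = {"variable": "temperature", "lat_min": 5.0, "lat_max": 15.0, "max_depth_m": 200.0}
avg_temp = execute(conn, intent)
```

Keeping the intent as plain data means the language model never emits raw SQL, so a malformed or adversarial question can at worst produce a rejected intent.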

💰 FinGuard AI: Personal Finance Risk & Simulation Engine

Tri-service platform ingesting Indian bank statements from 5 major banks, classifying transactions at 91% accuracy (sentence-transformers + FAISS), and computing a 5-dimension Financial Risk Index (liquidity, debt burden, volatility, savings stability, investment diversification). Scenario engine lets users adjust parameters and see 6–12 month cash flow forecasts.

⚖️ Real-Time Multi-Agent Governance (RL)

Event-driven multi-agent system where financial market agents propose actions, negotiate conflicts, and self-enforce governance rules using PPO/SAC from first principles. Models Nash bargaining, meritocratic voting, and emergent equilibrium stability under real market data.

💓 ECG FPGA Accelerator

Low-power streaming 1D CNN accelerator for real-time ECG arrhythmia detection on FPGA hardware. Hardware-accelerated deep learning for medical AI, built for the FPGA Hackathon 2026. Optimized for minimal latency and power budget on embedded inference.


📜 Certifications & Credentials

🤖 AI & Machine Learning

  • Google x Hack2Skill: 5-Day AI Agents Intensive Course
  • AICTE Virtual Internship: Google AI/ML Program
  • AI Adventures (Google): ML, DL & GenAI Specializations
  • HCL GUVI: AI/ML Certification

☁️ Cloud & Infrastructure

  • AWS: Cloud Foundations
  • Prodigy InfoTech: Generative AI Internship

💻 Programming & Development

  • IIT Bombay: Core Java (Spoken Tutorial)
  • Udemy: Python, SQL & C++ Bootcamp

🏆 Hackathons & Workshops

  • Adobe India Hackathon: Participant Certificate
  • SKNCOE Hackathon: Certificate
  • COEP: Video Editing Workshop
  • ARVR Workshop: Certificate

🧭 Leadership & Activities

  • Hackathon Director (College-Wide): Sole organizer and SPOC for an institution-wide hackathon spanning 160 teams and 500+ participants; owned all logistics, scheduling, judging coordination, and real-time decision-making end-to-end. Largest technical event run by a single student at the college.

  • Vice Chair, ACM Student Chapter: Directed 6+ technical workshops on ML systems, LLMs, and competitive programming reaching 1,000+ students; represented the chapter at 10+ external hackathons with consistent top-5/10 finishes; coordinated inter-college hackathon partnerships and peer research initiatives.

  • Sponsorship Lead, MPulse Technical Fest: Closed 30+ sponsors and raised ₹1 lakh+ through end-to-end acquisition (cold outreach, pitch decks, negotiations); largely funded the college technical fest with 500+ attendees.

  • AI & Computer Vision Lead, Team Vulcans Robotics: Designed the computer vision pipeline for ABU Robocon 2026 from scratch; deployed real-time object detection and localization systems on embedded hardware; mentored 4 junior contributors on model integration, quantization, deployment workflows, and code review; represented the team at state-level robotics competitions.

  • Team Lead, SIH · SKNCOE Fusion 2025 · PCCOE IGC Hackathons: Directed multi-disciplinary teams delivering full-stack AI prototypes under tight deadlines; consistent top finishes across national and college-level competitions.


📊 GitHub Activity


🤝 Let's Connect

Open to AI/ML internships, SWE/MLE roles, and research collaborations.

I build at the intersection of causal AI, LLM reliability, and multi-agent systems.
If you're working on something in that space, or just want to talk about a hard problem, reach out.



Pinned

  1. FloatChat (Public)

    An intuitive AI chatbot that allows researchers, students, and enthusiasts to ask complex questions about ocean data (temperature, salinity, etc.) and receive instant insights and visualizations fr…

    TypeScript 1 1

  2. ClimateX (Public)

    ClimateX: A Causal Policy Engine to Measure the Real-World Impact of Global Climate Policies

    Jupyter Notebook 1 1

  3. Advocai (Public)

    A Production-Ready Multi-Agent Framework for Medical, Regulatory & Legal Reasoning

    Python 2

  4. AI-Hallucination-Detection-Application (Public)

    A full-stack application to detect and verify AI-generated text against a local knowledge base. Built with Python, FastAPI, and ChromaDB.

    1

  5. ecg_fpga_accelerator (Public)

    Low-Power Streaming 1D CNN Accelerator for Real-Time ECG Arrhythmia Detection | FPGA Hackathon 2026 | Hardware-accelerated deep learning for medical AI

    Python 1

  6. Real-Time-Multi-Agent-Governance (Public)

    A real-time, event-driven multi-agent governance engine for financial markets. Agents analyze live market data, propose actions, negotiate conflicts, and execute decisions through a governance laye…

    Python 1