Harsh-4210/README.md

Applied ML & AI Engineer · Adversarial RL · LLM Fine-Tuning · Production ML Systems

B.E. Artificial Intelligence & Data Science @ SPPU · GPA 8.75/10

I train RL agents that learn what humans never state, and ship ML systems that survive production.


🧠 Flagship Project — ConflictBench

Business instructions contradict. ConflictBench teaches LLMs to resolve them.

ConflictBench is an RL environment that trains language models to resolve contradictory business directives by discovering an implicit 6-tier authority hierarchy — Legal > C-Suite > VP > Director > Team Lead > IC — entirely from reward signal. The hierarchy is never stated in the prompt; the model discovers it through episodes of 8–28 directives with 2–6 embedded conflict pairs.
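The ground-truth rule that such an environment could score against can be sketched in a few lines. This is a minimal illustration, not the benchmark's actual code: the `Directive` record, the `resolve` helper, and the sample directives are hypothetical, and only the tier order comes from the description above. The trained model never sees `TIERS`; it has to infer the ordering from reward.

```python
from dataclasses import dataclass

# Tier order taken from the README: lower index = higher authority.
TIERS = ["Legal", "C-Suite", "VP", "Director", "Team Lead", "IC"]
RANK = {name: i for i, name in enumerate(TIERS)}


@dataclass(frozen=True)
class Directive:
    # Hypothetical record layout, not the benchmark's actual schema.
    id: str
    source: str  # one of TIERS
    text: str


def resolve(directives, conflict_pairs):
    """Return ids of directives that survive once each conflict pair
    is settled in favour of the higher-authority source."""
    by_id = {d.id: d for d in directives}
    overridden = set()
    for a, b in conflict_pairs:
        da, db = by_id[a], by_id[b]
        if RANK[da.source] == RANK[db.source]:
            continue  # same tier: this sketch leaves the pair unresolved
        loser = da if RANK[da.source] > RANK[db.source] else db
        overridden.add(loser.id)
    return [d.id for d in directives if d.id not in overridden]


directives = [
    Directive("d1", "VP", "Ship the feature by Friday."),
    Directive("d2", "Legal", "Hold all releases pending review."),
    Directive("d3", "IC", "Skip the regression suite."),
    Directive("d4", "Director", "Run the full regression suite."),
]
kept = resolve(directives, [("d1", "d2"), ("d3", "d4")])
print(kept)  # ['d2', 'd4']
```

Legal overrides VP and Director overrides IC, so only `d2` and `d4` remain.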

┌──────────────────────────────────────────────────────────────────┐
│  Scenario Generator  →  8–28 directives, 2–6 conflict pairs      │
│  Reward Function     →  5-rubric deterministic (no LLM judge)    │
│  Training            →  GRPO + LoRA (r=32) on Qwen2.5-3B         │
│  Hardware            →  Single A100 48GB · 2 epochs · 400 scenes │
│  Output              →  Conflict-free resolution + JSON schema   │
└──────────────────────────────────────────────────────────────────┘
Metric                   Result
Composite reward lift    0.14 → 0.50 (+257%) over zero-shot baseline
Reward rubrics           Correctness · Contradiction-freedom · F1 · Efficiency · Schema
Training                 GRPO + LoRA (r=32) on Qwen2.5-3B, A100 48GB
Recognition              Finalist, Meta × PyTorch × HuggingFace OpenEnv Hackathon, Bangalore
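
As a rough sketch of how a deterministic composite reward and the reported lift fit together, the snippet below averages five rubric scores and recomputes the +257% figure. The equal weighting, the `composite_reward` and `relative_lift` helpers, and the example scores are assumptions for illustration; the README does not state how the rubrics are combined.

```python
RUBRICS = ["correctness", "contradiction_freedom", "f1", "efficiency", "schema"]


def composite_reward(scores, weights=None):
    """Weighted mean of per-rubric scores, each assumed to lie in [0, 1].
    Equal weights are an assumption; the README does not state them."""
    if weights is None:
        weights = {name: 1.0 for name in RUBRICS}
    total = sum(weights[name] for name in RUBRICS)
    return sum(weights[name] * scores[name] for name in RUBRICS) / total


def relative_lift(baseline, trained):
    """Relative improvement over the baseline, as a percentage."""
    return (trained - baseline) / baseline * 100.0


# Illustrative scores only, not measured results.
example = {"correctness": 0.6, "contradiction_freedom": 0.5, "f1": 0.4,
           "efficiency": 0.5, "schema": 1.0}
print(round(composite_reward(example), 2))  # 0.6
print(round(relative_lift(0.14, 0.50)))     # 257
```

The second line confirms the arithmetic in the table: (0.50 − 0.14) / 0.14 ≈ 2.57, a 257% relative lift.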


🔬 Projects

Two-agent zero-sum adversarial loop for hallucination detection — Red Agent generates plausible silent-failure hallucinations, Blue Agent acts as a factual gatekeeper.

  • Expert Correction Training (ECT): converts failed RL steps into supervised signal, preventing policy collapse
  • Hallucination detection: 25% → 100% with 4% false-alarm rate
  • 96% OOD generalisation across unseen domains
  • Asymmetric rewards (TP+0.6, FP−2.0, FN−0.6) + zero-sum ELO tracking
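
The asymmetric reward table and the zero-sum rating update can be sketched as follows. The TP, FP, and FN values are the ones listed above; the true-negative reward of 0.0, the K-factor of 32, and the function names are assumptions, and the rating rule shown is the standard Elo formula rather than a confirmed detail of this project.

```python
# Reward values from the README; the true-negative value is not stated
# there, so 0.0 is an assumption.
REWARDS = {"TP": 0.6, "FP": -2.0, "FN": -0.6, "TN": 0.0}


def blue_reward(flagged: bool, is_hallucination: bool) -> float:
    """Reward for the Blue Agent's verdict on one sample."""
    if flagged and is_hallucination:
        return REWARDS["TP"]
    if flagged and not is_hallucination:
        return REWARDS["FP"]
    if not flagged and is_hallucination:
        return REWARDS["FN"]
    return REWARDS["TN"]


def elo_update(red: float, blue: float, blue_won: bool, k: float = 32.0):
    """Standard Elo update; points gained by one side are lost by the other.
    K = 32 is an assumed value."""
    expected_blue = 1.0 / (1.0 + 10 ** ((red - blue) / 400.0))
    delta = k * ((1.0 if blue_won else 0.0) - expected_blue)
    return red - delta, blue + delta


red, blue = elo_update(1000.0, 1000.0, blue_won=True)
print(red, blue)  # 984.0 1016.0
```

With these values a false alarm costs more than three times what a correct catch earns, which pushes the gatekeeper away from flagging everything.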

PPO · LoRA · PEFT · REINFORCE · SFT

Production-deployed system for full forward/backward traceability across 6 entity types — raw material lots to customer dispatch orders.

  • 6-role RBAC with Firebase ID-token verification
  • CSV ingestion with full rollback + request-level audit trail
  • Natural-language AI query endpoint for non-technical users
  • Containerised (Bun + FastAPI) → deployed on Render with auto-deploy
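
All-or-nothing CSV ingestion can be illustrated with SQLite's transaction handling. This is a minimal sketch under stated assumptions: the `lots` table, its columns, and the `ingest_lots` helper are hypothetical and are not taken from the deployed system.

```python
import csv
import io
import sqlite3


def ingest_lots(conn: sqlite3.Connection, csv_text: str) -> int:
    """Insert every row or none. Table and column names are hypothetical."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            for row in rows:
                conn.execute(
                    "INSERT INTO lots (lot_id, quantity_kg) VALUES (?, ?)",
                    (row["lot_id"], float(row["quantity_kg"])),
                )
    except (sqlite3.Error, ValueError, KeyError):
        return 0  # nothing was written
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lots (lot_id TEXT PRIMARY KEY, quantity_kg REAL)")
conn.commit()

good = "lot_id,quantity_kg\nL-001,120.5\nL-002,80\n"
bad = "lot_id,quantity_kg\nL-003,40\nL-001,oops\n"  # second row is invalid

print(ingest_lots(conn, good))  # 2
print(ingest_lots(conn, bad))   # 0
print(conn.execute("SELECT COUNT(*) FROM lots").fetchone()[0])  # 2
```

The valid first row of the bad file (`L-003`) is discarded along with the invalid one, so a failed upload leaves the table exactly as it was.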

FastAPI · React · Firebase Auth · SQLite · Docker

Detects metacognitive miscalibration — when a student's confidence diverges from actual performance — and dynamically adjusts learning paths.

  • Bloom's taxonomy difficulty engine
  • Voice-based exam interface via Groq Whisper
  • RAG-powered study mentor (Haystack) + React Flow knowledge graph
  • 🥉 3rd Place — Pragyantra, PES Modern College of Engineering
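
One simple way to quantify miscalibration is the gap between a student's mean stated confidence and their observed accuracy. The sketch below assumes that definition and a hypothetical step-up/step-down policy over Bloom's revised taxonomy; the tolerance, the 0.8 accuracy bar, and the sample data are illustrative and not taken from the project.

```python
BLOOM_LEVELS = ["Remember", "Understand", "Apply", "Analyze", "Evaluate", "Create"]


def calibration_gap(confidences, correct):
    """Mean stated confidence minus observed accuracy, both in [0, 1].
    Positive means overconfident, negative means underconfident."""
    mean_conf = sum(confidences) / len(confidences)
    accuracy = sum(correct) / len(correct)
    return mean_conf - accuracy


def next_level(current: int, gap: float, accuracy: float,
               tolerance: float = 0.15) -> int:
    """Hypothetical policy: step down when overconfident, step up when
    well calibrated and accurate, otherwise stay put."""
    if gap > tolerance:
        return max(current - 1, 0)
    if abs(gap) <= tolerance and accuracy >= 0.8:
        return min(current + 1, len(BLOOM_LEVELS) - 1)
    return current


confidences = [0.9, 0.8, 0.9, 0.85]  # self-reported, per question
correct = [1, 0, 0, 1]               # graded outcomes
gap = calibration_gap(confidences, correct)
level = next_level(2, gap, accuracy=sum(correct) / len(correct))
print(round(gap, 4), BLOOM_LEVELS[level])
```

Here the student reports about 86% confidence but answers half the questions correctly, so the sketch moves them from Apply back to Understand.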

Next.js 15 · FastAPI · Groq Whisper · Haystack RAG · MongoDB · Redis

End-to-end ML pipeline predicting SO₂ emissions from Indian coal power plants, deployed as a containerised microservice.

  • 85% accuracy on held-out test data via cross-validation
  • Optuna-based hyperparameter tuning on XGBoost
  • 20% efficiency boost through feature engineering + pipeline automation
  • Comprehensive REST API with structured error handling
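
A common way to combine cross-validation with a held-out test set is to cross-validate on the training split during tuning and score the test split only once at the end. The dependency-free sketch below shows the k-fold part; `k_fold_indices`, `cross_val_score`, and the `fit`/`score` callables are hypothetical names, and this is not the project's pipeline code. In a tuning loop such as Optuna's, an objective function would typically return a value like this one.

```python
import random


def k_fold_indices(n_samples: int, k: int = 5, seed: int = 0):
    """Yield (train_idx, val_idx) pairs in which every sample is used
    as validation data exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, val_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx


def cross_val_score(fit, score, X, y, k: int = 5):
    """Average validation score over k folds. `fit` and `score` are
    caller-supplied callables (a hypothetical interface)."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        scores.append(score(model, [X[i] for i in val_idx],
                            [y[i] for i in val_idx]))
    return sum(scores) / len(scores)


# Toy usage: a mean predictor scored by negative mean absolute error.
def fit_mean(X, y):
    return sum(y) / len(y)


def neg_mae(model, X, y):
    return -sum(abs(model - target) for target in y) / len(y)


X = list(range(10))
y = [2.0] * 10
print(cross_val_score(fit_mean, neg_mae, X, y))
```

The toy target is constant, so the mean predictor has zero error on every fold; a real model such as an XGBoost regressor would slot into `fit` and `score` in the same way.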

XGBoost · FastAPI · Docker · PostgreSQL · Optuna


🏆 Hackathons & Awards

🥇 Finalist     Meta × PyTorch × HuggingFace OpenEnv Hackathon, Bangalore    ConflictBench
🏅 Top 100      Scaler School of Technology OpenEnv Pre-Selection            ConflictBench
🥉 3rd Place    Pragyantra, PES Modern College of Engineering                Arivon

⚙️ Tech Stack

LANGUAGES    = ["Python", "SQL", "JavaScript", "TypeScript"]

ML_RL        = ["PyTorch", "GRPO", "PPO", "LoRA/QLoRA", "TRL", "Unsloth",
                "Ray RLlib", "HuggingFace Transformers", "PEFT"]

VISION       = ["YOLOv8", "ONNX Runtime", "OpenCV", "Albumentations"]

LLM_INFRA    = ["RAG Pipelines", "RLHF", "Adversarial RL", "Agentic AI",
                "Haystack", "Groq Whisper"]

BACKEND      = ["FastAPI", "Next.js 15", "React"]

INFRA_DB     = ["Docker", "GitHub Actions", "Google Cloud",
                "PostgreSQL", "MongoDB", "Redis"]

CERTS        = ["Deep Learning Specialization (Andrew Ng)",
                "Generative AI with LLMs (AWS / Coursera)",
                "LLM Fundamentals (Hugging Face)"]

🎯 Currently

  • 🔬 Building adversarial RL systems and LLM fine-tuning pipelines
  • 🏗️ Shipping production ML with FastAPI + Docker
  • 📖 B.E. AI & Data Science @ SPPU · Open to ML engineering roles & research internships
  • 📬 Reach me: [email protected] · linkedin.com/in/harsh-jain0621

"I don't just train models — I build systems that ship, scale, and survive production."

Pinned

  1. Self_Evolving_Multi_Agent_Governance (Public)

    A decentralized multi-agent system for self-evolving governance, negotiation, and conflict resolution in a digital economy.

    TypeScript 1 1

  2. Autostream-Langgraph-agent (Public)

    Python

  3. Conflict_Bench (Public)

    Python

  4. LLM_HALLUCINATION_RL (Public)

    Python

  5. nishtha911/Pragyantra-ED14-ET-3 (Public)

    AI Adaptive Learning & Skill Development Platform

    TypeScript 2 2

  6. ruxir-ig/mccia-tracelink (Public)

    WIP. Details here: https://x.com/ruchirkalokhe/status/2049884299974041904?s=20

    TypeScript 2