  • R.C.Patel Institute of Technology, Shirpur
  • Shirpur, Maharashtra, India
  • LinkedIn: in/pranavchaudhari07

Pranav-Chaudhari07/README.md



Second-year B.Tech CSE (Data Science) student at R.C. Patel Institute of Technology, Shirpur. My focus is applied ML and AI systems — building pipelines that work under real-world constraints, not just on clean benchmarks. I want to understand where models fail, how to connect LLMs to actual data, and what it takes to ship inference APIs that hold up in production.

Interned at Vault of Code twice — first as a Software Development Intern building production web interfaces, then as an AI & Prompt Engineering Intern designing LLM pipelines that reduced generation latency by 15%.


Projects

Python TensorFlow Keras OpenCV Flask

Document authenticity verification system using deep learning. Achieves 84%+ classification accuracy on 150+ document images with a live Flask REST API serving real-time predictions.

Architecture:

  • EfficientNetB0 transfer learning (fine-tuned from ImageNet weights) chosen after benchmarking against simpler CNN baselines on accuracy vs. inference speed tradeoff
  • OpenCV preprocessing pipeline: noise removal, adaptive thresholding, and augmentation applied before model input to improve robustness across document quality variations
  • Flask REST API with a responsive web interface — returns prediction label and confidence score on each request

Three decisions I had to think through:

How deep to fine-tune — Freezing all EfficientNetB0 layers gave lower accuracy; fine-tuning the top layers pushed it to 84%+. The risk was overfitting on a small dataset (150+ images), managed with dropout and data augmentation on the minority fraud class.

Preprocessing as a first-class concern — Early runs showed high variance on low-quality scans. Adding adaptive thresholding and noise reduction in OpenCV before inference stabilised predictions significantly. The preprocessing pipeline ended up being as important as the model architecture.
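To make the thresholding step concrete, here is a minimal, dependency-free sketch of adaptive mean thresholding. It is an illustration only, not the project's pipeline: the project uses OpenCV, and the function name, the `block` and `c` parameters, and the border handling (the window is clipped at the edges rather than padded) are assumptions made for this example.

```python
from typing import List

Image = List[List[int]]  # grayscale pixels in the range 0-255


def adaptive_mean_threshold(img: Image, block: int = 3, c: int = 2) -> Image:
    """Binarise img by comparing each pixel with its local neighbourhood mean.

    A pixel becomes 255 if it is brighter than (local mean - c), else 0.
    `block` is the odd side length of the square neighbourhood.
    """
    if block < 1 or block % 2 == 0:
        raise ValueError("block must be a positive odd integer")
    h, w = len(img), len(img[0])
    r = block // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clip the window at the image border instead of padding (assumption).
            rows = range(max(0, y - r), min(h, y + r + 1))
            cols = range(max(0, x - r), min(w, x + r + 1))
            window = [img[j][i] for j in rows for i in cols]
            local_mean = sum(window) / len(window)
            out[y][x] = 255 if img[y][x] > local_mean - c else 0
    return out


# A dark "ink" pixel on a bright background is the only pixel set to 0.
page = [
    [200, 200, 200],
    [200, 50, 200],
    [200, 200, 200],
]
binary = adaptive_mean_threshold(page)
```

Because the threshold is computed per neighbourhood rather than once for the whole image, unevenly lit scans can be binarised without a single global cutoff.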

API design for live inference — Chose Flask over a heavier framework to keep the serving layer lightweight. The REST API accepts an image upload, runs the full preprocessing + inference pipeline, and returns structured JSON with label and confidence — usable directly from a frontend or another service.
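The response shape described above can be sketched as a small framework-independent helper. This assumes the model emits a single fraud probability from a sigmoid output and that a 0.5 cutoff is used; `build_response`, `LABELS`, and the `threshold` parameter are hypothetical names for illustration, not the project's actual code.

```python
import json
from typing import Dict

# Assumed label names, following the Genuine / Fraudulent classes.
LABELS = ("genuine", "fraudulent")


def build_response(fraud_probability: float, threshold: float = 0.5) -> Dict[str, object]:
    """Turn a fraud probability into the label + confidence payload."""
    if not 0.0 <= fraud_probability <= 1.0:
        raise ValueError("fraud_probability must be in [0, 1]")
    is_fraud = fraud_probability >= threshold
    # Confidence is the probability of whichever class was chosen.
    confidence = fraud_probability if is_fraud else 1.0 - fraud_probability
    return {"label": LABELS[int(is_fraud)], "confidence": round(confidence, 4)}


payload = build_response(0.9)
body = json.dumps(payload)  # what an HTTP handler would send back
```

Keeping this step separate from the web framework means the same function can be called from a Flask view, a batch script, or a unit test.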


Python Streamlit Groq DuckDB Pandas

Natural language analytics platform — upload a CSV or Excel file, ask questions in plain English, get SQL results and visualisations back. No SQL knowledge required.

Architecture:

  • Groq API (Llama 3.3 70B) translates natural-language prompts into optimised SQL queries in real time; the LLM acts as a query compiler, not a chatbot
  • DuckDB runs the generated SQL directly on the in-memory dataframe — no external database needed, low latency, works entirely on the uploaded file
  • Streamlit frontend handles file upload, chat interface, and dynamic chart rendering in a single script

What made this interesting:

LLM as a query layer — The core design decision was treating the LLM as a SQL translator rather than a general assistant. This keeps outputs structured and auditable: every answer traces back to a SQL query the user can inspect.

DuckDB for in-process analytics — Using DuckDB meant I could run analytical SQL on Pandas DataFrames without spinning up a database server. It handles aggregations and joins on uploaded files in milliseconds, which matters for a demo-able interactive tool.
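A rough sketch of the question-to-SQL-to-rows flow follows, with two loud assumptions: the standard-library `sqlite3` module stands in for DuckDB so the example runs without extra packages, and `fake_llm` is a stub in place of the real Groq call. The helper names and the single-SELECT guard are hypothetical and not taken from the project.

```python
import sqlite3
from typing import Callable, List, Tuple


def build_prompt(question: str, table: str, columns: List[str]) -> str:
    """Include the schema so the model can only reference real columns."""
    return (
        f"Table {table} has columns: {', '.join(columns)}.\n"
        f"Write one SQL SELECT statement answering: {question}\n"
        "Return only the SQL."
    )


def is_single_select(sql: str) -> bool:
    """Conservative guard: exactly one statement, and it must be a SELECT.

    It also rejects valid queries that contain ';' in a string literal
    or that start with WITH; that trade-off is deliberate in this sketch.
    """
    body = sql.strip().rstrip(";").strip()
    return body.lower().startswith("select") and ";" not in body


def answer(
    question: str,
    conn: sqlite3.Connection,
    table: str,
    columns: List[str],
    generate_sql: Callable[[str], str],
) -> Tuple[str, list]:
    """Return the generated SQL together with its rows, so it can be inspected."""
    sql = generate_sql(build_prompt(question, table, columns)).strip()
    if not is_single_select(sql):
        raise ValueError(f"refusing to run non-SELECT SQL: {sql!r}")
    return sql, conn.execute(sql).fetchall()


# In-memory table standing in for an uploaded file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10.0), ("south", 5.0), ("north", 2.5)],
)


def fake_llm(prompt: str) -> str:
    # Stub: a real implementation would send the prompt to the LLM API.
    return "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region;"


sql, rows = answer("Total sales per region?", conn, "sales", ["region", "amount"], fake_llm)
```

Returning the SQL alongside the rows is what keeps each answer auditable: the interface can display the query next to the result.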


Python Flask Scikit-learn MongoDB JavaScript

Personalised learning path recommender that generates career-aligned course sequences from a user's skill profile and interests.

Architecture:

  • Collaborative and content-based filtering via Scikit-learn — hybrid approach handles cold-start (new users with no history) better than either method alone
  • MongoDB stores user skill profiles, interaction history, and course metadata — document structure fits naturally since user profiles are heterogeneous
  • Flask REST API exposes recommendation endpoints consumed by a JavaScript frontend; career roadmap generation is a separate endpoint that chains recommendations into a learning sequence
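One way to picture the hybrid scoring is a weighted blend in which the collaborative weight grows with the user's history, so a brand-new user is ranked on content similarity alone. This is a dependency-free sketch, not the project's Scikit-learn implementation; the blending rule and the `alpha_max` and `ramp` parameters are assumptions chosen for illustration.

```python
from math import sqrt
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_scores(
    user_profile: List[float],
    course_features: Dict[str, List[float]],
    collab_scores: Dict[str, float],
    n_interactions: int,
    alpha_max: float = 0.7,
    ramp: int = 10,
) -> Dict[str, float]:
    """Blend content-based and collaborative scores per course."""
    # Collaborative weight is 0 for a new user and reaches alpha_max
    # once the user has `ramp` or more recorded interactions.
    alpha = alpha_max * (min(n_interactions, ramp) / ramp)
    scores = {}
    for course, feats in course_features.items():
        content = cosine(user_profile, feats)
        collab = collab_scores.get(course, 0.0)  # unseen course: no signal
        scores[course] = (1 - alpha) * content + alpha * collab
    return scores


profile = [1.0, 0.0]                      # e.g. [ml_interest, web_interest]
courses = {"ml": [1.0, 0.0], "web": [0.0, 1.0]}
collab = {"web": 1.0}                     # similar users liked "web"

cold = hybrid_scores(profile, courses, collab, n_interactions=0)
warm = hybrid_scores(profile, courses, collab, n_interactions=10)
```

With no history the ranking follows the skill profile; with enough history the collaborative signal can outweigh it.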

Experience

AI & Prompt Engineering Intern — Vault of Code (Jan 2025 – Mar 2025)

Designed structured prompt engineering pipelines to automate content generation workflows using LLM APIs. Built reusable prompt templates that improved response accuracy and reduced generation latency by 15%. Integrated AI models into internal tools, cutting manual effort in document processing and data extraction.

Software Development Intern — Vault of Code (Jun 2024 – Aug 2024)

Developed responsive UI components in HTML, CSS, and JavaScript for the EDITKARO.IN production platform. Optimised cross-device layout and visual consistency across 10+ web pages. Collaborated via Git/GitHub for version control and iterative design improvements.


Technical Focus

| Domain | Stack |
|---|---|
| Languages | Python · Java · C · PHP |
| ML & AI | TensorFlow/Keras · Scikit-learn · OpenCV · NumPy · Pandas |
| NLP & LLMs | Groq API (Llama 3.3 70B) · Prompt Engineering · LLM Pipelines |
| Backend | Flask · REST APIs |
| Frontend | HTML5 · CSS3 · JavaScript · Streamlit |
| Databases | MySQL · MongoDB · SQLite · DuckDB |
| Tools | Git · GitHub · Docker · Postman · VS Code |
| Concepts | Deep Learning · Transfer Learning · NLP · OOP · Data Structures |

Currently

  • Extending the document fraud detection system with broader document type support
  • Exploring MLOps fundamentals — model versioning, monitoring, and CI/CD for ML pipelines
  • Preparing for Amazon ML Summer School 2026 — revisiting probability, optimisation, and deep learning fundamentals
  • Contributing to Data Polaris, the AI & Data Science club at RCPIT
  • Looking for ML / Data Science / Data Analyst internship roles (remote preferred)

Achievements

  • 🧩 500+ problems solved on CodeChef — consistent practice on Data Structures & Algorithms
  • 🤖 National Finalist — IIT Indore Robo Soccer Competition
  • ☁️ Google Cloud Arcade Trooper Milestone
  • 🛠️ AWS AI for Bharat Hackathon participant — built a Government Schemes Chatbot


Pinned

  1. AI-Document-Fraud-Detection (Public)

    AI-powered document fraud detection system using EfficientNetB0 (Transfer Learning) & Flask. Classifies documents as Genuine or Fraudulent with 84%+ accuracy. Features real-time predictions, confid…

    HTML 1

  2. groq-data-analyst (Public)

    An AI-powered data analyst dashboard built with Streamlit, Groq (Llama 3.3 70B), and DuckDB to query and analyze CSV/Excel files using natural language.

    Python 1

  3. Pathfinder--Course-Recommendation-System (Public)

    Forked from PratikNikwade/Sem-3-Project-2025

    A course recommendation system that suggests personalized learning paths based on user interests, skills, and career goals.

    HTML

  4. Pranav-Chaudhari07 (Public)

  5. Vansh-Ahire/rl-bus-optimization (Public)

    An intelligent bus routing system using Deep Reinforcement Learning (DQN) to minimize passenger wait time, optimize fuel usage, and ensure balanced stop coverage, with built-in evaluation and basel…

    Python 2 1