Interview Task
🧪Interview
KollegeApply AI/LLM Engineer Assignment (Post-
Task)
📝 Problem Statement
Build a Mini RAG-based AI Chatbot that recommends colleges based on user
queries.
You need to create a fully functional Python-based backend (CLI or API) that takes
user queries like:
"Best private MBA colleges in Delhi under 10 lakhs fees"
"Colleges accepting 90 percentile in CAT"
"Top engineering colleges in South India with good
placements"
…and returns 3 college suggestions grounded on real or mock data using
retrieval-augmented generation (RAG).
🧩 Requirements
📁 1. Data Preparation
Use or generate a small mock dataset (15–30 entries) of college info with:
name
city
course
fees
avg_package
ranking
Interview Task 1
exam
type (private/govt)
Store it as a CSV or JSON file.
🧠 2. Embedding & Vector Store Setup
Use SentenceTransformer or OpenAI Embeddings to embed college descriptions like:
“FMS Delhi, Public MBA college, ₹2L fees, avg. package ₹25L, accepts CAT.”
Store embeddings in ChromaDB or Pinecone with metadata (city, type, fees,
etc.).
🔍 3. Retrieval + Prompt Injection
User query is embedded and top 3 colleges are retrieved via vector similarity.
Retrieved college descriptions must be injected into a prompt.
Use OpenAI GPT (3.5 or 4) to generate a final response in this format:
markdown
CopyEdit
Based on your preferences, here are the top matches:
1. **FMS Delhi** – ₹2L fees – Avg. Package ₹25L – Public
2. **IMI Delhi** – ₹10L fees – Avg. Package ₹14L – Private
3. **FORE School** – ₹9L fees – Avg. Package ₹13L – Private
🧪 4. Bonus (Optional but Preferred)
Add metadata filters (e.g., fees < ₹10L) before or after vector retrieval.
Return JSON response via FastAPI endpoint.
Add a score or match % logic in final output.
🎯 Evaluation Criteria
Interview Task 2
Criteria Weightage
Correctness (RAG pipeline) 30%
Code structure & clarity 20%
Vector search implementation 15%
Prompt design & final output 15%
Bonus (filters, API, scores) 20%
🚚 Submission Guidelines
Submit via GitHub or zipped folder.
Include a README.md with:
Setup instructions
Sample queries
Explanation of components
Interview Task 3