RAG System for Universidad Nacional de Colombia, Manizales

Overview

This repository contains an interactive Retrieval-Augmented Generation (RAG) system designed to enhance access to information from the Universidad Nacional de Colombia, Manizales campus. Built as a degree project, it leverages an open-source large language model (LLM) integrated with RAG to answer contextual queries (e.g., "What is the profile of an applicant to the Architecture program?") based on institutional website data. The system uses Docker for reproducibility, featuring a modular architecture with Chroma (vector database), Ollama (LLM), FastAPI (backend), and Streamlit (frontend). It aims to democratize information access for students and visitors in a local, lightweight setup.
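As a rough illustration of the retrieve-then-generate flow described above, here is a minimal, dependency-free sketch. It is not the project's implementation: the bag-of-words similarity stands in for Chroma's embedding search, the passages in `DOCS` are hypothetical placeholders rather than scraped university data, and the function names (`retrieve`, `build_prompt`) are assumptions made for this example. The resulting prompt is what would then be sent to the LLM.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the question."""
    q_vec = Counter(tokenize(question))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q_vec, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Assemble the prompt that would be passed to the LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical placeholder passages, not real institutional data.
DOCS = [
    "Placeholder passage about the applicant profile for the Architecture program.",
    "Placeholder passage about library opening hours.",
    "Placeholder passage about campus sports facilities.",
]

question = "What is the profile of an applicant to the Architecture program?"
top = retrieve(question, DOCS, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

In the real system the retrieval step would query the vector database and the prompt would go to the model served by Ollama; only the overall shape of the pipeline carries over from this sketch.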

Features

Data Ingestion: Scrapes and processes data from the university website using Scrapy.
RAG Integration: Combines retrieval with the Hermes 3 LLM for accurate responses.
Interactive Interface: User-friendly frontend accessible at http://localhost:8501.
Reproducibility: Fully containerized with Docker for consistent deployment.

Requirements

Docker (20.10+)
Docker Compose (2.0+)
Python 3.11
Poetry (~1.8.2)

Installation

Clone the repository:

```
git clone https://github.com/dserranog1/rag
cd rag
```

Install dependencies:

```
curl -sSL https://install.python-poetry.org | python3 -
poetry install
```

Collect data:
```
scrapy crawl degrees
scrapy crawl misc
```
Start the containers:
```
docker compose up --build
```

Populate Chroma:

```
docker exec -it rag_agent bash
python -m rag_agent_man.db -d
```
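Populating a vector database usually involves splitting the collected text into smaller pieces before embedding them. The sketch below shows one common way to do that, using overlapping word windows. It is only an assumption about the general technique: the `rag_agent_man.db` module is not shown here, and the function name `chunk_text` and its `size` and `overlap` parameters are hypothetical.

```python
def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into chunks of `size` words, each sharing `overlap` words
    with the previous chunk."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        # Stop once the current window has reached the end of the text.
        if start + size >= len(words):
            break
    return chunks


# Tiny demonstration with synthetic words w0..w9.
sample = " ".join(f"w{i}" for i in range(10))
chunks = chunk_text(sample, size=4, overlap=1)
for chunk in chunks:
    print(chunk)
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from either side, at the cost of storing some words twice.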

Access the system at http://localhost:8501
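Because the containers can take a while to become ready, it may help to poll the frontend URL before opening it in a browser. The following is a minimal sketch using only the Python standard library. The helper names `is_reachable` and `wait_until_reachable` are hypothetical and not part of the repository.

```python
import http.client
import time
import urllib.error
import urllib.request


def is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET to `url` succeeds with a 2xx/3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, HTTP error status, or malformed reply.
        return False


def wait_until_reachable(url: str, attempts: int = 30, delay: float = 2.0) -> bool:
    """Poll `url` up to `attempts` times, sleeping `delay` seconds between tries."""
    for _ in range(attempts):
        if is_reachable(url):
            return True
        time.sleep(delay)
    return False
```

For example, `wait_until_reachable("http://localhost:8501")` returns `True` once the Streamlit frontend responds, or `False` if it is still unavailable after all attempts.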

Usage

Enter questions in the Streamlit interface to retrieve answers. The first run may take a while, for example because Ollama downloads the model (about 4 GB); subsequent runs are faster.

Contributing

Contributions are welcome! Please fork the repository, create a feature branch, and submit a pull request. Report issues via the Issues tab.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Special thanks to my thesis advisor and the Universidad Nacional de Colombia for support and resources.

