🔮 Google Gemini LLM App

👤 Author

Danselem (https://github.com/Danselem)

A demonstration project for building LLM applications using Google's Gemini models, with LangChain, ChromaDB, and Retrieval-Augmented Generation (RAG). This app shows how to load documents, embed them, store in a vector DB, and answer queries based on context.

Update: Tracing has been integrated into the project with Arize Open Inference Telemetry to log traces and spans.


📦 Features

  • ✅ Integration with Google Gemini (e.g., gemini-1.5-flash)
  • 📄 PDF ingestion and preprocessing
  • ✂️ Intelligent document chunking with LangChain
  • 🧠 Vector embeddings using Gemini
  • 🗃️ ChromaDB as the vector store
  • 🔎 Semantic search for relevant document chunks
  • 💬 Question-answering using a RAG pipeline
  • 📈 Tracing and observability with Arize Open Inference for end-to-end, span-level logging. Phoenix traces are persisted in PostgreSQL and the stack is containerized with Docker; see the docker-compose file.
  • 🧪 Includes example usage in the examples directory

Note: Sentence transformer embeddings have been integrated into the project; see the embedding module.
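
As an illustration, here is a minimal sketch of producing local embeddings with sentence-transformers (the model name matches the tech stack table below; the project's own wrapper lives in src/embeddings/sentence_embedding.py, and the example inputs are placeholders):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")             # small local embedding model
vectors = model.encode(["What is RAG?", "Gemini models"])   # one 384-dim row per input text
print(vectors.shape)                                        # e.g. (2, 384)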


📁 Project Structure

gemini_llm_app/
├── docker-compose.yaml
├── Dockerfile
├── LICENSE
├── Makefile
├── pyproject.toml
├── README.md
├── requirements.txt
├── src
│   ├── agent
│   │   └── tools.py
│   ├── embeddings
│   │   ├── gemini_embedding.py
│   │   └── sentence_embedding.py
│   ├── handlers
│   │   ├── __init__.py
│   │   └── error_handler.py
│   ├── ingest.py
│   ├── llm
│   │   ├── gemini_client.py
│   │   └── lang_gemini.py
│   ├── observability
│   │   └── arize_observability.py
│   ├── prompt_engineering
│   │   ├── __init__.py
│   │   ├── prompt.py
│   │   └── templates.py
│   ├── rag
│   │   ├── app.py
│   │   ├── apphybrid.py
│   │   ├── hybrid.py
│   │   └── lcel.py
│   ├── retrievers
│   │   └── retriever.py
│   ├── utils
│   │   ├── doc_loader.py
│   │   ├── doc_split.py
│   │   ├── download_file.py
│   │   ├── logger.py
│   │   ├── rate_limiter.py
│   │   └── setvars.py
│   └── vectors
│       └── chroma_vector.py
├── tests
│   ├── test_chroma_vector_pdf.py
│   ├── test_ingest_pdf.py
│   └── test_ingest.py
└── uv.lock

🚀 Quick Start

1. Clone the Repository

git clone https://github.com/Danselem/gemini_llm_app.git
cd gemini_llm_app

2. Create and Activate a Virtual Environment

This project uses the uv package manager and Make.

uv venv --python 3.11
source .venv/bin/activate

3. Install Dependencies

uv pip install --all-extras --requirement pyproject.toml

or

make install

🔐 Environment Variables

Create a .env file at the project root with make env, then fill in the environment variables:

GOOGLE_API_KEY=your_gemini_api_key
MULTIMODAL_MODEL=gemini-1.5-flash
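
A minimal sketch of how these variables might be read at runtime, assuming python-dotenv (the project's own logic lives in src/utils/setvars.py):

import os
from dotenv import load_dotenv

load_dotenv()                                               # read .env from the project root
api_key = os.environ["GOOGLE_API_KEY"]                      # fail fast if the key is missing
model_name = os.getenv("MULTIMODAL_MODEL", "gemini-1.5-flash")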

📈 Start the Telemetry Server

This project uses Arize Open Inference for telemetry and for logging traces and spans. To start the telemetry server, run the command below.

make start-phoenix

Note: make sure Docker is running before executing the command above.
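
For reference, a sketch of wiring a tracer to the local Phoenix collector, assuming the arize-phoenix-otel and openinference-instrumentation-langchain packages (the project name and endpoint are illustrative; the project's actual setup is in src/observability/arize_observability.py):

from phoenix.otel import register
from openinference.instrumentation.langchain import LangChainInstrumentor

# Point the OTLP exporter at the Phoenix container started by make start-phoenix
tracer_provider = register(project_name="gemini-llm-app",   # hypothetical project name
                           endpoint="http://localhost:6006/v1/traces")
LangChainInstrumentor().instrument(tracer_provider=tracer_provider)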


▶️ Run the App

There are multiple examples in the examples directory for you to get started with, e.g.:

python -m examples.psumm

or

make psumm

To run the RAG app, use the command below:

make app

The example code summarises a PDF file. Check the Makefile for the other example targets in the examples directory.


⚙️ How It Works

1. Download & Parse PDF

The PDF is downloaded from a URL if it's not already in data/pdfs.
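
A minimal sketch of that check-then-download step (the URL and filename are placeholders; the project's own helper is src/utils/download_file.py):

from pathlib import Path
import requests

pdf_path = Path("data/pdfs/paper.pdf")                      # hypothetical filename
if not pdf_path.exists():                                   # skip the download if already cached
    pdf_path.parent.mkdir(parents=True, exist_ok=True)
    response = requests.get("https://example.com/paper.pdf", timeout=60)
    response.raise_for_status()
    pdf_path.write_bytes(response.content)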

2. Text Splitting

The PDF is split into overlapping chunks using RecursiveCharacterTextSplitter from LangChain.
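
A typical configuration looks like the sketch below; the chunk sizes are illustrative, not the project's exact settings (see src/utils/doc_loader.py and src/utils/doc_split.py):

from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

docs = PyPDFLoader("data/pdfs/paper.pdf").load()            # one Document per page
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(docs)                     # overlapping text chunks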

3. Embeddings

Each chunk is converted into a vector using Google Gemini Embedding or Sentence Transformer Embedding (via LangChain wrapper).
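
Continuing the sketch above, the Gemini path via the LangChain wrapper might look like this, assuming the langchain-google-genai package (the embedding model name is illustrative):

from langchain_google_genai import GoogleGenerativeAIEmbeddings

embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")  # illustrative model
texts = [chunk.page_content for chunk in chunks]            # chunks from the splitting step
vectors = embeddings.embed_documents(texts)                 # one float vector per chunk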

4. Vector Store

The chunks and their embeddings are stored in a local ChromaDB collection.
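
A sketch of that step using the chromadb client directly (the collection name is hypothetical; the project's wrapper is src/vectors/chroma_vector.py):

import chromadb

client = chromadb.PersistentClient(path="data/chroma")      # local on-disk vector store
collection = client.get_or_create_collection("pdf_chunks")  # hypothetical collection name
collection.add(
    ids=[f"chunk-{i}" for i in range(len(texts))],          # stable, unique IDs
    documents=texts,                                        # raw chunk text
    embeddings=vectors,                                     # vectors from the step above
)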

5. Semantic Search

When a user asks a question, the app retrieves relevant chunks using vector similarity search.
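
Continuing the sketch, retrieval might look like this (k = 4 is an illustrative setting, not the project's exact value):

query = "What are the paper's main findings?"               # hypothetical question
query_vector = embeddings.embed_query(query)                # same embedding model as ingestion
results = collection.query(query_embeddings=[query_vector], n_results=4)
relevant_chunks = results["documents"][0]                   # top-k chunk texts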

6. RAG Response

The relevant chunks are passed to Gemini to answer the question contextually.
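
Finally, a hedged sketch of the generation step with the google-generativeai SDK (the prompt wording is illustrative; the project's prompts live in src/prompt_engineering and its client in src/llm):

import google.generativeai as genai

genai.configure(api_key=api_key)                            # key loaded from .env earlier
model = genai.GenerativeModel("gemini-1.5-flash")
context = "\n\n".join(relevant_chunks)
prompt = (
    "Answer the question using only the context below.\n\n"
    f"Context:\n{context}\n\n"
    f"Question: {query}"
)
response = model.generate_content(prompt)
print(response.text)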


📚 Tech Stack

| Component | Role | Notes |
| --- | --- | --- |
| LangChain | Orchestrates prompts, chains, and retrievers | Backbone for LLM workflows |
| ChromaDB | Vector store for embeddings | Local persistence for RAG |
| Google Gemini API | Primary LLM / embedding provider | Configurable model (e.g., gemini-1.5-flash) |
| sentence-transformers | Fallback / local embeddings | all-MiniLM-L6-v2 used via SentenceTransformer |
| Open Inference (Arize / Phoenix) | Telemetry & tracing | Observability for traces/spans |
| Redis (RedisSaver) | Short-term memory checkpointing | Used by langgraph checkpointing |
| Docker / docker-compose | Containerization & local telemetry stack | Phoenix / Postgres services |
| Python 3.11+ | Runtime | Project tested on Python 3.11 |

📄 License

MIT License. See the LICENSE file.


🙏 Acknowledgements

  • Google for releasing the Gemini family of models

  • LangChain community for open-source tools

  • ChromaDB team for fast and easy vector storage


💡 Ideas for Extension

  • 🔧 Add a simple web UI with Gradio or Streamlit for the RAG application.

  • 📝 Ingest multiple documents and support multi-source QA.

  • 🧠 Add caching to avoid re-embedding on re-runs.

  • 📊 Integrate telemetry/tracing and observability with Open Inference and Phoenix.
