LLM Semantic Cache Demo

A simple demonstration of using Redis as a vector database (via RedisVL) for semantic caching of LLM responses, a more efficient and cost-effective way to work with language models.

Project Structure

llm-semantic-cache/
├── .env                  # Environment variables (API keys)
├── .env.example          # Template for the environment variables
├── .gitignore            # Git ignore file
├── main.py               # Main application code
├── requirements.txt      # Python dependencies
└── README.md             # Project documentation

Prerequisites

Before running this application, you need:

  1. Python 3.8 or higher
  2. An OpenAI API key
  3. Docker (for running Redis Stack)

Installation

  1. Clone this repository:

git clone https://github.com/yourusername/llm-semantic-cache.git
cd llm-semantic-cache

  2. Create and activate a virtual environment:

python -m venv venv
source venv/bin/activate  # On Windows, use: venv\Scripts\activate

  3. Install dependencies:

pip install -r requirements.txt

  4. Set up your environment variables by copying .env.example to .env and adding your OpenAI API key:

cp .env.example .env
# Then edit .env and add your API key

  5. Start Redis Stack with Docker:

docker run -d --name redis -p 6379:6379 -p 8001:8001 redis/redis-stack:latest
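
Before running the demo you can sanity-check that the container is reachable. This is an optional snippet, not part of main.py; it assumes the redis-py client, which ships as a dependency of RedisVL.

import redis

# Ping the Redis Stack container started above (default host and port).
client = redis.Redis(host="localhost", port=6379)
print(client.ping())  # True means Redis is reachable

Redis Stack also serves the RedisInsight UI on http://localhost:8001, which is handy for inspecting cached entries.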

Usage

Run the main script:

python main.py

This will:

  1. Try to find a semantically similar cached answer for "What is the capital of France?"
  2. If not found, query the OpenAI API
  3. Store the result in the semantic cache
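
The sketch below shows roughly how that flow fits together. It is illustrative rather than a copy of main.py: it assumes RedisVL's SemanticCache extension, the openai v1 client, and python-dotenv, and the import path, model name, and parameter values are placeholders that may differ from the actual code.

from dotenv import load_dotenv
from openai import OpenAI
from redisvl.extensions.llmcache import SemanticCache

load_dotenv()  # load OPENAI_API_KEY from .env

llm = OpenAI()  # picks up OPENAI_API_KEY from the environment
cache = SemanticCache(
    name="llmcache",
    redis_url="redis://localhost:6379",
    distance_threshold=0.1,  # how close a stored prompt must be to count as a hit
    ttl=3600,                # cached entries expire after one hour
)

prompt = "What is the capital of France?"

# 1. Look for a semantically similar prompt already in the cache
if hits := cache.check(prompt=prompt):
    answer = hits[0]["response"]
    print("cache hit:", answer)
else:
    # 2. Cache miss: ask the LLM
    completion = llm.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    answer = completion.choices[0].message.content
    # 3. Store the prompt/response pair so similar future questions hit the cache
    cache.store(prompt=prompt, response=answer)
    print("cache miss:", answer)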

How Semantic Caching Works

Traditional caching systems rely on exact key matches. Semantic caching instead:

  1. Converts user prompts to vector embeddings
  2. Checks if similar questions (by vector distance) were already asked
  3. Returns cached responses for semantically similar questions
  4. Only calls the LLM API when truly novel questions are asked
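
Concretely, the "similar by vector distance" check compares prompt embeddings. The toy example below uses made-up 3-dimensional vectors and a hand-rolled cosine distance purely for illustration; in the demo, RedisVL computes the embeddings and the distance for you.

import numpy as np

def cosine_distance(a, b):
    # 1 - cosine similarity: 0.0 means identical direction, 2.0 means opposite.
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return 1.0 - float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy "embeddings"; real ones come from an embedding model with hundreds or
# thousands of dimensions.
cached_prompt = [0.90, 0.10, 0.30]  # "What is the capital of France?"
new_prompt    = [0.88, 0.12, 0.30]  # "What's France's capital city?"

distance = cosine_distance(cached_prompt, new_prompt)
if distance <= 0.1:  # same role as distance_threshold in the cache
    print("close enough: reuse the cached response")
else:
    print("too far apart: call the LLM")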

Benefits:

  • Reduced API costs
  • Lower latency for repeated or similar queries
  • Consistent responses for similar questions

Customization

You can adjust the following in main.py:

  • distance_threshold: lower values require a closer semantic match before a cached answer is reused
  • ttl: how long cached entries remain valid, in seconds
  • The embedding model and the LLM model, as needed
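
For example, a stricter and shorter-lived cache could be configured like this (again assuming RedisVL's SemanticCache; the values are only illustrative):

from redisvl.extensions.llmcache import SemanticCache

cache = SemanticCache(
    name="llmcache",
    redis_url="redis://localhost:6379",
    distance_threshold=0.05,  # stricter: only near-identical prompts are cache hits
    ttl=600,                  # entries expire after 10 minutes
)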

License

MIT
