LLMpedia is a Streamlit application that provides a user-friendly interface to explore and interact with Large Language Model research papers.
- Browse and search LLM research papers from arXiv.
- View paper summaries, illustrations, and key insights generated by AI models.
- Explore topics through interactive visualizations (e.g., UMAP clusters).
- Chat with an AI assistant about LLM research, with configurable models and source citations.
- Access weekly research reviews summarizing key developments.
- Discover related code repositories.
The data displayed in LLMpedia originates primarily from arXiv. Raw paper data is processed, analyzed, and enriched through a separate pipeline managed in the llmpedia_workflows repository. This includes:
- Fetching new papers
- Generating summaries and artwork
- Extracting key information
- Topic modeling and embedding generation
The processed data is then stored in the PostgreSQL database used by this application.
- Clone this repository:
  `git clone <repository_url>`
- Navigate to the project directory:
  `cd llmpedia`
- Install dependencies:
  `pip install -r requirements.txt`
- Copy the environment template:
  `cp .env.template .env`
- Configure your environment variables in the `.env` file (see below).
- Ensure you have access to the PostgreSQL database populated by the `llmpedia_workflows` pipeline.
- Run the Streamlit app locally:
  `streamlit run app.py`
This app requires access to a PostgreSQL database containing LLMpedia data. Configure the database connection in the .env file.
The following environment variables are required to run the app:
# Database Configuration
DB_NAME
DB_HOST
DB_PORT
DB_USER
DB_PASSWORD
# LLM API Keys (for chat functionality)
OPENAI_API_KEY
ANTHROPIC_API_KEY
COHERE_API_KEY
HUGGINGFACE_API_KEY
GROQ_API_KEY
# AWS Configuration (for S3 access)
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
AWS_REGION
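As a minimal sketch of how the database variables above fit together, the snippet below builds a standard PostgreSQL connection URL from them. The variable names come from this README; the `build_db_url` helper itself is hypothetical, not part of the app's actual code.

```python
import os

# Hypothetical helper: assembles a standard PostgreSQL URL
# (as used by SQLAlchemy/psycopg2) from the environment
# variables listed in this README.
def build_db_url() -> str:
    return "postgresql://{user}:{password}@{host}:{port}/{name}".format(
        user=os.environ["DB_USER"],
        password=os.environ["DB_PASSWORD"],
        host=os.environ["DB_HOST"],
        port=os.environ["DB_PORT"],
        name=os.environ["DB_NAME"],
    )

# Example with placeholder values (in practice these come from .env):
os.environ.update({
    "DB_USER": "llmpedia",
    "DB_PASSWORD": "secret",
    "DB_HOST": "localhost",
    "DB_PORT": "5432",
    "DB_NAME": "llmpedia",
})
print(build_db_url())
# -> postgresql://llmpedia:secret@localhost:5432/llmpedia
```

In a local setup, a tool such as `python-dotenv` (or Streamlit's own environment handling) would load `.env` before these variables are read.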