
WebRAgent is a retrieval-augmented generation (RAG) web application featuring agent-based query decomposition, vector search with Qdrant, and integration with leading LLM providers for context-rich, dynamic responses.


πŸ” WebRAgent

A Retrieval-Augmented Generation (RAG) web application built with Flask and Qdrant.

Β© 2024 Dennis Kruyt. All rights reserved.

πŸ“‹ Overview

This application implements a RAG system that combines the power of Large Language Models (LLMs) with a vector database (Qdrant) to provide context-enhanced responses to user queries. It features:

  • πŸ’¬ User query interface for asking questions
  • πŸ” Admin interface for managing document collections
  • πŸ“„ Document processing and embedding
  • πŸ€– Integration with multiple LLM providers (OpenAI, Claude, Ollama)

✨ Features

  • πŸ–₯️ User Interface: Clean, intuitive interface to submit queries and receive LLM responses
  • 🌐 Web Search: Search the web directly using SearXNG integration with LLM result interpretation
  • πŸ€– Agent Search: Break down complex questions into sub-queries for more comprehensive answers
  • 🧠 Mind Maps: Visualize response concepts with automatically generated mind maps
  • πŸ”Ž Vector Search: Retrieve relevant document snippets based on semantic similarity
  • πŸ‘€ Admin Interface: Securely manage collections and upload documents
  • πŸ“ Document Processing: Automatically extract text, chunk, embed, and store documents
  • 🧠 Multiple LLM Support: Configure your preferred LLM provider (OpenAI, Claude, Ollama)
  • πŸ” Dynamic Embedding Models: Automatically detects and uses available embedding models from all configured providers

πŸ“‹ Prerequisites

  • 🐍 Python 3.8+
  • πŸ—„οΈ Qdrant running locally or remotely
  • πŸ”‘ API keys for your chosen LLM provider

πŸš€ Installation

πŸ’» Option 1: Local Installation

  1. Clone the repository:

    git clone https://github.com/dkruyt/WebRAgent.git
    cd WebRAgent
  2. Create and activate a virtual environment:

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:

    pip install -r requirements.txt
  4. Copy the example environment file and configure it with your settings:

    cp .env.example .env

    Then edit the .env file with your preferred settings. Here are the key settings to configure:

    # API Keys for LLM Providers (uncomment and add your keys for the providers you want to use)
    # At least one provider should be configured
    #OPENAI_API_KEY=your_openai_api_key_here
    #CLAUDE_API_KEY=your_claude_api_key_here
    
    # Ollama Configuration (uncomment to use Ollama)
    #OLLAMA_HOST=http://localhost:11434
    
    # Qdrant Configuration
    QDRANT_HOST=localhost
    QDRANT_PORT=6333
    
    # SearXNG Configuration
    SEARXNG_URL=http://searxng:8080
    
    # Flask Secret Key (generate a secure random key for production)
    FLASK_SECRET_KEY=change_me_in_production
    
    # Admin User Configuration
    ADMIN_USERNAME=admin
    ADMIN_PASSWORD=change_me_in_production
    

    Note: The system will automatically detect and use models from the providers you've configured:

    • If you set OPENAI_API_KEY, it will use OpenAI models for both LLM and embeddings
    • If you set CLAUDE_API_KEY, it will use Claude models for LLM
    • If you set OLLAMA_HOST, it will use Ollama models for both LLM and embeddings
    • Sentence Transformers will be used as fallback embedding models

    There's no need to manually specify which models to use; the system dynamically detects available models.

  5. Make sure you have Qdrant running locally or specify a remote instance in the .env file.

  6. If using Ollama, make sure it's running locally or specify a remote instance in the .env file.

  7. Start the application:

    python run.py
  8. Access the application at http://localhost:5000
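The dynamic provider detection described in step 4 boils down to checking which environment variables are set. A simplified sketch of the idea (the function name and return shape are illustrative, not WebRAgent's actual code):

```python
import os

def detect_providers(env=None):
    """Infer available LLM and embedding providers from environment variables."""
    env = os.environ if env is None else env
    providers = {"llm": [], "embedding": []}
    if env.get("OPENAI_API_KEY"):
        providers["llm"].append("openai")
        providers["embedding"].append("openai")
    if env.get("CLAUDE_API_KEY"):
        providers["llm"].append("claude")
    if env.get("OLLAMA_HOST"):
        providers["llm"].append("ollama")
        providers["embedding"].append("ollama")
    # Sentence Transformers is always available as an embedding fallback
    providers["embedding"].append("sentence-transformers")
    return providers
```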

🐳 Option 2: Docker Installation

This project includes Docker and Docker Compose configurations for easy deployment.

  1. Clone the repository:

    git clone https://github.com/dkruyt/WebRAgent.git
    cd WebRAgent
  2. Start the application with Docker Compose:

    docker-compose up -d
  3. The services defined in docker-compose.yml will then be available: the WebRAgent web interface at http://localhost:5000, plus the Qdrant and SearXNG backends it depends on.

  4. To shut down the application:

    docker-compose down

πŸ“₯ Pre-downloading Ollama Models

If you want to pre-download the Ollama models before starting the application:

    # For main LLM models
    ollama pull llama2
    ollama pull mistral
    ollama pull gemma
    
    # For embedding models
    ollama pull nomic-embed-text
    ollama pull all-minilm

The system will automatically detect these models if they're available in your Ollama installation.

πŸ“– Usage

πŸ” User Interface

  1. Navigate to the home page
  2. Choose your search method:
    • Document Search: Select a collection from the dropdown
    • Web Search: Toggle the "Web Search" option
  3. Enter your query in the text box
  4. Configure additional options (optional):
    • Generate Mind Map: Toggle to visualize concepts related to your query
    • Agent Search: Enable for complex questions that benefit from being broken down
    • Number of Results: Adjust how many results to retrieve
  5. Submit your query and view the response
  6. Explore source documents or web sources that informed the answer

🌐 Web Search

  1. Toggle the "Web Search" option on the main interface
  2. Enter your query
  3. The system will:
    • Search the web using SearXNG
    • Use an LLM to interpret and synthesize the search results
    • Present a comprehensive answer along with source links

πŸ€– Agent Search

  1. Enable the "Agent Search" checkbox
  2. Choose a strategy:
    • Direct Decomposition: Breaks down your question into targeted sub-queries
    • Informed Decomposition: Gets initial results first, then creates follow-up queries
  3. Submit your query to receive a comprehensive answer synthesized from multiple search operations
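Conceptually, agent search is a decompose–retrieve–synthesize loop. The sketch below shows only the control flow, with the LLM and retrieval steps abstracted as callables (all names are illustrative, not WebRAgent's actual API):

```python
def agent_search(question, decompose, retrieve, synthesize):
    """Direct decomposition: split the question, search each part, then combine.

    decompose(question)        -> list of sub-queries (normally an LLM call)
    retrieve(sub_query)        -> retrieved snippets for one sub-query
    synthesize(question, ctx)  -> final answer built from all gathered context
    """
    sub_queries = decompose(question)
    context = {q: retrieve(q) for q in sub_queries}
    return synthesize(question, context)
```

The "Informed Decomposition" strategy would add an initial retrieval pass on the original question, feeding those first results into the decomposition step.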

πŸ‘€ Admin Interface

  1. Login with the admin credentials configured in your .env file (ADMIN_USERNAME and ADMIN_PASSWORD; change the defaults before deploying)
  2. Create new collections from the admin dashboard
  3. Upload documents to collections
  4. Documents are automatically processed and made available for retrieval

πŸ› οΈ Technical Implementation

  • 🌐 Flask: Web framework for the application
  • πŸ—„οΈ Qdrant: Vector database for storing and retrieving document embeddings
  • πŸ” SearXNG: Self-hosted search engine for web search capabilities
  • πŸ€– Agent Framework: Custom implementation for query decomposition and synthesis
  • 🧠 Mind Map Generation: Visualization system for query responses
  • πŸ”€ Embedding Models:
    • SentenceTransformers: Local embedding models (always available as fallback)
    • OpenAI Embeddings: High-quality embeddings when API key is configured
    • Ollama Embeddings: Local embedding models when Ollama is configured
  • πŸ”Œ Model Management: Dynamic provider detection and configuration based on available environment variables
  • πŸ” Flask-Login: For admin authentication
  • πŸ“š Python Libraries: For document processing (PyPDF2, BeautifulSoup, etc.)
  • πŸ“„ Docling: Advanced document processing capability for extracting text from various file formats
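Document processing splits extracted text into overlapping chunks before embedding, so each vector stays within model input limits while overlap preserves context across chunk boundaries. A minimal sliding-window chunker illustrating the technique (sizes are illustrative; WebRAgent's actual parameters may differ):

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into fixed-size chunks that overlap to avoid cutting context."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]
```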

πŸ“‚ Project Structure

WebRAgent/
β”œβ”€β”€ app/
β”‚   β”œβ”€β”€ models/             # Data models
β”‚   β”‚   β”œβ”€β”€ chat.py         # Chat session models
β”‚   β”‚   β”œβ”€β”€ collection.py   # Document collection models
β”‚   β”‚   β”œβ”€β”€ document.py     # Document models and metadata
β”‚   β”‚   └── user.py         # User authentication models
β”‚   β”œβ”€β”€ routes/             # Route handlers
β”‚   β”‚   β”œβ”€β”€ admin.py        # Admin interface routes
β”‚   β”‚   β”œβ”€β”€ auth.py         # Authentication routes
β”‚   β”‚   β”œβ”€β”€ chat.py         # Chat interface routes
β”‚   β”‚   └── main.py         # Main application routes
β”‚   β”œβ”€β”€ services/           # Business logic
β”‚   β”‚   β”œβ”€β”€ agent_search_service.py     # Query decomposition and agent search
β”‚   β”‚   β”œβ”€β”€ chat_service.py             # Chat session management
β”‚   β”‚   β”œβ”€β”€ claude_service.py           # Anthropic Claude integration
β”‚   β”‚   β”œβ”€β”€ document_service.py         # Document processing
β”‚   β”‚   β”œβ”€β”€ llm_service.py              # LLM provider abstraction
β”‚   β”‚   β”œβ”€β”€ mindmap_service.py          # Mind map generation
β”‚   β”‚   β”œβ”€β”€ model_service.py            # Dynamic model management
β”‚   β”‚   β”œβ”€β”€ ollama_service.py           # Ollama integration
β”‚   β”‚   β”œβ”€β”€ openai_service.py           # OpenAI integration
β”‚   β”‚   β”œβ”€β”€ qdrant_service.py           # Vector database operations
β”‚   β”‚   β”œβ”€β”€ rag_service.py              # Core RAG functionality
β”‚   β”‚   β”œβ”€β”€ searxng_service.py          # Web search integration
β”‚   β”‚   └── web_search_agent_service.py # Web search with agent capabilities
β”‚   β”œβ”€β”€ static/             # CSS, JS, and other static files
β”‚   β”œβ”€β”€ templates/          # Jinja2 templates
β”‚   └── __init__.py         # Flask application factory
β”œβ”€β”€ data/                   # Created at runtime for data storage
β”‚   β”œβ”€β”€ collections/        # Collection metadata storage
β”‚   β”œβ”€β”€ documents/          # Document metadata storage
β”‚   β”œβ”€β”€ models/             # Model configuration storage
β”‚   β”‚   β”œβ”€β”€ config.json     # Dynamic model configuration
β”‚   β”‚   └── dimensions.json # Embedding model dimensions
β”‚   └── uploads/            # Uploaded document files
β”œβ”€β”€ searxng/                # SearXNG configuration
β”œβ”€β”€ .dockerignore           # Files to exclude from Docker build
β”œβ”€β”€ .env                    # Environment variables
β”œβ”€β”€ .env.example            # Example environment file
β”œβ”€β”€ .gitignore              # Git ignore patterns
β”œβ”€β”€ docker-compose.yml      # Docker Compose config
β”œβ”€β”€ docker-compose.gpu.yml  # Docker Compose config with GPU support
β”œβ”€β”€ Dockerfile              # Docker build instructions
β”œβ”€β”€ requirements.txt        # Project dependencies
β”œβ”€β”€ README.md               # Project documentation
└── run.py                  # Application entry point

πŸ”’ Security Notes

  • ⚠️ This application uses a simple in-memory user store for demo purposes
  • πŸ›‘οΈ In a production environment, use a proper database with password hashing
  • πŸ” Configure HTTPS for secure communication
  • πŸ”‘ Set a strong, unique FLASK_SECRET_KEY
  • 🚫 Do not expose admin routes to the public internet without proper security
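For production, replace the in-memory store with a database and persist only salted password hashes. A stdlib-only sketch using PBKDF2 (parameters are illustrative; in practice a maintained helper such as Werkzeug's security utilities or passlib is preferable):

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None, iterations=200_000):
    """Derive a salted PBKDF2-HMAC-SHA256 digest; store salt and digest, never the password."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, digest

def verify_password(password, salt, digest, iterations=200_000):
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    # Constant-time comparison avoids timing side channels
    return hmac.compare_digest(candidate, digest)
```

A strong FLASK_SECRET_KEY can be generated similarly, e.g. with `python -c "import secrets; print(secrets.token_hex(32))"`.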

πŸ“œ License

MIT

πŸ“ Copyright

Β© 2024 Dennis Kruyt. All rights reserved.

πŸ™ Acknowledgements
