clarkandrew/SubGraphPlus

SubgraphRAG+


Production-Ready Knowledge Graph Question Answering with Hybrid Retrieval

License: Apache 2.0 Python 3.9+ Docker Neo4j FastAPI Tests Demo

πŸš€ Quick Start β€’ πŸ“– Documentation β€’ πŸ—οΈ Architecture β€’ 🀝 Contributing β€’ 🎯 Demo


🌟 Overview

SubgraphRAG+ is an advanced knowledge graph-powered question answering system that combines structured graph traversal with semantic vector search. It provides contextual answers with real-time visualizations through a production-ready REST API.

✨ Key Features

  • πŸ”€ Hybrid Retrieval: Combines Neo4j graph traversal with FAISS vector search
  • πŸ”„ Real-time Ingestion: Dynamic knowledge graph updates with validation
  • πŸ“‘ Streaming API: Server-sent events with live citations and graph data
  • πŸ“Š Interactive Visualization: D3.js-compatible graph data with relevance scoring
  • 🧠 Multi-LLM Support: OpenAI, HuggingFace, Anthropic, MLX (Apple Silicon)
  • ⚑ High Performance: Optimized with caching, indexing, and MLP scoring
  • 🏒 Production Ready: Docker deployment, monitoring, health checks
  • 🎯 Easy Demo: One-command setup with progress indicators

πŸ—οΈ Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   FastAPI       β”‚    β”‚  Hybrid         β”‚    β”‚  Knowledge      β”‚
β”‚   REST API      │───▢│  Retriever      │───▢│  Graph (Neo4j)  β”‚
β”‚                 β”‚    β”‚                 β”‚    β”‚                 β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         β”‚                       β”‚                       β”‚
         β”‚                       β–Ό                       β”‚
         β”‚              β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”              β”‚
         β”‚              β”‚  Vector Index   β”‚              β”‚
         β–Ό              β”‚  (FAISS)        β”‚              β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   LLM Backend   β”‚                           β”‚  MLP Scoring    β”‚
β”‚  (OpenAI/HF/MLX)β”‚                           β”‚  Model          β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                           β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
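The retriever merges two candidate pools: triples reached by Neo4j graph traversal and triples surfaced by FAISS vector similarity. A minimal sketch of the score-fusion idea follows; the function name, the `alpha` weighting, and the example scores are illustrative, not the project's actual implementation:

```python
def fuse_scores(graph_hits, vector_hits, alpha=0.5):
    """Blend graph-traversal hits with vector-similarity hits.

    graph_hits / vector_hits: dicts mapping a triple id to a score in [0, 1].
    alpha weights the graph side; (1 - alpha) weights the vector side.
    """
    candidates = set(graph_hits) | set(vector_hits)
    fused = {
        cid: alpha * graph_hits.get(cid, 0.0) + (1 - alpha) * vector_hits.get(cid, 0.0)
        for cid in candidates
    }
    # Highest fused score first
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

# A triple found by both sources outranks one found by only one
ranked = fuse_scores({"t1": 0.9, "t2": 0.4}, {"t1": 0.8, "t3": 0.7})
print(ranked[0][0])  # t1
```

In practice the fused candidates are then re-ranked by the MLP scoring model before subgraph assembly.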

πŸš€ Quick Start

1. Environment Setup

# Clone the repository
git clone <repository-url>
cd SubGraphPlus

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

2. Configuration

SubgraphRAG+ uses a clean two-tier configuration system that separates secrets from application settings:

  • .env: Secrets and environment-specific values (never commit to git)
  • config/config.json: Application settings, models, and parameters (version controlled)

This separation follows security best practices and makes deployment across environments simple.

Quick Setup

# Copy the example and customize with your secrets
cp .env.example .env
nano .env  # Add your actual credentials and API keys

Application Configuration (config/config.json)

The main configuration file controls all application behavior:

{
  "models": {
    "backend": "mlx",
    "llm": {
      "mlx": {
        "model": "mlx-community/Qwen3-14B-8bit",
        "max_tokens": 512,
        "temperature": 0.1
      }
    },
    "embeddings": {
      "model": "Alibaba-NLP/gte-large-en-v1.5",
      "backend": "transformers"
    }
  },
  "retrieval": {
    "token_budget": 4000,
    "max_dde_hops": 2,
    "similarity_threshold": 0.7
  }
}

Environment Variables (.env)

Contains only secrets and environment-specific values:

# === Database Credentials ===
NEO4J_URI=neo4j://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=your_secure_password

# === API Security ===
API_KEY_SECRET=your_secret_key

# === API Keys ===
OPENAI_API_KEY=sk-your-key  # Required for OpenAI backend
HF_TOKEN=hf_your-token      # Optional for private HF models

# === Environment ===
ENVIRONMENT=development
LOG_LEVEL=INFO
DEBUG=false

Key Principles:

  • Secrets in .env - Never commit credentials to version control
  • Settings in config.json - Application configuration is version controlled
  • No duplication - Each setting has one clear location
  • Environment overrides - .env can override config.json defaults when needed
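The override behaviour described above can be sketched as a small lookup helper. The helper and the `TOKEN_BUDGET` variable name are illustrative assumptions, not the project's actual loader:

```python
import json
import os

def load_setting(config, dotted_key, env_var=None, default=None):
    """Read a nested config.json value, letting an environment variable win.

    dotted_key: e.g. "retrieval.token_budget"; env_var: optional override name.
    """
    if env_var and env_var in os.environ:
        return os.environ[env_var]
    node = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node

config = json.loads('{"retrieval": {"token_budget": 4000}}')
print(load_setting(config, "retrieval.token_budget"))  # 4000
os.environ["TOKEN_BUDGET"] = "2000"
print(load_setting(config, "retrieval.token_budget", "TOKEN_BUDGET"))  # 2000
```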

3. Database Setup

Start Neo4j database:

# Using Docker
docker run \
    --name neo4j \
    -p 7474:7474 -p 7687:7687 \
    -d \
    -e NEO4J_AUTH=neo4j/your_password \
    neo4j:latest

# Or use Neo4j Desktop/AuraDB and update NEO4J_URI in .env

4. Run the Application

# Start the FastAPI server
python -m uvicorn src.app.api:app --reload --host 0.0.0.0 --port 8000

The API will be available at http://localhost:8000

🎯 Demo

Quick Demo (Recommended)

The fastest way to see SubgraphRAG+ in action:

# Clone the repository
git clone https://github.com/your-username/SubgraphRAGPlus.git
cd SubgraphRAGPlus

# Setup and run demo (one command!)
./bin/setup_dev.sh --run-demo

# Or inspect the demo script's options directly
python examples/demo_quickstart.py --help

Demo Features

  • πŸ“‹ Progress Indicators: Clear step-by-step feedback (πŸ“‹ Step 1/6, βœ… completed)
  • ⚑ Smart Performance: Skips data ingestion if already present
  • πŸ”§ Flexible Options: --skip-neo4j, --skip-data, custom ports
  • 🎯 Fast Startup: Optimized health checks and server startup
  • πŸ’‘ Helpful Errors: Clear guidance when things go wrong

Demo Options

# Full demo with all components
python examples/demo_quickstart.py

# Skip data ingestion if already present
python examples/demo_quickstart.py --skip-data

# Skip Neo4j for CI/testing environments
python examples/demo_quickstart.py --skip-neo4j

# Custom port
python examples/demo_quickstart.py --port 8080

# Minimal demo for quick testing
python examples/demo_quickstart.py --skip-neo4j --skip-data --port 8001

What the Demo Shows

  1. πŸ”§ Environment Setup: Automatic dependency and configuration setup
  2. πŸ—„οΈ Database Connection: Neo4j connectivity and schema migration
  3. 🧠 Model Loading: MLP model detection and validation
  4. πŸ“₯ Data Ingestion: Sample knowledge graph population (if needed)
  5. πŸš€ API Server: FastAPI server startup with health checks
  6. πŸ§ͺ Live Query: Demonstration query with real-time response

The demo provides a complete end-to-end experience in under 2 minutes!

πŸ“‹ API Documentation

Health & Monitoring

  • GET /healthz - Basic health check
  • GET /readyz - Readiness check (includes model status)
  • GET /metrics - Prometheus metrics

Core Endpoints

  • POST /query - Ask questions using RAG
  • POST /ingest - Add documents to knowledge base
  • POST /feedback - Provide feedback on responses
  • GET /graph/browse - Browse knowledge graph

Authentication

All endpoints (except health/metrics) require API key authentication:

curl -X POST http://localhost:8000/query \
  -H "X-API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"question": "What is machine learning?"}'

🎨 Frontend Interface

SubgraphRAG+ includes a modern Next.js frontend with shadcn/ui components for interactive knowledge graph exploration and chat-based querying.

Features

  • πŸ’¬ Interactive Chat Interface: Real-time conversation with the knowledge graph
  • πŸ“Š Knowledge Graph Visualization: Interactive D3.js-powered graph exploration
  • πŸ“ˆ Analytics Dashboard: Query performance and system metrics
  • πŸ” Document Management: Upload and manage knowledge base documents
  • ⚑ Real-time Updates: Server-sent events for live response streaming
  • 🎨 Modern UI: Built with Next.js 15, React 19, and Tailwind CSS

Quick Start

# Navigate to frontend directory
cd frontend

# Install dependencies
npm install

# Create environment configuration
cp .env.local.example .env.local

# Configure API endpoint
echo "NEXT_PUBLIC_API_URL=http://localhost:8000" >> .env.local
echo "NEXT_PUBLIC_API_KEY=your_api_key" >> .env.local

# Start development server
npm run dev

The frontend will be available at http://localhost:3000

Frontend Architecture

frontend/
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ app/                 # Next.js App Router pages
β”‚   β”œβ”€β”€ components/          # Reusable UI components
β”‚   β”‚   β”œβ”€β”€ ui/             # shadcn/ui base components
β”‚   β”‚   β”œβ”€β”€ chat-support.tsx # Chat interface
β”‚   β”‚   β”œβ”€β”€ data-table.tsx  # Knowledge graph browser
β”‚   β”‚   └── ...
β”‚   β”œβ”€β”€ hooks/              # Custom React hooks
β”‚   β”œβ”€β”€ lib/                # Utility functions
β”‚   └── util/               # Helper utilities
β”œβ”€β”€ public/                 # Static assets
└── package.json           # Dependencies and scripts

Key Components

  • Chat Interface (chat-support.tsx): Real-time chat with SSE streaming
  • Graph Visualization (data-table.tsx): Interactive knowledge graph browser
  • Navigation (app-sidebar.tsx): Application navigation and user management
  • Analytics (chart-area-interactive.tsx): Performance metrics and insights

Development

# Install dependencies
npm install

# Run development server with hot reload
npm run dev

# Build for production
npm run build

# Start production server
npm start

# Run linting
npm run lint

Environment Configuration

Create .env.local with your API configuration:

# API Configuration
NEXT_PUBLIC_API_URL=http://localhost:8000
NEXT_PUBLIC_API_KEY=your_api_key

# Optional: Analytics and monitoring
NEXT_PUBLIC_ANALYTICS_ID=your_analytics_id

Deployment

The frontend can be deployed to any platform supporting Next.js:

# Build for production
npm run build

# Deploy to Vercel (recommended)
npx vercel

# Or deploy to other platforms
npm start  # Runs production server on port 3000

βš™οΈ Configuration

SubgraphRAG+ uses a clean two-tier configuration system that separates secrets from application settings:

  • .env: Secrets and environment-specific values (never commit to git)
  • config/config.json: Application settings, models, and parameters (version controlled)

This separation follows security best practices and makes deployment across environments simple.

Quick Setup

# Copy the example and customize with your secrets
cp .env.example .env
nano .env  # Add your actual credentials and API keys

Application Configuration (config/config.json)

The main configuration file controls all application behavior:

{
  "models": {
    "backend": "mlx",
    "llm": {
      "mlx": {
        "model": "mlx-community/Qwen3-14B-8bit",
        "max_tokens": 512,
        "temperature": 0.1
      }
    },
    "embeddings": {
      "model": "Alibaba-NLP/gte-large-en-v1.5",
      "backend": "transformers"
    }
  },
  "retrieval": {
    "token_budget": 4000,
    "max_dde_hops": 2,
    "similarity_threshold": 0.7
  }
}

Environment Variables (.env)

Contains only secrets and environment-specific values:

# === Database Credentials ===
NEO4J_URI=neo4j://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=your_secure_password

# === API Security ===
API_KEY_SECRET=your_secret_key

# === API Keys ===
OPENAI_API_KEY=sk-your-key  # Required for OpenAI backend
HF_TOKEN=hf_your-token      # Optional for private HF models

# === Environment ===
ENVIRONMENT=development
LOG_LEVEL=INFO
DEBUG=false

Key Principles:

  • Secrets in .env - Never commit credentials to version control
  • Settings in config.json - Application configuration is version controlled
  • No duplication - Each setting has one clear location
  • Environment overrides - .env can override config.json defaults when needed

πŸ§ͺ Testing

SubgraphRAG+ features a high-performance test suite with 99.9% faster execution through optimized model loading and comprehensive mocking.

Quick Testing Commands

# Fast tests (recommended for development) - ~0.16s
make test-fast

# Full test suite with verbose logging
make test-verbose

# Standard test run with optimizations
make test

# Single test with debugging
TESTING=1 LOG_LEVEL=DEBUG python -m pytest tests/test_api.py::TestHealthEndpoints::test_readiness_check_success -v -s

Performance Improvements

Test Type             Before               After     Improvement
Fast Tests            ~5+ minutes          ~0.16s    99.9% faster
Individual API Test   ~30+ seconds         ~0.5s     98% faster
Test Collection       Segmentation fault   Instant   Fixed

Environment Variables

  • TESTING=1: Enables testing mode with model loading disabled
  • DISABLE_MODELS=1: Explicitly disables all model loading
  • LOG_LEVEL=DEBUG: Enables detailed logging for debugging
  • FAST_TEST_MODE=1: Optimizes for fastest possible test execution
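A guard like the following is one way such flags are typically honoured in a conftest.py or model-loading module; the `models_enabled` helper is a hypothetical sketch, not the repository's actual code:

```python
import os

def models_enabled():
    """Decide whether heavyweight model loading should run in this process."""
    if os.environ.get("TESTING") == "1":
        return False
    if os.environ.get("DISABLE_MODELS") == "1":
        return False
    return True

os.environ["TESTING"] = "1"
print(models_enabled())  # False
```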

Test Categories

# API endpoint tests
python -m pytest tests/test_api.py -v

# Core functionality tests  
python -m pytest tests/test_basic.py -v

# ML model tests (with mocking)
python -m pytest tests/test_mlp_model.py -v

# Embedding tests
python -m pytest tests/test_embedder.py -v

# Run with coverage
python -m pytest --cov=src tests/

See Testing Improvements Guide for detailed technical information about the performance optimizations.

πŸ—οΈ Architecture

Core Components

  • API Layer (src/app/api.py): FastAPI application with endpoints
  • Configuration (src/app/config.py): Centralized configuration management
  • Database (src/app/database.py): Neo4j and SQLite database interfaces
  • ML Models (src/app/ml/): LLM and embedding model abstractions
  • Retrieval (src/app/retriever.py): RAG retrieval logic
  • Utils (src/app/utils.py): Shared utilities

Data Flow

  1. Ingestion: Documents β†’ Embeddings β†’ Neo4j Graph + Vector Index
  2. Query: Question β†’ Embedding β†’ Graph Retrieval β†’ LLM β†’ Response
  3. Feedback: User feedback β†’ SQLite β†’ Model improvement
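The query flow above can be sketched end to end. Every callable here is a trivial stand-in for the corresponding component (embedder, retriever, LLM backend), not the project's real API:

```python
def answer(question, embed, retrieve_subgraph, llm):
    """Query flow: question -> embedding -> graph retrieval -> LLM -> response."""
    q_vec = embed(question)
    triples = retrieve_subgraph(q_vec)
    context = "\n".join(f"({s}, {r}, {o})" for s, r, o in triples)
    return llm(f"Context:\n{context}\n\nQuestion: {question}")

# Wire in trivial stand-ins to show the shape of the pipeline
reply = answer(
    "Who parted the Red Sea?",
    embed=lambda q: [0.1, 0.2],
    retrieve_subgraph=lambda v: [("Moses", "parted", "Red Sea")],
    llm=lambda prompt: "Moses" if "Moses" in prompt else "unknown",
)
print(reply)  # Moses
```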

πŸ”§ Development

Adding New LLM Backends

  1. Create a new class in src/app/ml/llm.py implementing the LLMInterface
  2. Add backend configuration to config/config.json
  3. Update the factory function in get_llm_model()

Adding New Embedding Backends

  1. Create a new class in src/app/ml/embedder.py implementing the EmbedderInterface
  2. Add backend configuration to config/config.json
  3. Update the factory function in get_embedder()
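Assuming the interface exposes a generate method (the exact signature in src/app/ml/llm.py may differ), a new LLM backend and its factory registration would look roughly like this; `EchoLLM` and the `BACKENDS` dict are illustrative:

```python
class LLMInterface:
    """Stand-in for the interface in src/app/ml/llm.py (actual methods may differ)."""
    def generate(self, prompt: str, max_tokens: int = 512) -> str:
        raise NotImplementedError

class EchoLLM(LLMInterface):
    """Toy backend: a real one would call its provider's SDK here."""
    def __init__(self, config: dict):
        self.config = config

    def generate(self, prompt: str, max_tokens: int = 512) -> str:
        return prompt[:max_tokens]

# The factory (get_llm_model in the real code) maps a backend name to a class
BACKENDS = {"echo": EchoLLM}

def get_llm_model(name: str, config: dict) -> LLMInterface:
    return BACKENDS[name](config)

print(get_llm_model("echo", {}).generate("hello"))  # hello
```

The embedder pattern in src/app/ml/embedder.py follows the same shape with get_embedder().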

Configuration Schema

The configuration system supports:

  • Type validation: Automatic type checking and conversion
  • Environment overrides: Override any config value via environment variables
  • Nested configurations: Hierarchical settings with dot notation
  • Default values: Fallback values for optional settings

πŸ“Š Monitoring

Prometheus Metrics

Available at /metrics:

  • HTTP request metrics
  • Response times
  • Error rates
  • Custom application metrics

Logging

Structured logging with configurable levels:

LOG_LEVEL=DEBUG  # DEBUG, INFO, WARNING, ERROR
LOG_FILE=logs/app.log  # Optional file output

πŸš€ Deployment

Docker

FROM python:3.9-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt

COPY . .
EXPOSE 8000

CMD ["uvicorn", "src.app.api:app", "--host", "0.0.0.0", "--port", "8000"]

Environment-Specific Configs

Create different config files for each environment:

  • config/config.json (default)
  • config/config.production.json
  • config/config.staging.json

Set CONFIG_FILE environment variable to override:

CONFIG_FILE=config/config.production.json python -m uvicorn src.app.api:app

πŸ”’ Security

  • API Key Authentication: All endpoints protected
  • Input Validation: Pydantic models for request validation
  • Rate Limiting: Built-in FastAPI rate limiting
  • CORS: Configurable cross-origin resource sharing

πŸ“ˆ Performance

Optimization Features

  • Lazy Loading: Models loaded only when needed
  • Connection Pooling: Efficient database connections
  • Caching: Response and embedding caching
  • Apple Silicon: MLX backend for M1/M2/M3 optimization

Benchmarks

  • Cold Start: ~2-3 seconds (with model loading)
  • Query Response: ~200-500ms (cached embeddings)
  • Ingestion: ~100-200 docs/minute

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Ensure all tests pass: TESTING=1 python -m pytest
  5. Submit a pull request


πŸ†˜ Troubleshooting

Common Issues

Import Errors: Ensure all dependencies are installed and virtual environment is activated

Database Connection: Verify Neo4j is running and credentials in .env are correct

Model Loading: Check model names in config/config.json and ensure sufficient disk space

API Key Issues: Verify API_KEY_SECRET is set and using correct header format

Getting Help

  • Check the logs: tail -f logs/app.log
  • Run health checks: curl http://localhost:8000/healthz
  • Test configuration: TESTING=1 python -c "from src.app.config import config; print(config)"

πŸ“– Documentation

Document                   Description                     Audience
πŸ“š Documentation Hub       Complete documentation index    All users
πŸ› οΈ Installation Guide      Detailed setup instructions     New users
πŸ—οΈ Architecture Guide      System design and components    Developers, Architects
πŸ”§ Development Guide       Contributing and local dev      Contributors
πŸš€ Deployment Guide        Production deployment           DevOps, SysAdmins
πŸ“‘ API Reference           Complete API documentation      Integrators
πŸ”§ Configuration           Settings and environment vars   All users
🩺 Troubleshooting         Common issues and solutions     All users

🍎 Apple Silicon Users

For optimized performance on M1/M2/M3 Macs:

  • See MLX Integration Guide for native Apple Silicon acceleration
  • Use ./bin/setup_dev.sh which auto-detects and configures MLX

πŸ› οΈ System Requirements

Minimum Requirements

  • OS: Linux, macOS, or Windows with WSL2
  • Python: 3.9+ (tested up to 3.13)
  • Memory: 4GB RAM
  • Storage: 10GB free space
  • Docker: 20.10+ with Compose v2 (for Docker setup)

Recommended for Production

  • CPU: 4+ cores
  • Memory: 8GB+ RAM
  • Storage: 50GB+ SSD
  • Network: Stable internet connection for LLM APIs

🚦 API Usage

Basic Query

import json
import requests

# Query with graph visualization (stream=True so the SSE response is not buffered)
response = requests.post(
    "http://localhost:8000/query",
    headers={"X-API-Key": "your-api-key"},
    json={
        "question": "What is machine learning?",
        "visualize_graph": True,
        "max_context_triples": 50
    },
    stream=True,
)

# Stream the response line by line
for line in response.iter_lines():
    if line:
        data = json.loads(line.decode("utf-8"))
        print(f"Type: {data['type']}, Content: {data['content']}")

Health Check

# Basic health check
curl http://localhost:8000/healthz

# Comprehensive readiness check
curl http://localhost:8000/readyz

Graph Browsing

# Browse the knowledge graph
curl "http://localhost:8000/graph/browse?limit=100&search_term=AI" \
  -H "X-API-Key: your-api-key"

πŸ”§ Development Workflow

Daily Development Commands

# Start development server
make serve                    # or: python src/main.py --reload

# Run tests
make test                     # Run full test suite
make test-smoke              # Quick smoke tests
make test-api                # API integration tests

# Code quality
make lint                     # Check code style
make format                   # Auto-format code

# Database operations
make neo4j-start             # Start Neo4j container
make migrate-schema          # Apply database migrations
make ingest-sample           # Load sample data

Project Structure

SubgraphRAGPlus/
β”œβ”€β”€ πŸ“ src/                   # Application source code
β”‚   β”œβ”€β”€ πŸ“„ main.py           # Application entry point
β”‚   └── πŸ“ app/              # Core application modules
β”‚       β”œβ”€β”€ πŸ“„ api.py        # FastAPI REST endpoints
β”‚       β”œβ”€β”€ πŸ“„ retriever.py  # Hybrid retrieval engine
β”‚       β”œβ”€β”€ πŸ“„ database.py   # Neo4j & SQLite connections
β”‚       └── πŸ“ ml/           # ML models (LLM, embeddings, MLP)
β”œβ”€β”€ πŸ“ bin/                  # Setup and utility scripts
β”œβ”€β”€ πŸ“ scripts/              # Python utilities and tools
β”œβ”€β”€ πŸ“ tests/                # Comprehensive test suite
β”œβ”€β”€ πŸ“ docs/                 # Documentation
β”œβ”€β”€ πŸ“ config/               # Configuration files
β”œβ”€β”€ πŸ“ deployment/           # Docker and infrastructure
β”œβ”€β”€ πŸ“„ Makefile             # Development commands
└── πŸ“„ requirements.txt     # Python dependencies

🏒 Production Deployment

Docker Production

# Production deployment
cd deployment/
docker-compose -f docker-compose.prod.yml up -d

# Scale API instances
docker-compose -f docker-compose.prod.yml up -d --scale api=3

# Monitor services
docker-compose -f docker-compose.prod.yml logs -f

Environment Configuration

SubgraphRAG+ uses a hybrid configuration approach following security best practices:

  • .env - Secrets and environment-specific values
  • config/config.json - Application settings and model configurations

πŸ”’ Environment Variables (.env)

Essential production environment variables for secrets and environment-specific settings:

# === Database Credentials ===
NEO4J_URI=neo4j://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=your-secure-production-password

# === API Security ===
API_KEY_SECRET=your-secure-api-key

# === API Keys ===
OPENAI_API_KEY=your-openai-api-key  # Required if using OpenAI backend
HF_TOKEN=your-hf-token              # Optional, for private HuggingFace models

# === Environment Settings ===
ENVIRONMENT=production              # development|staging|production
LOG_LEVEL=INFO                      # DEBUG|INFO|WARNING|ERROR|CRITICAL
DEBUG=false

# === Optional: Custom Model Paths ===
# MLX_LLM_MODEL_PATH=/path/to/custom/mlx/model
# HF_MODEL_PATH=/path/to/custom/hf/model

βš™οΈ Application Configuration (config/config.json)

Model settings and application configuration:

{
  "models": {
    "backend": "mlx",
    "llm": {
      "mlx": {
        "model": "mlx-community/Qwen3-14B-8bit",
        "max_tokens": 512,
        "temperature": 0.1,
        "top_p": 0.9
      },
      "openai": {
        "model": "gpt-3.5-turbo",
        "max_tokens": 512,
        "temperature": 0.1,
        "top_p": 0.9
      }
    },
    "embeddings": {
      "model": "Alibaba-NLP/gte-large-en-v1.5",
      "backend": "transformers"
    }
  },
  "retrieval": {
    "token_budget": 4000,
    "max_dde_hops": 2,
    "similarity_threshold": 0.7
  },
  "performance": {
    "cache_size": 1000,
    "api_rate_limit": 60,
    "timeout_seconds": 30
  }
}

πŸ”‘ Configuration Best Practices

  1. Never commit secrets: Keep .env in .gitignore
  2. Use environment overrides: Local .env can override config.json defaults
  3. Embedding consistency: Always use transformers backend for embeddings (never MLX)
  4. Backend separation: MLX for LLM only, transformers for embeddings only

🍎 Apple Silicon (MLX) Configuration

For optimal performance on M1/M2/M3 Macs:

# In .env
LOG_LEVEL=DEBUG  # To see MLX initialization logs

And in config/config.json:
{
  "models": {
    "backend": "mlx",
    "llm": {
      "mlx": {
        "model": "mlx-community/Qwen3-14B-8bit",
        "max_tokens": 1024,
        "temperature": 0.1
      }
    },
    "embeddings": {
      "model": "Alibaba-NLP/gte-large-en-v1.5",
      "backend": "transformers"
    }
  }
}

Monitoring Endpoints

  • Health Check: GET /healthz - Basic liveness probe
  • Readiness Check: GET /readyz - Dependency health with detailed status
  • Metrics: GET /metrics - Prometheus-compatible metrics
  • API Docs: GET /docs - Interactive OpenAPI documentation
  • Neo4j Browser: http://localhost:7474 - Database management interface

πŸ§ͺ Testing

Running Tests

# Full test suite
make test

# Specific test categories
python -m pytest tests/test_api.py -v          # API tests
python -m pytest tests/test_retriever.py -v    # Retrieval tests
python -m pytest tests/test_mlp_model.py -v    # MLP model tests

# With coverage report
python -m pytest --cov=src tests/ --cov-report=html

Test Structure

  • Unit Tests: Individual component testing
  • Integration Tests: Multi-component workflows
  • API Tests: REST endpoint validation
  • Smoke Tests: Basic system functionality
  • Performance Tests: Benchmarking and load testing

🀝 Contributing

We welcome contributions! Here's how to get started:

Quick Contribution Setup

# 1. Fork and clone
git clone https://github.com/your-username/SubgraphRAGPlus.git
cd SubgraphRAGPlus

# 2. Setup development environment
./bin/setup_dev.sh --run-tests

# 3. Create a feature branch
git checkout -b feature/your-feature-name

# 4. Make changes and test
make test
make lint

# 5. Submit a pull request

Development Guidelines

  • Code Style: Follow PEP 8 with Black formatting
  • Testing: Add tests for new features
  • Documentation: Update docs for user-facing changes
  • Commits: Use conventional commit messages

See Contributing Guide for detailed information.


πŸ“œ License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.




πŸ™ Acknowledgments

  • Original SubgraphRAG research by Microsoft Research
  • Neo4j and FAISS communities for graph and vector database technologies
  • FastAPI, PyTorch, and Python ecosystem contributors
  • Contributors and users of this project

⭐ Star this repository if you find it useful!

Made with ❀️ for the Knowledge Graph community

πŸš€ Key Improvements Over Original SubgraphRAG

1. Production-Grade Information Extraction

  • REBEL IE Service: Uses Babelscape/rebel-large for proper triple extraction from raw text
  • Schema-Driven Entity Typing: Replaces naive string heuristics with authoritative type mappings
  • Domain Adaptability: Works with Biblical text, legal documents, scientific papers, etc.
  • Offline Operation: No external API dependencies, fully self-contained

Key Distinction:

  • REBEL: Extracts relations (Jesus β†’ place of birth β†’ Bethlehem)
  • Entity Typing: Classifies entity types (Jesus:Person, Bethlehem:Location)
  • Combined: (Jesus:Person) --[place of birth]--> (Bethlehem:Location)
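The combination step above can be sketched as a lookup over a schema-driven type map. The `TYPE_MAP` contents and the helper are illustrative; the real system derives types from authoritative schema mappings:

```python
TYPE_MAP = {"Jesus": "Person", "Bethlehem": "Location"}  # schema-driven in practice

def type_triple(head, relation, tail, type_map, default="Entity"):
    """Attach entity types to a raw REBEL-style (head, relation, tail) triple."""
    h_type = type_map.get(head, default)
    t_type = type_map.get(tail, default)
    return f"({head}:{h_type}) --[{relation}]--> ({tail}:{t_type})"

print(type_triple("Jesus", "place of birth", "Bethlehem", TYPE_MAP))
# (Jesus:Person) --[place of birth]--> (Bethlehem:Location)
```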

2. Dynamic Knowledge Graph Construction

  • Live Ingestion Pipeline: Build KGs from any text corpus in real-time
  • Incremental Updates: Add new content without rebuilding entire graph
  • Quality Control: Deduplication, validation, and error handling

3. Enhanced Retrieval & Reasoning

  • Hybrid Retrieval: Combines graph traversal with dense vector search
  • MLP-Based Scoring: Uses original SubgraphRAG MLP (no retraining needed)
  • Budget-Aware Assembly: Optimizes subgraph size for LLM context windows
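Budget-aware assembly can be sketched as a greedy fill: take triples in MLP-score order until the token budget is spent. The word-count cost function and the example scores are illustrative, not the project's actual tokenizer or scorer:

```python
def assemble_subgraph(scored_triples, token_budget, cost=lambda t: len(t.split())):
    """Greedily pack the highest-scoring triples under a token budget.

    scored_triples: list of (triple_text, mlp_score) pairs.
    """
    chosen, spent = [], 0
    for text, _score in sorted(scored_triples, key=lambda p: p[1], reverse=True):
        c = cost(text)
        if spent + c <= token_budget:
            chosen.append(text)
            spent += c
    return chosen

triples = [("Moses parted Red Sea", 0.9), ("Egypt borders Red Sea", 0.4),
           ("Pharaoh ruled Egypt", 0.7)]
print(assemble_subgraph(triples, token_budget=7))
```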

4. Enterprise-Ready Architecture

  • Microservices: Containerized IE service, API layer, database components
  • Monitoring: Health checks, metrics, logging, alerting
  • Scalability: Horizontal scaling, caching, batch processing

πŸ“– Quick Start with Biblical Text

# 1. Start the unified API (includes IE functionality)
uvicorn src.app.api:app --host 0.0.0.0 --port 8000

# 2. Ingest Biblical text (IE is integrated)
python scripts/ingest_with_ie.py data/genesis.txt --api-key your-api-key

# 3. Process staged triples
python scripts/ingest_worker.py --process-all

# 4. Query the knowledge graph
curl -X POST http://localhost:8000/query \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-api-key" \
  -d '{"question": "Who parted the Red Sea?"}'

The unified system will:

  1. Extract triples using integrated REBEL: (Moses, parted, Red Sea)
  2. Type entities using schema: Moses β†’ Person, Red Sea β†’ Location
  3. Build knowledge graph with proper relationships
  4. Answer queries with precise citations and subgraph evidence
