Transform any website's documentation section into an MCP-compatible server using the Python MCP SDK.
AnyDocs MCP Server is a comprehensive solution that turns any website's documentation into an interactive, AI-accessible knowledge base through the Model Context Protocol (MCP). It can scrape, index, and serve documentation from any website - from modern API docs to legacy documentation portals.
- 🌐 Universal Website Scraping: Turn ANY website's documentation into an interactive knowledge base
- 🔌 Universal Adapter System: Support for GitBook, Notion, Confluence, and custom documentation platforms
- 🔍 Advanced Search: Full-text search with SQLite FTS and semantic search capabilities
- 🔐 Robust Authentication: API Key, OAuth2, and JWT-based authentication
- ⚡ High Performance: Async/await architecture with caching and rate limiting
- 🎛️ Web Management Interface: FastAPI-based admin panel for configuration and monitoring
- 📊 Real-time Monitoring: Health checks, metrics, and logging
- 🐳 Docker Ready: Complete containerization with development and production configurations
- 🔄 Auto-sync: Automatic content synchronization with source documentation
- Python 3.11+ (recommended: 3.11 or 3.12)
- SQLite 3.35+ (for FTS5 support)
- Optional: Redis (for caching)
- Optional: PostgreSQL/MySQL (for production)
The easiest way to run AnyDocs MCP Server is using uvx, which automatically manages dependencies and virtual environments:
# Run directly with uvx (no installation needed)
uvx anydocs-mcp-server
# Run with custom configuration
uvx anydocs-mcp-server --config config.yaml
# Run in debug mode
uvx anydocs-mcp-server --debug
# Install globally as a uv tool for repeated use
uv tool install anydocs-mcp-server
# Then run anytime with:
anydocs-mcp-server --config config.yaml
# Clone the repository
git clone https://github.com/funky1688/anydocs-mcp.git
cd anydocs-mcp
# Install dependencies using uv (recommended)
pip install uv
uv pip install -e .
# Copy environment configuration
cp .env.example .env
# Copy and customize configuration
cp config.yaml my-config.yaml
# Edit configuration (add your API keys and settings)
nano .env
# Start the server (hybrid mode - MCP + Web interface)
uv run python start.py
# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
uv pip install -e . # for production
# OR for development:
uv pip install -e ".[dev]" # quoted so shells such as zsh do not expand the brackets
# Copy and configure environment
cp .env.example .env
# Edit configuration files
nano .env
# Initialize and start
uv run python start.py --mode hybrid --debug
After installation, you can also run the server as a Python module:
# Run as module
python -m anydocs_mcp
# With configuration
python -m anydocs_mcp --config config.yaml
# Debug mode
python -m anydocs_mcp --debug
# Development environment
docker-compose -f docker-compose.dev.yml up -d
# Production environment
docker-compose up -d
AnyDocs MCP Server supports 3 startup modes:
Hybrid mode (the default) starts both the MCP server and the web management interface:
uv run python start.py
# or explicitly:
uv run python start.py --mode hybrid
- MCP Server: Available at http://localhost:8000 (handles MCP protocol communication)
- Web Interface: Available at http://localhost:8080 for management
- Best for: Most users who want both MCP functionality and web management
Important: Always use uv run to ensure the correct virtual environment is used.
MCP-only mode starts only the MCP server, without the web interface:
uv run python start.py --mode mcp
- Use case: Production deployments where only MCP protocol is needed
- Lighter resource usage: No web interface overhead
- Best for: Headless servers, CI/CD environments
Web-only mode starts only the web management interface:
uv run python start.py --mode web
- Use case: Administrative tasks, configuration management
- Web Interface: Available at http://localhost:8080
- Best for: Configuration, monitoring, and testing without MCP protocol
# Debug mode with auto-reload
uv run python start.py --debug
# Custom configuration file
uv run python start.py --config custom-config.yaml
# Skip dependency check (faster startup)
uv run python start.py --no-deps-check
# Kill occupied ports before starting
uv run python start.py --kill-ports
# Skip database initialization
uv run python start.py --no-db-init
uv run python start.py --help
# Available options:
#   --mode {mcp,web,hybrid}   Startup mode (default: hybrid)
#   --config CONFIG           Configuration file path (default: config.yaml)
#   --debug                   Enable debug mode
#   --no-deps-check           Skip dependency check
#   --no-db-init              Skip database initialization
#   --kill-ports              Kill occupied ports before starting
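To make the option set concrete, here is a minimal sketch of an argparse parser that accepts the same flags and defaults as the help output above. It is an illustration only, not the project's actual start.py; the function name build_parser is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented start.py options (illustrative)."""
    parser = argparse.ArgumentParser(prog="start.py")
    parser.add_argument("--mode", choices=["mcp", "web", "hybrid"], default="hybrid",
                        help="Startup mode (default: hybrid)")
    parser.add_argument("--config", default="config.yaml",
                        help="Configuration file path (default: config.yaml)")
    parser.add_argument("--debug", action="store_true", help="Enable debug mode")
    parser.add_argument("--no-deps-check", action="store_true",
                        help="Skip dependency check")
    parser.add_argument("--no-db-init", action="store_true",
                        help="Skip database initialization")
    parser.add_argument("--kill-ports", action="store_true",
                        help="Kill occupied ports before starting")
    return parser


if __name__ == "__main__":
    # Dashes in flag names become underscores on the parsed namespace.
    args = build_parser().parse_args(["--mode", "mcp", "--no-deps-check"])
    print(args.mode, args.config, args.no_deps_check)
```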
Access the web interface at http://localhost:8080 to:
- Configure document sources
- Manage users and API keys
- Monitor system health
- View logs and metrics
- Test MCP endpoints
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main() -> None:
    # Launch AnyDocs MCP Server as a subprocess and connect to it over stdio
    server = StdioServerParameters(command="python", args=["main.py"])
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            # Initialize the connection
            await session.initialize()

            # List available tools
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])

            # Search documents
            result = await session.call_tool(
                "search_documents",
                arguments={"query": "authentication", "limit": 10},
            )
            print(result.content)


asyncio.run(main())
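A tool call returns a result whose content is a list of items, and in the Python MCP SDK the text items carry a text attribute. The helper below is a small sketch for pulling out just the text. It uses duck typing and a stand-in result object so it runs without the SDK installed; the name extract_text and the sample data are assumptions for illustration.

```python
from types import SimpleNamespace
from typing import Any, List


def extract_text(result: Any) -> List[str]:
    """Collect the text of every text-type content item in a tool result."""
    texts = []
    for item in getattr(result, "content", None) or []:
        if getattr(item, "type", None) == "text":
            texts.append(item.text)
    return texts


# Stand-in object shaped like a tool result, for demonstration only.
fake_result = SimpleNamespace(
    content=[
        SimpleNamespace(type="text", text="Authentication overview"),
        SimpleNamespace(type="image", data="...", mimeType="image/png"),
    ]
)
print(extract_text(fake_result))
```

With a real session, the same helper could be applied to the value returned by session.call_tool.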
Platform | Status | Features |
---|---|---|
Any Website | ✅ | Universal scraper for any documentation site |
GitBook | ✅ | Full API integration, real-time sync |
Notion | ✅ | Database and page content, webhooks |
Confluence | ✅ | Space and page management, attachments |
GitHub | ✅ | Repository documentation, wikis |
GitLab | ✅ | Project documentation, wikis |
SharePoint | ✅ | Document libraries, lists |
Slack | ✅ | Channel messages, knowledge base |
File System | ✅ | Local markdown files, watch mode |
Custom | 🔧 | Extensible adapter framework |
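from typing import List

from anydocs_mcp.adapters.base import BaseDocumentAdapter

# Document is the package's document model; import it from the module
# where your installed version of anydocs_mcp defines it.


class CustomAdapter(BaseDocumentAdapter):
    """Custom documentation adapter implementation."""

    async def fetch_documents(self) -> List["Document"]:
        """Fetch documents from your platform."""
        raise NotImplementedError  # Implementation here

    async def get_document_content(self, doc_id: str) -> str:
        """Get specific document content."""
        raise NotImplementedError  # Implementation here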
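As a worked illustration of the two adapter methods, the sketch below serves documents from an in-memory dictionary. To keep it runnable on its own, Document and BaseDocumentAdapter are simplified stand-ins defined locally; the real classes in anydocs_mcp may have different fields, constructors, and additional abstract methods.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Document:
    """Stand-in document model (hypothetical fields)."""
    id: str
    title: str


class BaseDocumentAdapter(ABC):
    """Stand-in for anydocs_mcp.adapters.base.BaseDocumentAdapter."""

    @abstractmethod
    async def fetch_documents(self) -> List[Document]: ...

    @abstractmethod
    async def get_document_content(self, doc_id: str) -> str: ...


class InMemoryAdapter(BaseDocumentAdapter):
    """Serves documents from a dict mapping doc id to (title, content)."""

    def __init__(self, pages: Dict[str, Tuple[str, str]]) -> None:
        self._pages = pages

    async def fetch_documents(self) -> List[Document]:
        # Return metadata only; content is fetched per document on demand.
        return [Document(id=doc_id, title=title)
                for doc_id, (title, _content) in self._pages.items()]

    async def get_document_content(self, doc_id: str) -> str:
        if doc_id not in self._pages:
            raise KeyError(f"unknown document id: {doc_id}")
        return self._pages[doc_id][1]


async def demo() -> None:
    adapter = InMemoryAdapter({"intro": ("Introduction", "# Welcome")})
    docs = await adapter.fetch_documents()
    print(docs[0].title, await adapter.get_document_content("intro"))


asyncio.run(demo())
```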
Python-Jose Import Error: If you encounter a No module named 'jose' error:
# Always use uv run to ensure correct virtual environment
uv run python start.py
# If the issue persists, reinstall python-jose with cryptography extras
uv pip uninstall python-jose
uv pip install "python-jose[cryptography]"
Dependency Check Failures: If you see errors about missing pyyaml or beautifulsoup4:
# The dependency check has been fixed to use correct import names
# Ensure you're using the latest version
git pull origin main
uv pip install -e .
Configuration Attribute Errors: If you see 'AppConfig' object has no attribute 'server_host':
# This is fixed in the current version - ensure you have the latest code
git pull origin main
Virtual Environment Issues: If packages seem installed but imports fail:
# Always use 'uv run' to ensure correct environment
uv run python start.py
# Check if you're in the right environment
which python # Should point to your project's Python
uv pip list # Should show installed packages
Setup.py Conflicts: If you encounter conflicts with multiple setup files:
# The redundant root setup.py has been removed
# Use pyproject.toml for package management
uv pip install -e .
- Always use uv run for executing Python scripts to ensure the correct environment
- Use uv pip install instead of uv install for package installation
- Check the virtual environment with uv pip list if imports fail
- Pull the latest changes if you encounter configuration issues
If ports 8000 or 8080 are occupied:
# Kill processes on required ports (Windows)
netstat -ano | findstr :8000
taskkill /F /PID <PID>
# Or use the built-in option
uv run python start.py --kill-ports
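For a cross-platform check before starting the server, a short Python sketch can test whether something is already listening on the two default ports. The helper name port_in_use is an assumption; it only detects listeners and does not terminate anything.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


for port in (8000, 8080):
    state = "occupied" if port_in_use(port) else "free"
    print(f"port {port}: {state}")
```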
# Server Configuration
ANYDOCS_HOST=localhost
ANYDOCS_PORT=8000
ANYDOCS_WEB_PORT=8080
ANYDOCS_DEBUG=false
# Database
DATABASE_URL=sqlite:///data/anydocs.db
# DATABASE_URL=postgresql://user:pass@localhost/anydocs
# Authentication
JWT_SECRET_KEY=your-secret-key
API_KEY_PREFIX=anydocs_
# Document Adapters
GITBOOK_API_TOKEN=your-gitbook-token
NOTION_API_TOKEN=your-notion-token
CONFLUENCE_API_TOKEN=your-confluence-token
# Cache (Optional)
REDIS_URL=redis://localhost:6379/0
# Monitoring
ENABLE_METRICS=true
LOG_LEVEL=INFO
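The sketch below shows one way such variables could be read into a typed settings object, with the values listed above as fallbacks. It is an assumption about usage, not the project's own loader; ServerSettings and load_settings are hypothetical names.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings; everything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class ServerSettings:
    host: str
    port: int
    web_port: int
    debug: bool
    database_url: str
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> ServerSettings:
    """Read the documented variables, falling back to the documented values."""
    return ServerSettings(
        host=env.get("ANYDOCS_HOST", "localhost"),
        port=int(env.get("ANYDOCS_PORT", "8000")),
        web_port=int(env.get("ANYDOCS_WEB_PORT", "8080")),
        debug=_as_bool(env.get("ANYDOCS_DEBUG", "false")),
        database_url=env.get("DATABASE_URL", "sqlite:///data/anydocs.db"),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )


print(load_settings({"ANYDOCS_PORT": "9000", "ANYDOCS_DEBUG": "true"}))
```

Passing the mapping explicitly, as in the last line, keeps the loader easy to test without touching the real process environment.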
# config.yaml
server:
host: localhost
port: 8000
web_port: 8080
debug: false
database:
url: sqlite:///data/anydocs.db
pool_size: 10
echo: false
auth:
jwt_secret: ${JWT_SECRET_KEY}
token_expire_minutes: 1440
api_key:
prefix: anydocs_
length: 32
adapters:
gitbook:
api_token: ${GITBOOK_API_TOKEN}
base_url: https://api.gitbook.com
rate_limit: 100
notion:
api_token: ${NOTION_API_TOKEN}
version: "2022-06-28"
rate_limit: 3
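The ${JWT_SECRET_KEY} style placeholders suggest that environment variables are substituted into the configuration at load time. How the project performs that substitution is not shown here; the following is a minimal sketch of the idea, applied to an already-parsed configuration dictionary. The function name expand_placeholders is an assumption.

```python
import re
from typing import Any, Mapping

_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_placeholders(value: Any, env: Mapping[str, str]) -> Any:
    """Recursively replace ${NAME} in strings with env[NAME]."""
    if isinstance(value, str):
        def _sub(match: "re.Match[str]") -> str:
            name = match.group(1)
            if name not in env:
                # Fail loudly rather than leave a secret silently empty.
                raise KeyError(f"environment variable {name} is not set")
            return env[name]
        return _PLACEHOLDER.sub(_sub, value)
    if isinstance(value, dict):
        return {key: expand_placeholders(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_placeholders(item, env) for item in value]
    return value  # numbers, booleans, and None pass through unchanged


config = {"auth": {"jwt_secret": "${JWT_SECRET_KEY}", "token_expire_minutes": 1440}}
print(expand_placeholders(config, {"JWT_SECRET_KEY": "s3cret"}))
```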
AnyDocs MCP Server provides the following tools:
- search_documents - Search documents with full-text and semantic search
- get_document - Retrieve a specific document by ID
- list_sources - List all configured document sources
- summarize_content - Summarize document content
- ask_question - Ask questions about document content
- generate_documentation - AI-assisted documentation generation
- translate_content - Multi-language content translation
- extract_insights - Extract insights and analytics from documentation
- suggest_improvements - AI-powered content enhancement suggestions
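To illustrate the full-text side of search_documents, here is a minimal SQLite FTS5 sketch using Python's built-in sqlite3 module. The table name, columns, and sample rows are assumptions for illustration and do not reflect the project's actual schema; it also assumes the local SQLite build includes FTS5.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE docs USING fts5(title, body)")
conn.executemany(
    "INSERT INTO docs (title, body) VALUES (?, ?)",
    [
        ("Authentication", "Use an API key or a JWT token to authenticate requests."),
        ("Rate limits", "Each adapter applies its own rate limit."),
    ],
)


def fts_search(query: str, limit: int = 10) -> list:
    """Return (title, snippet) pairs, best match first (FTS5 rank is bm25-based)."""
    rows = conn.execute(
        "SELECT title, snippet(docs, 1, '[', ']', '...', 8) "
        "FROM docs WHERE docs MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


print(fts_search("authenticate"))
```

The query string is interpreted using FTS5 query syntax, so raw user input containing quotes or operators would need escaping before being passed to MATCH.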
# Install development dependencies
uv pip install -e ".[dev]"
# Setup pre-commit hooks
uv run pre-commit install
# Run tests
uv run pytest
# Run with coverage
uv run pytest --cov=src/anydocs_mcp
# Code formatting
uv run black src/ tests/
uv run isort src/ tests/
# Type checking
uv run mypy src/
# Security checks
uv run bandit -r src/
anydocs-mcp/
├── src/anydocs_mcp/ # Main package
│ ├── adapters/ # Document adapters
│ ├── auth/ # Authentication
│ ├── config/ # Configuration management
│ ├── content/ # Content processing
│ ├── database/ # Database models and operations
│ ├── utils/ # Utilities and helpers
│ ├── web/ # Web interface
│ └── server.py # MCP server implementation
├── tests/ # Test suite
├── docs/ # Documentation
├── scripts/ # Utility scripts
│ └── setup.py # Development environment setup
├── pyproject.toml # Package configuration (modern Python packaging)
├── start.py # Main startup script
└── main.py # MCP server entry point
Note: The project uses pyproject.toml for package configuration, following modern Python packaging standards. The redundant root setup.py has been removed to avoid conflicts.
# All tests
uv run pytest
# Unit tests only
uv run pytest tests/unit/
# Integration tests only
uv run pytest tests/integration/
# With coverage
uv run pytest --cov=src/anydocs_mcp --cov-report=html
# Performance tests
uv run pytest tests/performance/
# Check service health
curl http://localhost:8080/health
# Detailed health check
curl http://localhost:8080/health/detailed
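The same check can be scripted, for example as a readiness probe. The sketch below treats any HTTP 200 response as healthy; the shape of the response body is not documented here, so it is ignored. The helper name is_healthy is an assumption.

```python
import urllib.error
import urllib.request


def is_healthy(url: str = "http://localhost:8080/health", timeout: float = 2.0) -> bool:
    """Return True if the endpoint answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and non-2xx responses.
        return False


if __name__ == "__main__":
    print("healthy" if is_healthy() else "unreachable or unhealthy")
```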
Metrics are available at the /metrics endpoint in Prometheus format:
- Request count and duration
- Database connection pool status
- Document sync statistics
- Error rates and types
- Cache hit/miss ratios
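As a sketch of how the Prometheus text format can be consumed, the snippet below parses a sample payload and derives a cache hit ratio. The metric names in SAMPLE are hypothetical, since the actual names exposed by the server are not listed here, and the parser assumes samples carry no trailing timestamp.

```python
from typing import Dict

SAMPLE = """\
# HELP http_requests_total Total HTTP requests (hypothetical metric name).
# TYPE http_requests_total counter
http_requests_total{method="GET",status="200"} 1027
cache_hits_total 311
cache_misses_total 42
"""


def parse_metrics(text: str) -> Dict[str, float]:
    """Map each sample (name plus labels) to its value, skipping comments."""
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Without a timestamp, the value is the last space-separated field.
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


metrics = parse_metrics(SAMPLE)
hits, misses = metrics["cache_hits_total"], metrics["cache_misses_total"]
print(f"cache hit ratio: {hits / (hits + misses):.2%}")
```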
Structured logging with configurable levels:
# Application logs
tail -f anydocs_mcp.log
# Check logs in real-time
uv run python start.py --debug
# Start development environment
docker-compose -f docker-compose.dev.yml up -d
# View logs
docker-compose -f docker-compose.dev.yml logs -f anydocs-mcp-dev
# Access shell
docker-compose -f docker-compose.dev.yml exec anydocs-mcp-dev bash
# Build and start production environment
docker-compose up -d
# Scale services
docker-compose up -d --scale anydocs-mcp=3
# Update services
docker-compose pull && docker-compose up -d
# Start with monitoring
docker-compose --profile monitoring up -d
# Access services
# Grafana: http://localhost:3001 (admin/admin)
# Prometheus: http://localhost:9090
- API Keys: Simple token-based authentication
- JWT Tokens: Stateless authentication with expiration
- OAuth2: Integration with external providers
- All API endpoints require authentication
- Rate limiting on all endpoints
- Input validation and sanitization
- SQL injection prevention
- CORS configuration
- Security headers
- Audit logging
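For the API key method, the configuration shown earlier specifies a prefix (anydocs_) and a length (32). The sketch below generates a key of that shape and verifies it against a stored hash. It is an illustration under those assumptions, not the project's implementation; the function names and the choice of SHA-256 are hypothetical.

```python
import hashlib
import hmac
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def generate_api_key(prefix: str = "anydocs_", length: int = 32) -> str:
    """Return the prefix followed by `length` random alphanumeric characters."""
    return prefix + "".join(secrets.choice(ALPHABET) for _ in range(length))


def hash_api_key(key: str) -> str:
    """Hash a key for storage so the plaintext never needs to be persisted."""
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def verify_api_key(candidate: str, stored_hash: str) -> bool:
    """Compare in constant time to avoid leaking how much of the hash matched."""
    return hmac.compare_digest(hash_api_key(candidate), stored_hash)


key = generate_api_key()
stored = hash_api_key(key)
print(key[:8] + "...", verify_api_key(key, stored))
```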
# Run security checks (SAST scanning)
uv run bandit -r src/
# Dependency vulnerability scan
uv run safety check
- Async/Await: Non-blocking I/O operations
- Connection Pooling: Efficient database connections
- Caching: Redis-based caching with TTL
- Rate Limiting: Prevent API abuse
- Batch Processing: Efficient bulk operations
- Lazy Loading: On-demand content loading
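Rate limiting is commonly implemented with a token bucket. The sketch below shows the technique in isolation, assuming a per-second limit; whether the project uses this algorithm, and what unit its rate_limit settings use, is not stated here. The clock parameter is injectable so the behaviour can be tested without sleeping.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow up to `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = capacity  # start full so an initial burst is allowed
        self._last = clock()

    def allow(self) -> bool:
        """Consume one token if available; otherwise reject the request."""
        now = self._clock()
        # Refill in proportion to elapsed time, never exceeding capacity.
        self._tokens = min(self.capacity,
                           self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


bucket = TokenBucket(rate=3, capacity=3)
print([bucket.allow() for _ in range(5)])
```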
# Performance testing
uv run pytest tests/performance/
# Memory profiling
uv run python -m memory_profiler start.py
We welcome contributions! Please see our Contributing Guide for details.
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Run the test suite
- Submit a pull request
- Follow PEP 8 style guide
- Use type hints
- Write comprehensive tests
- Document public APIs
- Use meaningful commit messages
This project is licensed under the MIT License - see the LICENSE file for details.
- Model Context Protocol for the MCP specification
- Python MCP SDK for the SDK implementation
- All contributors and maintainers
- 🐛 Issues: GitHub Issues
- 📖 Documentation: Full Documentation
Made with ❤️ by funky1688