pereirrd/mcp-just-seek-knowledge


mcp-just-seek-knowledge

MCP (Model Context Protocol) server that stores and searches AI-generated knowledge about software projects, allowing Cursor to access information about project structures, design patterns, best practices, and technical documentation.


📋 About the Project

Objective

Create an MCP server that stores and searches AI-generated knowledge about software projects.

Technology Stack

  • Language: Python
  • Embedding Framework: LangChain
  • Database: PostgreSQL with pgVector
  • Protocol: MCP (Model Context Protocol) for Cursor integration

Main Features

  1. Ingest: Create new records in the knowledge base
  2. Update: Update existing records in the knowledge base
  3. Search: Semantic search in the database
  4. List Catalog: List all existing service_name values in the database (exposed as an MCP tool)
  5. Delete: Delete records by service_name (available via a CLI script, not exposed as an MCP tool)

🛠️ Environment Setup

Complete Setup Process

1. Clone the project, or navigate to it if you already have a local copy

# Example path; replace it with the location of your clone
cd /home/pereirrd/dev/git/pereirrd/mcp-just-seek-knowledge

2. Create and activate virtual environment

# Create virtual environment
python3 -m venv venv

# Activate virtual environment
# On Linux/WSL:
source venv/bin/activate

# On Windows:
# venv\Scripts\activate

3. Install dependencies

pip install --upgrade pip
pip install -r requirements.txt

4. Configure environment variables

Create a .env file in the project root (copy from .env.example if it exists, or create manually):

# Example .env
PGVECTOR_URL=postgresql://postgres:postgres@localhost:5433/software_design_knowledge
POSTGRES_HOST=localhost
POSTGRES_PORT=5433
POSTGRES_DB=software_design_knowledge
POSTGRES_USER=postgres
POSTGRES_PASSWORD=postgres
OPENAI_API_KEY=your_openai_api_key
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
EMBEDDING_DIMENSION=1536

Note: PostgreSQL variables can also be configured in Cursor's mcp.json (see section below).
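As a rough illustration of how these variables fit together, the sketch below assembles a connection URL from the environment, preferring PGVECTOR_URL when it is set. The function name build_connection_string and the fallback defaults are assumptions made for this example; the project's own get_connection_string() may behave differently.

```python
import os
from urllib.parse import quote


def build_connection_string(env=None):
    """Return a PostgreSQL URL built from environment-style variables.

    Hypothetical helper: prefers PGVECTOR_URL, otherwise combines the
    individual POSTGRES_* values shown in the example .env.
    """
    env = os.environ if env is None else env
    url = env.get("PGVECTOR_URL")
    if url:
        return url
    # Percent-encode credentials so characters such as '@' do not break the URL.
    user = quote(env.get("POSTGRES_USER", "postgres"), safe="")
    password = quote(env.get("POSTGRES_PASSWORD", ""), safe="")
    host = env.get("POSTGRES_HOST", "localhost")
    port = env.get("POSTGRES_PORT", "5433")
    database = env.get("POSTGRES_DB", "software_design_knowledge")
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"
```

Passing a plain dict instead of relying on os.environ makes the helper easy to exercise without touching the real environment.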

5. Start PostgreSQL (if using Docker Compose)

docker-compose up -d

This will create PostgreSQL with pgvector automatically on port 5433.

Important: docker-compose.yml maps PostgreSQL to host port 5433 instead of the default 5432, which avoids a conflict if another PostgreSQL instance is already using 5432.

6. Test the MCP server (optional)

python src/mcp_server.py

The server should start without errors and automatically create the software_design_knowledge table if it doesn't exist.

Verify Installation

To verify that the dependencies were installed correctly:

pip list | grep -E "langchain|psycopg|openai|python-dotenv"

Or test imports directly:

python -c "from src.database.connection import get_connection_string; from src.mcp.mcp_server import MCPServer; print('✅ Dependencies installed correctly!')"

⚙️ Cursor Configuration

To add this MCP server to Cursor, configure the ~/.cursor/mcp.json file (global configuration) or .cursor/mcp.json in the project root (local configuration).

Example configuration (~/.cursor/mcp.json):

{
  "mcpServers": {
    "mcp-just-seek-knowledge": {
      "command": "python",
      "args": ["/absolute/path/to/project/src/mcp_server.py"],
      "env": {
        "OPENAI_API_KEY": "your_openai_api_key",
        "OPENAI_EMBEDDING_MODEL": "text-embedding-3-small",
        "EMBEDDING_DIMENSION": "1536"
      }
    }
  }
}

Important:

  • Use absolute paths in the args field
  • Configure all necessary environment variables
  • Cursor loads this file automatically on startup
  • After adding, restart Cursor to load the MCP server

Note about Cursor

When the MCP server is configured in Cursor (~/.cursor/mcp.json) with "command": "python", Cursor runs whichever python comes first on its PATH, which is typically the system Python. Recommendations:

Option 1: Use global Python (install dependencies globally)

If you prefer to use the system's global Python:

pip install -r requirements.txt

And configure mcp.json with:

{
  "mcpServers": {
    "mcp-just-seek-knowledge": {
      "command": "python",
      "args": ["/absolute/path/to/project/src/mcp_server.py"],
      "env": {
        "OPENAI_EMBEDDING_MODEL": "text-embedding-3-small",
        "EMBEDDING_DIMENSION": "1536"
      }
    }
  }
}

Option 2: Use virtual environment Python (recommended)

To use the project's virtual environment, specify the full path to the venv Python in mcp.json:

{
  "mcpServers": {
    "mcp-just-seek-knowledge": {
      "command": "/absolute/path/to/mcp-just-seek-knowledge/venv/bin/python",
      "args": ["/absolute/path/to/mcp-just-seek-knowledge/src/mcp_server.py"],
      "env": {
        "OPENAI_EMBEDDING_MODEL": "text-embedding-3-small",
        "EMBEDDING_DIMENSION": "1536"
      }
    }
  }
}

Advantages of Option 2:

  • Isolates project dependencies
  • Avoids conflicts with other Python projects
  • Facilitates version management

Note: The project's .env file will be automatically loaded by the MCP server, so you don't need to repeat PostgreSQL variables in mcp.json (unless you prefer).
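To avoid typos in the absolute paths, the Option 2 entry can also be generated. The following is a minimal sketch, assuming the Linux/WSL venv layout (venv/bin/python) used above; the function names build_mcp_entry and render_mcp_json are hypothetical and not part of the project.

```python
import json
from pathlib import PurePosixPath


def build_mcp_entry(project_dir, embedding_model="text-embedding-3-small", dimension=1536):
    """Build the mcpServers entry for the venv-based setup (Option 2)."""
    project = PurePosixPath(project_dir)
    if not project.is_absolute():
        raise ValueError("project_dir must be an absolute path")
    return {
        "mcpServers": {
            "mcp-just-seek-knowledge": {
                "command": str(project / "venv" / "bin" / "python"),
                "args": [str(project / "src" / "mcp_server.py")],
                "env": {
                    "OPENAI_EMBEDDING_MODEL": embedding_model,
                    # The examples above use string values, so stringify the dimension.
                    "EMBEDDING_DIMENSION": str(dimension),
                },
            }
        }
    }


def render_mcp_json(project_dir):
    """Return the entry as pretty-printed JSON, ready to paste into mcp.json."""
    return json.dumps(build_mcp_entry(project_dir), indent=2)
```

Calling print(render_mcp_json("/absolute/path/to/mcp-just-seek-knowledge")) produces a block equivalent to the Option 2 example; a relative path is rejected with a ValueError.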


🚀 Implementation

Preparation and Structure

Directory Structure

Created src/ structure with organized subdirectories:

  • src/database/ - Database management
  • src/embeddings/ - Embedding services
  • src/services/ - Business services (ingest, update, search)
  • src/mcp/ - MCP server and handlers

__init__.py files created in all Python packages.

Dependency Configuration

requirements.txt file created with all necessary dependencies:

  • LangChain Framework: langchain, langchain-community, langchain-core, langchain-openai, langchain-postgres
  • PostgreSQL: psycopg, pgvector
  • OpenAI: openai
  • Utilities: python-dotenv

Environment Variables

.env.example file created with all necessary variables:

  • PGVECTOR_URL - PostgreSQL connection URL
  • POSTGRES_DB, POSTGRES_USER, POSTGRES_PASSWORD
  • OPENAI_API_KEY, OPENAI_EMBEDDING_MODEL
  • EMBEDDING_DIMENSION

.gitignore file configured to exclude .env, Python-generated files, and IDE files.

Docker and PostgreSQL

docker-compose.yml file created with:

  • PostgreSQL service using pgvector/pgvector:pg16 image
  • Volume configuration for persistence
  • Healthcheck configured
  • Ports and environment variables configured

Initialization script init-scripts/01-init-pgvector.sh to automatically create the pgvector extension.


Database Configuration

Database Schema (src/database/schema.py)

Structure of software_design_knowledge table (software project knowledge):

  • id - Unique identifier (SERIAL PRIMARY KEY)
  • service_name - Service name (VARCHAR(255) NOT NULL UNIQUE)
  • content - Knowledge content (TEXT NOT NULL)
  • embedding - Embedding vector (vector(1536) NOT NULL)
  • metadata - Additional metadata (JSONB)
  • created_at - Creation date (TIMESTAMP DEFAULT CURRENT_TIMESTAMP)
  • updated_at - Update date (TIMESTAMP DEFAULT CURRENT_TIMESTAMP)

Indexes:

  • IVFFlat index for optimized vector search
  • Index on service_name for lookups by service

Triggers:

  • Automatic trigger to update updated_at on updates
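The schema described above could be expressed roughly as follows. This is a sketch, not the contents of src/database/schema.py: the column definitions follow the list above, while the index names, the trigger and function names, the vector_cosine_ops operator class, and lists = 100 are assumptions chosen for illustration.

```python
# Hypothetical DDL mirroring the documented columns, indexes, and trigger.
EMBEDDING_DIMENSION = 1536

SCHEMA_STATEMENTS = [
    "CREATE EXTENSION IF NOT EXISTS vector",
    f"""
    CREATE TABLE IF NOT EXISTS software_design_knowledge (
        id SERIAL PRIMARY KEY,
        service_name VARCHAR(255) NOT NULL UNIQUE,
        content TEXT NOT NULL,
        embedding vector({EMBEDDING_DIMENSION}) NOT NULL,
        metadata JSONB,
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
        updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
    """,
    # IVFFlat index for approximate nearest-neighbour search by cosine distance.
    """
    CREATE INDEX IF NOT EXISTS idx_sdk_embedding
    ON software_design_knowledge
    USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100)
    """,
    # Plain B-tree index for lookups by service_name.
    """
    CREATE INDEX IF NOT EXISTS idx_sdk_service_name
    ON software_design_knowledge (service_name)
    """,
    # Trigger function that refreshes updated_at on every UPDATE.
    """
    CREATE OR REPLACE FUNCTION set_updated_at() RETURNS trigger AS $$
    BEGIN
        NEW.updated_at = CURRENT_TIMESTAMP;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql
    """,
    """
    CREATE OR REPLACE TRIGGER trg_sdk_updated_at
    BEFORE UPDATE ON software_design_knowledge
    FOR EACH ROW EXECUTE FUNCTION set_updated_at()
    """,
]
```

Each statement can be passed to a psycopg cursor's execute() in order. CREATE OR REPLACE TRIGGER requires PostgreSQL 14 or later, which the pg16 image satisfies.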

Connection Management (src/database/connection.py)

Implemented functions:

  • get_connection_string() - Gets connection string from environment variables
  • create_connection() - Creates PostgreSQL connections
  • schema_exists() - Checks if table exists
  • create_schema() - Creates complete schema (table, indexes, triggers)
  • initialize_database() - Initializes the database

Error handling and logging implemented.

Data Repository (src/database/repository.py)

KnowledgeRepository class implemented using psycopg directly.

Implemented methods:

  • insert() - Insert document into database
  • update() - Update document by service_name
  • upsert() - Insert or update (upsert behavior)
  • delete() - Delete document by service_name
  • get_by_service_name() - Fetch a document by service_name
  • similarity_search() - Semantic search using pgvector's cosine distance operator (<=>)

Features:

  • Support for optional filters (similarity threshold, service_name filter)
  • Integration with JSONB metadata structure
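A similarity search with the optional filters might build its query along these lines. The sketch assumes psycopg-style named placeholders and defines similarity as 1 minus the cosine distance returned by <=>; build_similarity_query is a hypothetical helper, and the repository's actual SQL may differ.

```python
def build_similarity_query(embedding, k=5, threshold=None, service_name=None):
    """Return (sql, params) for a cosine-similarity search over the table.

    The embedding is sent as a pgvector text literal such as "[0.1,0.2]"
    and cast with ::vector inside the query.
    """
    params = {
        "embedding": "[" + ",".join(str(float(x)) for x in embedding) + "]",
        "k": int(k),
    }
    conditions = []
    if threshold is not None:
        conditions.append("1 - (embedding <=> %(embedding)s::vector) >= %(threshold)s")
        params["threshold"] = float(threshold)
    if service_name is not None:
        conditions.append("service_name = %(service_name)s")
        params["service_name"] = service_name
    where = f"WHERE {' AND '.join(conditions)} " if conditions else ""
    sql = (
        "SELECT service_name, content, metadata, "
        "1 - (embedding <=> %(embedding)s::vector) AS similarity "
        "FROM software_design_knowledge "
        f"{where}"
        # Smallest cosine distance first, which means most similar first.
        "ORDER BY embedding <=> %(embedding)s::vector "
        "LIMIT %(k)s"
    )
    return sql, params
```

The returned pair can be handed to cursor.execute(sql, params). Keeping values in params rather than interpolating them into the string leaves quoting to the driver.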

Embedding Services

EmbeddingService class (src/embeddings/embedding_service.py) using OpenAIEmbeddings from LangChain.

Features:

  • Single and batch embedding creation
  • Configuration via environment variables (default model: text-embedding-3-small)
  • Error handling and logging
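A wrapper of this kind can be quite small. The sketch below is an assumption about its shape rather than the project's code: it relies on the embed_query and embed_documents methods that LangChain embedding classes provide, and it accepts an injected backend so it can be tried without an API key. The method names embed_text and embed_batch are hypothetical.

```python
import os


class EmbeddingService:
    """Thin wrapper around a LangChain-style embeddings backend."""

    def __init__(self, backend=None):
        if backend is None:
            # Imported lazily so the sketch also works with a stub backend.
            from langchain_openai import OpenAIEmbeddings

            backend = OpenAIEmbeddings(
                model=os.getenv("OPENAI_EMBEDDING_MODEL", "text-embedding-3-small")
            )
        self._backend = backend

    def embed_text(self, text):
        """Create the embedding for a single piece of text."""
        if not text or not text.strip():
            raise ValueError("text must not be empty")
        return self._backend.embed_query(text)

    def embed_batch(self, texts):
        """Create embeddings for several texts in one backend call."""
        texts = list(texts)
        if not texts:
            return []
        return self._backend.embed_documents(texts)
```

Injecting the backend keeps the OpenAI dependency out of unit tests: any object exposing the same two methods can stand in for it.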

Business Services

Four main services implemented:

Ingest Service (src/services/ingest_service.py)

  • Adds new knowledge to the database
  • Validates service_name and content
  • Automatically creates embedding
  • Complete error handling
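The ingest flow can be sketched as validate, embed, then store. In the example below, embed is any callable that turns text into a vector and repository is a plain dict standing in for the database; both are stand-ins for illustration, and the shape of the returned dict is an assumption about what a structured return might look like.

```python
def ingest(service_name, content, embed, repository, metadata=None):
    """Validate input, embed the content, and store a new record."""
    if not isinstance(service_name, str) or not service_name.strip():
        return {"success": False, "error": "service_name must not be empty"}
    if not isinstance(content, str) or not content.strip():
        return {"success": False, "error": "content must not be empty"}
    name = service_name.strip()
    # service_name is UNIQUE in the schema, so ingest rejects duplicates.
    if name in repository:
        return {"success": False, "error": f"service_name '{name}' already exists"}
    try:
        embedding = embed(content)
    except Exception as exc:  # surface backend failures as a structured error
        return {"success": False, "error": f"embedding failed: {exc}"}
    repository[name] = {
        "content": content,
        "embedding": embedding,
        "metadata": metadata or {},
    }
    return {"success": True, "service_name": name}
```

Returning a dict instead of raising keeps every outcome in one predictable shape, which is convenient when the result is relayed back through an MCP tool call.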

Update Service (src/services/update_service.py)

  • Updates existing knowledge (upsert behavior)
  • If service_name doesn't exist, creates new record
  • If exists, updates existing record
  • Automatically updates embedding
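In PostgreSQL, this kind of upsert is commonly written with INSERT ... ON CONFLICT, which works here because service_name carries a UNIQUE constraint. The statement and the helper build_upsert_params below are a sketch under that assumption; the repository's upsert() may be implemented differently.

```python
import json

# Insert a new row, or overwrite the existing row with the same service_name.
UPSERT_SQL = """
INSERT INTO software_design_knowledge (service_name, content, embedding, metadata)
VALUES (%(service_name)s, %(content)s, %(embedding)s::vector, %(metadata)s::jsonb)
ON CONFLICT (service_name) DO UPDATE
SET content = EXCLUDED.content,
    embedding = EXCLUDED.embedding,
    metadata = EXCLUDED.metadata
RETURNING id
"""


def build_upsert_params(service_name, content, embedding, metadata=None):
    """Prepare psycopg-style named parameters for UPSERT_SQL."""
    return {
        "service_name": service_name,
        "content": content,
        # pgvector accepts a text literal such as "[0.5,1.0]" cast to vector.
        "embedding": "[" + ",".join(str(float(x)) for x in embedding) + "]",
        "metadata": json.dumps(metadata or {}),
    }
```

With this approach the insert-or-update decision is made atomically by the database, so no separate existence check is needed before writing.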

Search Service (src/services/search_service.py)

  • Semantic search by similarity
  • Optional parameters: k (number of results), threshold (minimum similarity), service_name (filter)
  • Returns results ordered by relevance
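To make the k and threshold parameters concrete, the following pure-Python sketch ranks candidates by cosine similarity, the same measure that pgvector's <=> operator expresses as a distance. It only illustrates the semantics; the names cosine_similarity and rank are hypothetical, and the real search runs inside PostgreSQL.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (1.0 means same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


def rank(query, candidates, k=5, threshold=None):
    """Return up to k (service_name, similarity) pairs, most similar first.

    candidates maps service_name to its embedding; threshold, when given,
    drops results whose similarity falls below it.
    """
    scored = [(name, cosine_similarity(query, emb)) for name, emb in candidates.items()]
    if threshold is not None:
        scored = [item for item in scored if item[1] >= threshold]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]
```

A higher threshold trades recall for precision, while k only caps how many of the surviving results are returned.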

List Catalog Service (src/services/list_catalog_service.py)

  • Lists all existing service_name values in the database
  • Does not use embeddings (repository only)

Common features:

  • Integration with EmbeddingService and KnowledgeRepository
  • Input validation
  • Error handling
  • Detailed logging
  • Structured returns

🗑️ CLI Scripts

Record Deletion

The project includes a CLI script for record deletion that is not exposed as an MCP tool. This functionality is only available via command line for administrative operations.

Script: src/database/delete_service.py

Functionality:

  • Deletes a record from the knowledge base by service_name
  • Validates record existence before deletion
  • Provides clear feedback on operation result

Usage:

python src/database/delete_service.py <service_name>

Examples:

# Delete a specific service
python src/database/delete_service.py user-service

# The script returns:
# - ✓ "Record deleted successfully" if the record was found and removed
# - ✗ "Record not found" if the service_name doesn't exist
# - ✗ "Error deleting record" in case of operation failure

Features:

  • Parameter validation (service_name cannot be empty)
  • Error handling with detailed logging
  • Appropriate exit codes (0 for success, 1 for failure)
  • Clear feedback messages for the user
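The behavior described above (argument validation, feedback messages, and exit codes) can be sketched with argparse. Here delete_record is a hypothetical callable that returns True when a row was removed and False when the service_name does not exist; the real script talks to the repository directly, and its internals may differ.

```python
import argparse
import sys


def run(argv, delete_record):
    """Parse arguments, call delete_record(service_name), return an exit code."""
    parser = argparse.ArgumentParser(
        description="Delete a knowledge record by service_name"
    )
    parser.add_argument("service_name")
    args = parser.parse_args(argv)

    name = args.service_name.strip()
    if not name:
        print("✗ service_name cannot be empty", file=sys.stderr)
        return 1
    try:
        deleted = delete_record(name)
    except Exception as exc:  # report any failure and signal it via the exit code
        print(f"✗ Error deleting record: {exc}", file=sys.stderr)
        return 1
    if deleted:
        print("✓ Record deleted successfully")
        return 0
    print("✗ Record not found")
    return 1
```

A script entry point would then call sys.exit(run(sys.argv[1:], delete_record)), so shell callers can branch on the exit status.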

Note: This functionality is not available as an MCP tool for security and access control reasons. Use only for necessary administrative operations.


📚 pgvector Initialization Script

The init-scripts/01-init-pgvector.sh script is automatically used by PostgreSQL during container initialization.

How it works

1. Volume mapped in docker-compose.yml

The local init-scripts/ directory is mapped to /docker-entrypoint-initdb.d inside the container through volume configuration in docker-compose.yml.

2. PostgreSQL automatic behavior

The official PostgreSQL image (on which pgvector/pgvector is based) automatically runs the files in /docker-entrypoint-initdb.d:

  • Only when the database is initialized for the first time (when the data volume is empty)
  • In alphabetical order (hence the 01- prefix)
  • For .sql, .sql.gz, and .sh files

3. What the script does

The 01-init-pgvector.sh script:

  • Executes CREATE EXTENSION IF NOT EXISTS vector; to create the pgvector extension
  • Lists installed extensions for verification
  • Uses set -e to stop on error

Important

  • Scripts in init-scripts/ are only executed on first initialization (when volume is empty)
  • If the container has been started before, the script will not be executed again
  • To re-execute, it's necessary to remove the volume: docker-compose down -v

⌨️ Cursor Commands (Slash Commands)

This repository includes custom Cursor commands in .cursor/commands/, which help create, update, and list the knowledge base in the MCP mcp-just-seek-knowledge.

Available commands

  • /criar_base_conhecimento: analyzes the entire open workspace (all projects/directories), reads documentation (including Swagger/OpenAPI), and creates a single record for the workspace using mcp-just-seek-knowledge.ingest.
  • /atualizar_base_conhecimento: performs the same analysis as the previous command, but updates (upserts) the workspace record using mcp-just-seek-knowledge.update.
  • /listar_base_conhecimento: lists existing service_name values via mcp-just-seek-knowledge.list_catalog and presents a friendly layout with count, service_name, and metadata (enriched via mcp-just-seek-knowledge.search).

How to use

  1. Ensure the MCP mcp-just-seek-knowledge is configured in Cursor (~/.cursor/mcp.json or .cursor/mcp.json).
  2. Open the project(s) in the Cursor workspace.
  3. In Cursor chat, execute a command by typing:
    • /criar_base_conhecimento
    • /atualizar_base_conhecimento
    • /listar_base_conhecimento
