MCP (Model Context Protocol) server that stores and searches AI-generated knowledge about software projects, allowing Cursor to access information about project structures, design patterns, best practices, and technical documentation.
Create an MCP server that stores and searches AI-generated knowledge about software projects.
- Language: Python
- Embedding Framework: LangChain
- Database: PostgreSQL with pgVector
- Protocol: MCP (Model Context Protocol) for Cursor integration
- Ingest: Create new records in the knowledge base
- Update: Update existing records in the knowledge base
- Search: Semantic search in the database
- List Catalog: List all existing service_name values in the database (exposed as MCP tool)
- Delete: Delete records by service_name (available via CLI script, not exposed as MCP tool)
cd /home/pereirrd/dev/git/pereirrd/mcp-just-seek-knowledge
# Create virtual environment
python3 -m venv venv
# Activate virtual environment
# On Linux/WSL:
source venv/bin/activate
# On Windows:
# venv\Scripts\activate
pip install --upgrade pip
pip install -r requirements.txt
Create a .env file in the project root (copy from .env.example if it exists, or create manually):
# Example .env
PGVECTOR_URL=postgresql://postgres:postgres@localhost:5433/software_design_knowledge
POSTGRES_HOST=localhost
POSTGRES_PORT=5433
POSTGRES_DB=software_design_knowledge
POSTGRES_USER=postgres
POSTGRES_PASSWORD=postgres
OPENAI_API_KEY=your_openai_api_key
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
EMBEDDING_DIMENSION=1536
Note: PostgreSQL variables can also be configured in Cursor's mcp.json (see section below).
docker-compose up -d
This will create PostgreSQL with pgvector automatically on port 5433.
Important: docker-compose.yml maps PostgreSQL to host port 5433, so it does not conflict with another PostgreSQL instance already using the default port 5432.
python src/mcp_server.py
The server should start without errors and automatically create the software_design_knowledge table if it doesn't exist.
To verify that the dependencies were installed correctly:
pip list | grep -E "langchain|psycopg|openai|python-dotenv"
Or test imports directly:
python -c "from src.database.connection import get_connection_string; from src.mcp.mcp_server import MCPServer; print('Dependencies installed correctly!')"
To add this MCP server to Cursor, configure the ~/.cursor/mcp.json file (global configuration) or .cursor/mcp.json in the project root (local configuration).
{
"mcpServers": {
"mcp-just-seek-knowledge": {
"command": "python",
"args": ["/absolute/path/to/project/src/mcp_server.py"],
"env": {
"OPENAI_API_KEY": "your_openai_api_key",
"OPENAI_EMBEDDING_MODEL": "text-embedding-3-small",
"EMBEDDING_DIMENSION": "1536"
}
}
}
}
Important:
- Use absolute paths in the args field
- Configure all necessary environment variables
- Cursor loads this file automatically on startup
- After adding, restart Cursor to load the MCP server
When configuring MCP in Cursor (~/.cursor/mcp.json), Cursor will use the system Python or the one active in PATH. Recommendations:
Option 1: If you prefer to use the system's global Python, install the dependencies globally:
pip install -r requirements.txt
And configure mcp.json with:
{
"mcpServers": {
"mcp-just-seek-knowledge": {
"command": "python",
"args": ["/absolute/path/to/project/src/mcp_server.py"],
"env": {
"OPENAI_EMBEDDING_MODEL": "text-embedding-3-small",
"EMBEDDING_DIMENSION": "1536"
}
}
}
}
Option 2: To use the project's virtual environment, specify the full path to the venv Python in mcp.json:
{
"mcpServers": {
"mcp-just-seek-knowledge": {
"command": "/absolute/path/to/mcp-just-seek-knowledge/venv/bin/python",
"args": ["/absolute/path/to/mcp-just-seek-knowledge/src/mcp_server.py"],
"env": {
"OPENAI_EMBEDDING_MODEL": "text-embedding-3-small",
"EMBEDDING_DIMENSION": "1536"
}
}
}
}
Advantages of Option 2:
- Isolates project dependencies
- Avoids conflicts with other Python projects
- Facilitates version management
Note: The MCP server loads the project's .env file automatically, so you don't need to repeat the PostgreSQL variables in mcp.json unless you prefer to.
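The two options differ only in the command field, so the entry can be generated rather than hand-edited. The following is a minimal sketch, not part of the project: build_mcp_entry is a hypothetical helper, and it assumes the venv lives in venv/ under the project root with a Linux-style bin/python layout.

```python
import json
from pathlib import Path


def build_mcp_entry(project_root: str, use_venv: bool = True) -> dict:
    """Build an mcpServers entry with absolute paths (hypothetical helper)."""
    root = Path(project_root).expanduser().resolve()
    # Option 2 points at the venv interpreter; Option 1 relies on PATH.
    command = str(root / "venv" / "bin" / "python") if use_venv else "python"
    return {
        "mcpServers": {
            "mcp-just-seek-knowledge": {
                "command": command,
                "args": [str(root / "src" / "mcp_server.py")],
                "env": {
                    "OPENAI_EMBEDDING_MODEL": "text-embedding-3-small",
                    "EMBEDDING_DIMENSION": "1536",
                },
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON so it can be pasted into ~/.cursor/mcp.json.
    print(json.dumps(build_mcp_entry("."), indent=2))
```

Running the script from the project root prints an entry whose paths are already absolute, which avoids the most common misconfiguration listed above.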
Created src/ structure with organized subdirectories:
- src/database/ - Database management
- src/embeddings/ - Embedding services
- src/services/ - Business services (ingest, update, search)
- src/mcp/ - MCP server and handlers
__init__.py files created in all Python packages.
requirements.txt file created with all necessary dependencies:
- LangChain Framework: langchain, langchain-community, langchain-core, langchain-openai, langchain-postgres
- PostgreSQL: psycopg, pgvector
- OpenAI: openai
- Utilities: python-dotenv
.env.example file created with all necessary variables:
- PGVECTOR_URL - PostgreSQL connection URL
- POSTGRES_DB, POSTGRES_USER, POSTGRES_PASSWORD
- OPENAI_API_KEY, OPENAI_EMBEDDING_MODEL
- EMBEDDING_DIMENSION
.gitignore file configured to exclude .env and Python and IDE files.
docker-compose.yml file created with:
- PostgreSQL service using the pgvector/pgvector:pg16 image
- Volume configuration for persistence
- Healthcheck configured
- Ports and environment variables configured
Initialization script init-scripts/01-init-pgvector.sh to automatically create the pgvector extension.
Structure of software_design_knowledge table (software project knowledge):
- id - Unique identifier (SERIAL PRIMARY KEY)
- service_name - Service name (VARCHAR(255) NOT NULL UNIQUE)
- content - Knowledge content (TEXT NOT NULL)
- embedding - Embedding vector (vector(1536) NOT NULL)
- metadata - Additional metadata (JSONB)
- created_at - Creation date (TIMESTAMP DEFAULT CURRENT_TIMESTAMP)
- updated_at - Update date (TIMESTAMP DEFAULT CURRENT_TIMESTAMP)
Indexes:
- IVFFlat index for optimized vector search
- Index on service_name for service lookups
Triggers:
- Automatic trigger to update updated_at on updates
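The table, indexes, and trigger described above could be created with statements along these lines. This is a sketch reconstructed from the column list, not the project's actual DDL: the index, function, and trigger names, the cosine operator class, and lists = 100 are all assumptions.

```python
# Hypothetical DDL: object names, vector_cosine_ops and lists = 100 are assumptions.
SCHEMA_STATEMENTS = [
    "CREATE EXTENSION IF NOT EXISTS vector",
    """
    CREATE TABLE IF NOT EXISTS software_design_knowledge (
        id SERIAL PRIMARY KEY,
        service_name VARCHAR(255) NOT NULL UNIQUE,
        content TEXT NOT NULL,
        embedding vector(1536) NOT NULL,
        metadata JSONB,
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
        updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
    """,
    # IVFFlat index for approximate nearest-neighbour search on the embedding.
    """
    CREATE INDEX IF NOT EXISTS idx_sdk_embedding
    ON software_design_knowledge
    USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100)
    """,
    """
    CREATE INDEX IF NOT EXISTS idx_sdk_service_name
    ON software_design_knowledge (service_name)
    """,
    # Trigger function that refreshes updated_at on every UPDATE.
    """
    CREATE OR REPLACE FUNCTION set_updated_at() RETURNS TRIGGER AS $$
    BEGIN
        NEW.updated_at = CURRENT_TIMESTAMP;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql
    """,
    "DROP TRIGGER IF EXISTS trg_sdk_updated_at ON software_design_knowledge",
    """
    CREATE TRIGGER trg_sdk_updated_at
    BEFORE UPDATE ON software_design_knowledge
    FOR EACH ROW EXECUTE FUNCTION set_updated_at()
    """,
]


def create_schema(conn) -> None:
    """Run every statement on an open psycopg connection, then commit."""
    with conn.cursor() as cur:
        for statement in SCHEMA_STATEMENTS:
            cur.execute(statement)
    conn.commit()
```

Every statement is idempotent, so create_schema can run on each server start without failing when the table already exists.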
Implemented functions:
- get_connection_string() - Gets connection string from environment variables
- create_connection() - Creates PostgreSQL connections
- schema_exists() - Checks if table exists
- create_schema() - Creates complete schema (table, indexes, triggers)
- initialize_database() - Initializes the database
Error handling and logging implemented.
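As an illustration of how get_connection_string() might resolve the variables from the .env example, here is a minimal sketch. The precedence (PGVECTOR_URL first, then the individual POSTGRES_* variables), the fallback defaults, and the optional env parameter are assumptions, not the project's confirmed behavior.

```python
import os
from urllib.parse import quote


def get_connection_string(env=None) -> str:
    """Return a PostgreSQL URL built from environment variables (sketch)."""
    env = os.environ if env is None else env
    # Assumption: a full URL, when present, wins over the individual parts.
    url = env.get("PGVECTOR_URL")
    if url:
        return url
    # Percent-encode credentials so characters like '@' do not break the URL.
    user = quote(env.get("POSTGRES_USER", "postgres"), safe="")
    password = quote(env.get("POSTGRES_PASSWORD", ""), safe="")
    host = env.get("POSTGRES_HOST", "localhost")
    port = env.get("POSTGRES_PORT", "5433")
    database = env.get("POSTGRES_DB", "software_design_knowledge")
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"
```

Passing a plain dict as env makes the function easy to exercise without touching the real environment.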
KnowledgeRepository class implemented using psycopg directly.
Implemented methods:
- insert() - Insert document into database
- update() - Update document by service_name
- upsert() - Insert or update (upsert behavior)
- delete() - Delete document by service_name
- get_by_service_name() - Search document by service_name
- similarity_search() - Semantic search using pgVector (<=> operator)
Features:
- Support for optional filters (similarity threshold, service_name filter)
- Integration with JSONB metadata structure
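To make the optional filters concrete, the query behind similarity_search() could be assembled roughly as follows. This is a sketch under assumptions: build_similarity_query is a hypothetical helper, similarity is taken as 1 minus the cosine distance returned by <=>, and the embedding is passed as a pgvector text literal cast with ::vector.

```python
def build_similarity_query(embedding, k=5, threshold=None, service_name=None):
    """Return (sql, params) for a pgvector cosine-similarity search (sketch)."""
    # pgvector accepts a text literal such as '[0.1,0.2]' cast to vector.
    vector_literal = "[" + ",".join(str(float(x)) for x in embedding) + "]"
    conditions, where_params = [], []
    if service_name is not None:
        conditions.append("service_name = %s")
        where_params.append(service_name)
    if threshold is not None:
        # <=> is cosine distance, so similarity = 1 - distance.
        conditions.append("1 - (embedding <=> %s::vector) >= %s")
        where_params.extend([vector_literal, threshold])
    where = f"WHERE {' AND '.join(conditions)} " if conditions else ""
    sql = (
        "SELECT service_name, content, metadata, "
        "1 - (embedding <=> %s::vector) AS similarity "
        "FROM software_design_knowledge "
        f"{where}"
        "ORDER BY embedding <=> %s::vector "
        "LIMIT %s"
    )
    # Parameter order follows placeholder order: SELECT, WHERE, ORDER BY, LIMIT.
    params = [vector_literal, *where_params, vector_literal, k]
    return sql, params
```

The returned pair can be handed to a psycopg cursor as cur.execute(sql, params); ordering by the raw distance rather than the computed similarity keeps the ORDER BY in a form the vector index can serve.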
EmbeddingService class (src/embeddings/embedding_service.py) using OpenAIEmbeddings from LangChain.
Features:
- Single and batch embedding creation
- Configuration via environment variables (default model: text-embedding-3-small)
- Error handling and logging
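A wrapper with these features might look like the sketch below. It is not the project's EmbeddingService: the class name, method names, and the dimension check are assumptions. Only OpenAIEmbeddings with its embed_query and embed_documents methods comes from LangChain, and it is imported lazily so the wrapper can be exercised with a stand-in client.

```python
import os


class EmbeddingServiceSketch:
    """Thin wrapper around any client exposing embed_query / embed_documents."""

    def __init__(self, client, dimension: int = 1536):
        self.client = client
        self.dimension = dimension

    @classmethod
    def from_env(cls):
        # Imported lazily so the class itself works without the package installed.
        from langchain_openai import OpenAIEmbeddings

        model = os.getenv("OPENAI_EMBEDDING_MODEL", "text-embedding-3-small")
        dimension = int(os.getenv("EMBEDDING_DIMENSION", "1536"))
        # OpenAIEmbeddings reads OPENAI_API_KEY from the environment.
        return cls(OpenAIEmbeddings(model=model), dimension)

    def embed_one(self, text: str) -> list:
        """Embed a single text and verify the vector size."""
        if not text or not text.strip():
            raise ValueError("text must not be empty")
        vector = self.client.embed_query(text)
        self._check(vector)
        return vector

    def embed_many(self, texts) -> list:
        """Embed several texts in one call and verify each vector size."""
        vectors = self.client.embed_documents(list(texts))
        for vector in vectors:
            self._check(vector)
        return vectors

    def _check(self, vector) -> None:
        # Guards against a model whose output does not match vector(1536).
        if len(vector) != self.dimension:
            raise ValueError(
                f"expected {self.dimension} dimensions, got {len(vector)}"
            )
```

Checking the vector length before insertion surfaces a mismatch between OPENAI_EMBEDDING_MODEL and EMBEDDING_DIMENSION as a clear error instead of a database failure.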
Four main services implemented:
Ingest:
- Adds new knowledge to the database
- Validates service_name and content
- Automatically creates embedding
- Complete error handling
Update:
- Updates existing knowledge (upsert behavior)
- If service_name doesn't exist, creates a new record
- If it exists, updates the existing record
- Automatically updates embedding
Search:
- Semantic search by similarity
- Optional parameters: k (number of results), threshold (minimum similarity), service_name (filter)
- Returns results ordered by relevance
List Catalog:
- Lists all existing service_name values in the database
- Does not use embeddings (repository only)
Common features:
- Integration with EmbeddingService and KnowledgeRepository
- Input validation
- Error handling
- Detailed logging
- Structured returns
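The shared shape of the ingest and update services (validate, embed, store, return a structured result) can be sketched as follows. The function names, the result dictionary keys, the embedder's embed(text) method, and the argument order of insert() and upsert() are assumptions made for illustration.

```python
def _validate(service_name, content):
    """Return an error message, or None when both inputs are usable."""
    if not service_name or not service_name.strip():
        return "service_name is required"
    if not content or not content.strip():
        return "content is required"
    return None


def _store(method, service_name, content, embedder, metadata):
    """Shared flow: validate, embed, persist, and report a structured result."""
    error = _validate(service_name, content)
    if error:
        return {"success": False, "error": error}
    try:
        embedding = embedder.embed(content)  # hypothetical embedder interface
        method(service_name, content, embedding, metadata or {})
    except Exception as exc:
        # Failures are reported in the result instead of propagating.
        return {"success": False, "error": str(exc)}
    return {"success": True, "service_name": service_name}


def ingest(service_name, content, embedder, repository, metadata=None):
    """Create a new record; fails if the repository rejects a duplicate."""
    return _store(repository.insert, service_name, content, embedder, metadata)


def update(service_name, content, embedder, repository, metadata=None):
    """Create or replace a record (upsert behavior) with a fresh embedding."""
    return _store(repository.upsert, service_name, content, embedder, metadata)
```

Because the two services differ only in the repository method they call, the embedding is always regenerated from the new content, which matches the "automatically updates embedding" behavior listed above.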
The project includes a CLI script for record deletion that is not exposed as an MCP tool. This functionality is only available via command line for administrative operations.
Functionality:
- Deletes a record from the knowledge base by service_name
- Validates record existence before deletion
- Provides clear feedback on operation result
Usage:
python src/database/delete_service.py <service_name>
Examples:
# Delete a specific service
python src/database/delete_service.py user-service
# The script returns:
# - "Record deleted successfully" if the record was found and removed
# - "Record not found" if the service_name doesn't exist
# - "Error deleting record" in case of operation failure
Features:
- Parameter validation (service_name cannot be empty)
- Error handling with detailed logging
- Appropriate exit codes (0 for success, 1 for failure)
- Clear feedback messages for the user
Note: This functionality is not available as an MCP tool for security and access control reasons. Use only for necessary administrative operations.
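The control flow of such a script, including the exit codes, might look like this sketch. run_delete is a hypothetical function, and the repository argument is assumed to expose get_by_service_name() returning None for a missing record, plus delete().

```python
def run_delete(argv, repository) -> int:
    """Delete one record by service_name and return a process exit code."""
    # Parameter validation: exactly one non-empty service_name is expected.
    if len(argv) != 2 or not argv[1].strip():
        print("Usage: python src/database/delete_service.py <service_name>")
        return 1
    service_name = argv[1].strip()
    try:
        # Check existence first so a missing record gets its own message.
        if repository.get_by_service_name(service_name) is None:
            print("Record not found")
            return 1
        repository.delete(service_name)
    except Exception as exc:
        print(f"Error deleting record: {exc}")
        return 1
    print("Record deleted successfully")
    return 0


# A real entry point would wire this up roughly as:
#     sys.exit(run_delete(sys.argv, repository))
```

Returning the exit code instead of calling sys.exit inside the function keeps the logic testable with an in-memory repository.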
The init-scripts/01-init-pgvector.sh script is automatically used by PostgreSQL during container initialization.
1. Volume mapped in docker-compose.yml
The local init-scripts/ directory is mapped to /docker-entrypoint-initdb.d inside the container through volume configuration in docker-compose.yml.
2. PostgreSQL automatic behavior
The official PostgreSQL image (and images built on it, such as pgvector/pgvector) automatically executes the files present in /docker-entrypoint-initdb.d:
- Only when the database is initialized for the first time (when the data volume is empty)
- In alphabetical order (hence the 01- prefix)
- Accepting .sql, .sh and other executable files
3. What the script does
The 01-init-pgvector.sh script:
- Executes CREATE EXTENSION IF NOT EXISTS vector; to create the pgvector extension
- Lists installed extensions for verification
- Uses set -e to stop on error
Important:
- Scripts in init-scripts/ are only executed on first initialization (when the volume is empty)
- If the container has been started before, the script will not be executed again
- To re-execute, remove the volume: docker-compose down -v
This repository includes custom Cursor commands in .cursor/commands/, which help create, update, and list the knowledge base in the MCP mcp-just-seek-knowledge.
- /criar_base_conhecimento: analyzes the entire open workspace (all projects/directories), reads documentation (including Swagger/OpenAPI) and creates a single record for the workspace using mcp-just-seek-knowledge.ingest.
- /atualizar_base_conhecimento: same analysis as the previous command, but updates (upserts) the workspace record using mcp-just-seek-knowledge.update.
- /listar_base_conhecimento: lists existing service_name values via mcp-just-seek-knowledge.list_catalog and presents a friendly layout with count, service_name and metadata (enriched via mcp-just-seek-knowledge.search).
- Ensure the MCP server mcp-just-seek-knowledge is configured in Cursor (~/.cursor/mcp.json or .cursor/mcp.json).
- Open the project(s) in the Cursor workspace.
- In Cursor chat, execute a command by typing /criar_base_conhecimento, /atualizar_base_conhecimento, or /listar_base_conhecimento.