A Windows-only Model Context Protocol (MCP) server that enables AI agents to capture screenshots of applications or the entire screen, with optional visual question answering through local or remote AI models.
- Full Screen Capture: Capture screenshots of the entire screen or specific monitors
- Window-Specific Capture: Target specific application windows by title or process name
- Window Enumeration: List all visible windows with their process information
- AI-Powered Analysis: Analyze screenshots using OpenAI, Anthropic Claude, or local models
- Multiple Image Formats: Support for PNG and JPEG output
- Windows Integration: Deep Windows API integration for reliable window targeting
- Windows operating system (Windows 10/11 recommended)
- Python 3.8 or higher
- Required Python packages (see requirements.txt)
- Clone or download this repository.
- Install dependencies:
  pip install -r requirements.txt
- Set up API keys (optional, for AI analysis). In Command Prompt:
  rem For OpenAI
  set OPENAI_API_KEY=your_openai_api_key
  rem For Anthropic Claude
  set ANTHROPIC_API_KEY=your_anthropic_api_key
  rem For local models (e.g., Ollama)
  set LOCAL_MODEL_URL=http://localhost:11434
Run the MCP server:
python screenshot_server.py
capture_screen: Capture a screenshot of the entire screen or a specific monitor.
Parameters:
- monitor (integer, optional): Monitor number (0 for primary, 1+ for additional)
- format (string, optional): Image format ("png" or "jpeg", default: "png")
capture_window: Capture a screenshot of a specific application window.
Parameters:
- window_title (string): Title or partial title of the window
- process_name (string): Process name (e.g., "notepad.exe")
- format (string, optional): Image format ("png" or "jpeg", default: "png")
Note: Either window_title or process_name is required.
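The "either window_title or process_name" rule can be checked on the client side before a request is sent. The following is a minimal sketch, not the server's own validation: the function name `validate_capture_window_args` and the `fmt` parameter are hypothetical, and only the parameter names and allowed formats come from the list above.

```python
from typing import Optional

# Formats documented for the capture tools.
ALLOWED_FORMATS = {"png", "jpeg"}


def validate_capture_window_args(
    window_title: Optional[str] = None,
    process_name: Optional[str] = None,
    fmt: str = "png",
) -> dict:
    """Build the arguments dict for capture_window (hypothetical helper)."""
    # At least one of the two targeting parameters must be supplied.
    if not window_title and not process_name:
        raise ValueError("Either window_title or process_name is required")
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {sorted(ALLOWED_FORMATS)}")

    args = {"format": fmt}
    if window_title:
        args["window_title"] = window_title
    if process_name:
        args["process_name"] = process_name
    return args
```

Calling `validate_capture_window_args(process_name="notepad.exe")` returns a dict that can be placed under "arguments" in a capture_window request.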
list_windows: List all visible windows with their titles and process names.
Parameters: None
analyze_screenshot: Analyze a screenshot using AI and answer questions about it.
Parameters:
- image_data (string): Base64 encoded image data
- question (string): Question to ask about the image
- model_provider (string, optional): "openai", "anthropic", or "local" (default: "openai")
- model_name (string, optional): Specific model name (default: "gpt-4-vision-preview")
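Since image_data must be a Base64 string, raw image bytes need encoding before they are passed to analyze_screenshot. Here is a small sketch of one way to build the arguments. The helper name `build_analyze_arguments` is hypothetical; the parameter names and provider values are taken from the list above. Leaving out model_name is assumed to let the server apply its documented default.

```python
import base64
from typing import Optional

# Provider values documented for model_provider.
PROVIDERS = {"openai", "anthropic", "local"}


def build_analyze_arguments(
    image_bytes: bytes,
    question: str,
    model_provider: str = "openai",
    model_name: Optional[str] = None,
) -> dict:
    """Build the arguments dict for analyze_screenshot (hypothetical helper)."""
    if model_provider not in PROVIDERS:
        raise ValueError(f"model_provider must be one of {sorted(PROVIDERS)}")

    args = {
        # b64encode returns bytes; decode to get a JSON-serializable string.
        "image_data": base64.b64encode(image_bytes).decode("ascii"),
        "question": question,
        "model_provider": model_provider,
    }
    # Only send model_name when the caller chose one explicitly.
    if model_name is not None:
        args["model_name"] = model_name
    return args
```

The image bytes could come from a saved capture, for example `open("shot.png", "rb").read()`, where the file name is only an illustration.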
- OPENAI_API_KEY: Your OpenAI API key for GPT-4 Vision analysis
- ANTHROPIC_API_KEY: Your Anthropic API key for Claude analysis
- LOCAL_MODEL_URL: URL for local model API (default: http://localhost:11434)
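To illustrate how these variables might be read, the sketch below loads them with the documented default for LOCAL_MODEL_URL and reports which providers look usable. It is an assumption about one reasonable approach, not the server's actual code; `load_settings` and `available_providers` are hypothetical names.

```python
import os
from typing import Mapping, Optional

# Default documented for LOCAL_MODEL_URL.
DEFAULT_LOCAL_MODEL_URL = "http://localhost:11434"


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read the three documented variables from env (os.environ by default)."""
    env = os.environ if env is None else env
    return {
        "openai_api_key": env.get("OPENAI_API_KEY"),
        "anthropic_api_key": env.get("ANTHROPIC_API_KEY"),
        "local_model_url": env.get("LOCAL_MODEL_URL", DEFAULT_LOCAL_MODEL_URL),
    }


def available_providers(settings: dict) -> list:
    """List providers whose configuration is present."""
    providers = []
    if settings["openai_api_key"]:
        providers.append("openai")
    if settings["anthropic_api_key"]:
        providers.append("anthropic")
    # The local provider needs no key; reachability of the URL is not checked.
    providers.append("local")
    return providers
```

Passing a plain dict as `env` makes the helper easy to exercise without touching the real environment.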
OpenAI models:
- gpt-4-vision-preview
- gpt-4o
- gpt-4o-mini
Anthropic models:
- claude-3-sonnet-20240229
- claude-3-haiku-20240307
- claude-3-opus-20240229
Local models:
- Any Ollama model with vision capabilities (e.g., llava, bakllava)
- Custom local vision models with a compatible API
# Capture entire screen
{"tool": "capture_screen", "arguments": {}}
# Capture specific monitor
{"tool": "capture_screen", "arguments": {"monitor": 1, "format": "jpeg"}}
# Capture by window title
{"tool": "capture_window", "arguments": {"window_title": "Notepad"}}
# Capture by process name
{"tool": "capture_window", "arguments": {"process_name": "chrome.exe"}}
# Analyze with OpenAI
{
"tool": "analyze_screenshot",
"arguments": {
"image_data": "base64_encoded_image_data",
"question": "What applications are visible in this screenshot?",
"model_provider": "openai"
}
}
# Analyze with local model
{
"tool": "analyze_screenshot",
"arguments": {
"image_data": "base64_encoded_image_data",
"question": "Describe what you see in this image",
"model_provider": "local",
"model_name": "llava"
}
}
- This server requires Windows API access and can capture sensitive information
- Screenshots may contain private data - ensure proper handling
- API keys should be stored securely and not committed to version control
- Consider network security when using remote AI models
- Import Error for Windows modules: Ensure you're running on Windows
- Permission denied: Run as administrator if capturing system windows
- Window not found: Check window titles with the list_windows tool first
- AI analysis fails: Verify API keys are set correctly
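When a window is not found, it can help to search the list_windows output for candidate titles before retrying. The sketch below is illustrative only: the result shape (a list of dicts with "title" and "process_name" keys) and the case-insensitive substring match are assumptions, not the server's documented behavior, and `find_matching_windows` is a hypothetical helper.

```python
from typing import Dict, List


def find_matching_windows(
    windows: List[Dict[str, str]], title_fragment: str
) -> List[Dict[str, str]]:
    """Return windows whose title contains title_fragment, ignoring case."""
    needle = title_fragment.casefold()
    return [w for w in windows if needle in w["title"].casefold()]


# Hypothetical sample of what list_windows might return.
sample_windows = [
    {"title": "Untitled - Notepad", "process_name": "notepad.exe"},
    {"title": "Inbox - Mail", "process_name": "mail.exe"},
]

matches = find_matching_windows(sample_windows, "notepad")
```

If `matches` is empty, the exact titles in the list show what to pass as window_title, or the process_name can be used instead.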
Enable debug logging by modifying the logging level in the script:
logging.basicConfig(level=logging.DEBUG)
MIT License - see LICENSE file for details.
- Fork the repository
- Create a feature branch
- Make your changes
- Test on Windows
- Submit a pull request