n24q02m/wet-mcp

WET - Web Extended Toolkit MCP Server

mcp-name: io.github.n24q02m/wet-mcp

Web search, content extraction, and library docs for AI agents -- 5-strategy scraping, runs without API keys.

Phase | Status | Scope
Phase 1 | Shipped | web-core ScrapingAgent migration, smart chunks output, search polish, media slim
Phase 2 | Shipped | Context7-level docs search: library index (Tier 1 + Tier 2), version-aware queries with token cap, project lock (Cabinets)
Phase 3 | Shipped | extract.agent multi-step research with cited synthesis, extract.interact click/fill/submit via patchright (optional session persistence), docs_004_chunk_summaries migration, media.analyze removed (v2.0.0)

Current release: v3.x. media(action="analyze") was removed in the v2.0.0 BREAKING release. Use imagine-mcp's understand action for vision/audio/video analysis. See docs/migration.md for the upgrade recipe.


Sister projects from n24q02m

Project | Tagline | Tag
better-code-review-graph | Knowledge graph for token-efficient code reviews -- semantic search and call-... | MCP
better-email-mcp | IMAP/SMTP email for AI agents -- read, send, organize folders, and manage att... | MCP
better-godot-mcp | Composite MCP server for Godot Engine -- 17 composite tools for AI-assisted g... | MCP
better-notion-mcp | Markdown-first Notion for AI agents -- pages, databases, blocks, and comments... | MCP
better-telegram-mcp | Telegram for AI agents -- messages, chats, media, and contacts across both bo... | MCP
claude-plugins | Claude Code plugin marketplace for the n24q02m MCP servers -- install web sea... | Marketplace
imagine-mcp | Image and video understanding + generation for AI agents -- across Gemini, Op... | MCP
jules-task-archiver | Chrome Extension for bulk operations on Jules tasks via batchexecute API -- a... | Tooling
mcp-core | Shared foundation for building MCP servers -- Streamable HTTP transport, OAut... | MCP
mnemo-mcp | Persistent AI memory with hybrid search and embedded sync. Open, free, unlimi... | MCP
qwen3-embed | Lightweight Qwen3 text embedding and reranking via ONNX Runtime and GGUF | Library
skret | Secrets without the server. | CLI
tacet | TACET: a self-distilling neuro-symbolic cascade that amortises LLM cost in kn... | Tooling
web-core | Shared web infrastructure package for search, scraping, HTTP security, and st... | Library
wet-mcp | Open-source MCP server for AI agents: web search, content extraction, and lib... | MCP


Features

  • Web Search -- Embedded SearXNG metasearch (Google, Bing, DuckDuckGo, Brave) with query expansion, TTL cache (1 h general / 5 min time-sensitive), standardized citation format, and 200-token snippet cap
  • Academic Research -- Search Google Scholar, Semantic Scholar, arXiv, PubMed, CrossRef, BASE
  • Library Docs -- Auto-discover and index documentation with FTS5 hybrid search, HyDE-enhanced retrieval, and version-specific docs
  • Content Extract -- 5-strategy escalation chain via n24q02m-web-core ScrapingAgent (basic_http -> tls_spoof -> headless Crawl4AI), markitdown bridge for low-tier HTML/MD fallback, smart chunks structured output (clean text + markdown + JSON-LD + code blocks + metadata), batch processing (up to 50 URLs), deep crawling, site mapping
  • Local File Conversion -- Convert PDF, DOCX, XLSX, CSV, HTML, EPUB, PPTX to Markdown
  • Media -- List + download images / videos / audio files. analyze was removed in v2.0.0 -- use imagine-mcp.understand for vision/audio inference
  • Anti-bot -- Stealth strategies bypass Cloudflare, Medium, LinkedIn, Twitter
  • Zero Config -- Built-in local Qwen3 embedding + reranking, no API keys needed. Optional cloud providers (Jina AI, Gemini, OpenAI, Cohere) for higher-quality vectors
  • Sync -- Cross-machine sync of indexed docs via Google Drive (OAuth Device Code, no browser redirect)
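
The Content Extract escalation chain can be pictured as "try the cheapest strategy first, fall through to heavier ones on failure". The sketch below is only an illustration of that pattern, not the web-core ScrapingAgent code: the `escalate` function, the `FetchResult` type, and the stand-in strategy functions are all hypothetical, and only the three stage names listed above are modelled.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical type: a strategy takes a URL and returns HTML,
# or None when the site blocked it or returned nothing useful.
Strategy = Callable[[str], Optional[str]]


@dataclass
class FetchResult:
    strategy: str  # name of the strategy that succeeded
    html: str


def escalate(url: str, chain: list[tuple[str, Strategy]]) -> FetchResult:
    """Try each strategy in order and return the first non-empty result."""
    errors = []
    for name, fetch in chain:
        try:
            html = fetch(url)
        except Exception as exc:  # a failing strategy must not stop the chain
            errors.append(f"{name}: {exc}")
            continue
        if html:
            return FetchResult(strategy=name, html=html)
        errors.append(f"{name}: empty response")
    raise RuntimeError(f"all strategies failed for {url}: {'; '.join(errors)}")


# Stand-in strategies so the sketch runs offline (no real network calls).
def basic_http(url: str) -> Optional[str]:
    return None  # pretend the plain request was blocked


def tls_spoof(url: str) -> Optional[str]:
    raise ConnectionError("handshake rejected")


def headless(url: str) -> Optional[str]:
    return "<html><body>ok</body></html>"


CHAIN = [("basic_http", basic_http), ("tls_spoof", tls_spoof), ("headless", headless)]
result = escalate("https://example.com/article", CHAIN)
print(result.strategy)
```

Ordering the chain from cheapest to most expensive means most pages never pay the cost of a headless browser.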

Quick install

# Method 1 (default): plugin install via Claude Code
/plugin marketplace add n24q02m/claude-plugins
/plugin install wet-mcp@n24q02m-plugins

# Method 2 (CLI): direct uvx invocation
claude mcp add wet -- uvx wet-mcp

# Method 3 (recommended for HTTP / multi-device / OAuth)
docker run -d --name wet-mcp-http -p 8084:8080 \
  -v wet-data:/data -e MCP_TRANSPORT=http \
  -e PUBLIC_URL=https://wet.example.com \
  n24q02m/wet-mcp:latest

Full setup matrices live at the canonical docs site mcp.n24q02m.com/servers/wet-mcp/setup/ and the paste-to-agent snippets at claude-plugins/plugins/wet-mcp/setup-with-agent.md (per Spec F single source of truth).

Status

2026-05-02 -- Architecture stabilization update

The past few months saw significant churn around credential handling and the daemon-bridge auto-spawn pattern, which caused multi-process races, browser tab spam, and inconsistent setup UX across plugins. As of this update, the architecture is stable: two clean modes (stdio and HTTP), no daemon-bridge layer, and no auto-spawn from stdio.

Apologies for the instability period. If you encountered issues with prior versions, please update to the latest release and follow the current setup docs; most prior workarounds are no longer needed.

Related plugins from the same author all share the same architecture (this spec), so what you learn installing one transfers to the others.

Documentation

Full docs at mcp.n24q02m.com/servers/wet-mcp/setup/:

  • Setup -- install methods for Claude Code, Codex, Gemini CLI, Cursor, Windsurf, mcp.json
  • Modes overview -- stdio / local-relay / remote-relay / remote-oauth
  • Multi-user setup -- per-JWT-sub credential model

In-repo references (Spec F single source of truth: setup docs live in claude-plugins/plugins/wet-mcp/):

  • docs/ARCHITECTURE.md -- web-core ScrapingAgent integration, strategy chain, storage layout, LLM provider dispatch
  • docs/BENCHMARKS.md -- v1.x baseline coverage / latency placeholders + tier-1 fixture metrics

Install with AI agent -- paste this to your AI coding agent:

Install MCP server wet-mcp following the steps at https://raw.githubusercontent.com/n24q02m/claude-plugins/main/plugins/wet-mcp/setup-with-agent.md

Tools

6 MCP tools (3 domain + config + help + config__open_relay). The legacy setup tool was merged into the config tool's action dispatch.

Tool | Description
search | Web (SearXNG metasearch), news, images, academic research (Scholar / arXiv / PubMed / CrossRef / Semantic Scholar / BASE), library docs (HyDE + FTS5), find similar pages. Includes docs_resolve (library name -> ranked id), docs_query (version-aware + topic + 5000-token cap), docs_lock_project (Cabinets project pin via pyproject / package.json / go.mod / Cargo.toml manifest detection).
extract | URL -> smart chunks dict (clean_text + markdown + structured_data + code_blocks + metadata) via web-core 5-strategy chain. Batch processing (up to 50 URLs), deep crawling, site mapping, local file conversion (PDF/DOCX/XLSX/PPTX/EPUB), structured extraction (JSON Schema).
media | list (discover URLs from gallery pages), download (SSRF-safe). analyze was removed in v2.0.0 -- use imagine-mcp.understand instead.
config | status, set, cache_clear, docs_reindex, warmup, setup_sync, setup_status, setup_skip, setup_reset, setup_complete
help | Per-tool documentation: search, extract, media, config
config__open_relay | Re-trigger the zero-config relay setup flow (prints a fresh relay URL for the browser form). Registered via mcp-core's register_open_relay_tool so an LLM can restart setup without a manual restart.
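
The manifest detection mentioned for docs_lock_project can be illustrated with a small directory scan. This is a rough sketch, not the project's implementation: `detect_manifests` and the ecosystem labels are hypothetical, and "pyproject" is assumed to mean a `pyproject.toml` file.

```python
from __future__ import annotations

import tempfile
from pathlib import Path

# Manifest file names from the table above; the ecosystem labels are
# illustrative assumptions, not names used by wet-mcp.
MANIFESTS = {
    "pyproject.toml": "python",
    "package.json": "javascript",
    "go.mod": "go",
    "Cargo.toml": "rust",
}


def detect_manifests(project_dir: Path) -> dict[str, Path]:
    """Return {ecosystem: manifest path} for every manifest found in project_dir."""
    found = {}
    for filename, ecosystem in MANIFESTS.items():
        candidate = project_dir / filename
        if candidate.is_file():
            found[ecosystem] = candidate
    return found


# Demo in a throwaway directory containing two manifests.
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "pyproject.toml").write_text("[project]\nname = 'demo'\n", encoding="utf-8")
(demo_dir / "package.json").write_text("{}", encoding="utf-8")

found = detect_manifests(demo_dir)
print(sorted(found))
```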

Media boundary: For vision / audio understanding (image captioning, OCR, audio transcription, video summarization), use imagine-mcp. media.analyze was removed in wet v2.0.0 -- use imagine-mcp.understand instead.
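
At the protocol level, an MCP client invokes any of these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below only builds such a request as a string. The `action` key mirrors the `media(action="analyze")` notation used in this README; the `library` and `query` argument names are assumptions for illustration, so check the help tool for the real parameters.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per call


def tool_call(name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request for the named tool."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


# Hypothetical arguments: only "action" and "docs_query" come from the README.
msg = tool_call(
    "search",
    {"action": "docs_query", "library": "fastapi", "query": "dependency injection"},
)
print(msg)
```

In practice an MCP client library handles this framing; the sketch only shows what travels over the transport.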

Comparison

How wet-mcp stacks up against direct competitors in each pillar:

Capability | wet-mcp | Brave Search | Tavily | Firecrawl | Context7
Web search | Yes (SearXNG aggregation) | Yes | Yes | No | No
Extract URL | Yes (5-strategy chain) | No | Yes (basic) | Yes | No
Media list / download | Yes | No | No | No | No
Library docs search | Yes (Tier 1 curated + Tier 2 on-demand, version-aware, Cabinets) | No | No | No | Yes
Academic research | Yes (6 providers) | No | No | No | No
Self-hostable | Yes | No | No | No | Yes
Free tier | Yes (open source) | Limited | Limited | Limited | Yes

Security

  • SSRF prevention -- URL validation on crawl targets
  • Graceful fallbacks -- Cloud → Local embedding, multi-tier crawling
  • Error sanitization -- No credentials in error messages
  • File conversion sandboxing -- Optional CONVERT_ALLOWED_DIRS restriction
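
As an illustration of what URL validation on crawl targets typically involves, the sketch below rejects non-HTTP schemes and any host that resolves to a non-public address. It is a generic SSRF check written for this README, not the project's own validator; `validate_crawl_target` and `is_safe` are hypothetical names.

```python
from __future__ import annotations

import ipaddress
import socket
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def validate_crawl_target(url: str) -> None:
    """Raise ValueError unless every address the host resolves to is public."""
    parts = urlsplit(url)
    if parts.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"scheme not allowed: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("URL has no host")
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        infos = socket.getaddrinfo(parts.hostname, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        raise ValueError(f"cannot resolve {parts.hostname!r}") from exc
    for info in infos:
        # info[4][0] is the resolved IP; loopback, private, and link-local
        # ranges (including cloud metadata endpoints) are not global.
        address = ipaddress.ip_address(info[4][0])
        if not address.is_global:
            raise ValueError(f"{parts.hostname!r} resolves to non-public address {address}")


def is_safe(url: str) -> bool:
    """Boolean wrapper around validate_crawl_target."""
    try:
        validate_crawl_target(url)
    except ValueError:
        return False
    return True


for candidate in ("https://8.8.8.8/", "http://127.0.0.1/admin", "file:///etc/passwd"):
    print(candidate, is_safe(candidate))
```

A check like this only covers the moment of validation. A real fetcher would also need to connect to the address it validated and re-check redirects, otherwise DNS rebinding can bypass it.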

Build from Source

git clone https://github.com/n24q02m/wet-mcp.git
cd wet-mcp
uv sync
uv run wet-mcp

Deploy to Cloudflare


Run your own single-user wet instance serverless on Cloudflare (Containers + D1 + Vectorize + KV).

Prerequisites: a Cloudflare account on the Workers Paid plan and the wrangler CLI.

  1. git clone https://github.com/n24q02m/wet-mcp && cd wet-mcp
  2. wrangler login
  3. Provision resources and apply the D1 schema:
    wrangler d1 create wet-docs
    wrangler d1 execute wet-docs --file migrations/0001_init_wet.sql --remote
    wrangler vectorize create wet-docs-vectors --dimensions 768 --metric cosine
    wrangler kv namespace create wet-kv
    
    Paste the returned IDs into wrangler.jsonc.
  4. Push the container image to your Cloudflare managed registry (CF Containers cannot pull from external registries directly), then set <YOUR_ACCOUNT_ID> in wrangler.jsonc:
    docker pull ghcr.io/n24q02m/wet-mcp:beta
    docker tag ghcr.io/n24q02m/wet-mcp:beta wet-mcp:beta
    wrangler containers push wet-mcp:beta   # prints registry.cloudflare.com/<ACCOUNT_ID>/wet-mcp:beta
    
  5. Set secrets (use SEARXNG_URL with basic-auth userinfo, e.g. https://user:password@<your-searxng-host>, or TAVILY_API_KEY if you set SEARCH_BACKEND=tavily):
    wrangler secret put CREDENTIAL_SECRET
    wrangler secret put JINA_AI_API_KEY
    wrangler secret put GOOGLE_VERTEX_EXPRESS_API_KEY
    wrangler secret put XAI_API_KEY
    wrangler secret put MCP_RELAY_PASSWORD
    wrangler secret put MCP_DCR_SERVER_SECRET
    wrangler secret put SEARXNG_URL
    
  6. wrangler deploy and complete setup in the browser relay form at your Worker domain.

Storage maps to Cloudflare via MCP_STORAGE_BACKEND=cf-kv (credentials/tokens, encrypted), DOCS_DB_BACKEND=cf-d1 (docs + BM25 full-text), and Vectorize (embeddings). Web search uses a SearXNG instance (SEARCH_BACKEND=searxng, SEARXNG_URL) or Tavily (SEARCH_BACKEND=tavily); embed/rerank are forced cloud via EMBEDDING_MODELS/RERANK_MODELS.
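
Before running wrangler deploy, it can help to sanity-check that the variables described above agree with each other. The helper below is a hypothetical pre-deploy check, not part of wet-mcp; it encodes only the pairings stated in this section (searxng needs SEARXNG_URL, tavily needs TAVILY_API_KEY, and the two Cloudflare storage backends).

```python
from __future__ import annotations

# Which credential each search backend needs, per the text above.
REQUIRED_FOR_BACKEND = {"searxng": "SEARXNG_URL", "tavily": "TAVILY_API_KEY"}
# Storage settings this section describes for a Cloudflare deployment.
EXPECTED = {"MCP_STORAGE_BACKEND": "cf-kv", "DOCS_DB_BACKEND": "cf-d1"}


def check_cloudflare_env(env: dict[str, str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means consistent."""
    problems = []
    for key, expected in EXPECTED.items():
        if env.get(key) != expected:
            problems.append(f"{key} should be {expected!r}, got {env.get(key)!r}")
    backend = env.get("SEARCH_BACKEND")
    needed = REQUIRED_FOR_BACKEND.get(backend)
    if needed is None:
        problems.append(
            f"SEARCH_BACKEND must be one of {sorted(REQUIRED_FOR_BACKEND)}, got {backend!r}"
        )
    elif not env.get(needed):
        problems.append(f"SEARCH_BACKEND={backend} requires {needed}")
    return problems


good = {
    "MCP_STORAGE_BACKEND": "cf-kv",
    "DOCS_DB_BACKEND": "cf-d1",
    "SEARCH_BACKEND": "searxng",
    "SEARXNG_URL": "https://searxng.example.invalid",  # placeholder value
}
bad = dict(good, SEARCH_BACKEND="tavily")  # tavily selected but no API key set

print(check_cloudflare_env(good))
print(check_cloudflare_env(bad))
```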

Trust Model

This plugin implements TC-Local (machine-bound, single trust principal). See mcp-core trust model for full classification.

Mode | Storage | Encryption | Who can read your data?
stdio (default) | ~/.wet-mcp/config.json | AES-GCM, machine-bound key | Only your OS user (file perm 0600)
HTTP self-host | Same as stdio | Same | Only you (admin = user)
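
The "file perm 0600" guarantee in the table can be illustrated on a POSIX system as follows. This sketch covers only the owner-only permission part and deliberately skips the AES-GCM encryption; `write_private_json` and `is_owner_only` are hypothetical helpers, and the demo writes to a temporary directory rather than ~/.wet-mcp/.

```python
import json
import os
import stat
import tempfile
from pathlib import Path


def write_private_json(path: Path, data: dict) -> None:
    """Write JSON so that only the owning OS user can read or write it."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create with mode 0o600 up front so the file is never briefly world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(data, handle)
    # Tighten permissions in case the file already existed with looser bits.
    os.chmod(path, 0o600)


def is_owner_only(path: Path) -> bool:
    """True when the permission bits are exactly rw------- (0600)."""
    return stat.S_IMODE(path.stat().st_mode) == 0o600


demo_file = Path(tempfile.mkdtemp()) / "config.json"
write_private_json(demo_file, {"example": "value"})
print(oct(stat.S_IMODE(demo_file.stat().st_mode)))
```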

License

MIT -- See LICENSE.
