WET - Web Extended Toolkit MCP Server

mcp-name: io.github.n24q02m/wet-mcp

Web search, content extraction, and library docs for AI agents -- 5-strategy scraping, runs without API keys.

| Phase | Status | Scope |
| --- | --- | --- |
| Phase 1 | Shipped | web-core ScrapingAgent migration, smart chunks output, search polish, media slim |
| Phase 2 | Shipped | Context7-level docs search: library index (Tier 1 + Tier 2), version-aware queries with token cap, project lock (Cabinets) |
| Phase 3 | Shipped | `extract.agent` multi-step research with cited synthesis, `extract.interact` click/fill/submit via patchright (optional session persistence), `docs_004_chunk_summaries` migration, `media.analyze` removed (v2.0.0) |

Current release: v3.x. media(action="analyze") was removed in the v2.0.0 BREAKING release. Use imagine-mcp's understand action for vision/audio/video analysis. See docs/migration.md for the upgrade recipe.

Sister projects from n24q02m

| Project | Tagline | Tag |
| --- | --- | --- |
| better-code-review-graph | Knowledge graph for token-efficient code reviews -- semantic search and call-... | MCP |
| better-email-mcp | IMAP/SMTP email for AI agents -- read, send, organize folders, and manage att... | MCP |
| better-godot-mcp | Composite MCP server for Godot Engine -- 17 composite tools for AI-assisted g... | MCP |
| better-notion-mcp | Markdown-first Notion for AI agents -- pages, databases, blocks, and comments... | MCP |
| better-telegram-mcp | Telegram for AI agents -- messages, chats, media, and contacts across both bo... | MCP |
| claude-plugins | Claude Code plugin marketplace for the n24q02m MCP servers -- install web sea... | Marketplace |
| imagine-mcp | Image and video understanding + generation for AI agents -- across Gemini, Op... | MCP |
| jules-task-archiver | Chrome Extension for bulk operations on Jules tasks via batchexecute API -- a... | Tooling |
| mcp-core | Shared foundation for building MCP servers -- Streamable HTTP transport, OAut... | MCP |
| mnemo-mcp | Persistent AI memory with hybrid search and embedded sync. Open, free, unlimi... | MCP |
| qwen3-embed | Lightweight Qwen3 text embedding and reranking via ONNX Runtime and GGUF | Library |
| skret | Secrets without the server. | CLI |
| tacet | TACET: a self-distilling neuro-symbolic cascade that amortises LLM cost in kn... | Tooling |
| web-core | Shared web infrastructure package for search, scraping, HTTP security, and st... | Library |
| wet-mcp | Open-source MCP server for AI agents: web search, content extraction, and lib... | MCP |

Features

Web Search -- Embedded SearXNG metasearch (Google, Bing, DuckDuckGo, Brave) with query expansion, TTL cache (1 h general / 5 min time-sensitive), standardized citation format, and 200-token snippet cap
Academic Research -- Search Google Scholar, Semantic Scholar, arXiv, PubMed, CrossRef, BASE
Library Docs -- Auto-discover and index documentation with FTS5 hybrid search, HyDE-enhanced retrieval, and version-specific docs
Content Extract -- 5-strategy escalation chain via n24q02m-web-core ScrapingAgent (basic_http -> tls_spoof -> headless Crawl4AI), markitdown bridge for low-tier HTML/MD fallback, smart chunks structured output (clean text + markdown + JSON-LD + code blocks + metadata), batch processing (up to 50 URLs), deep crawling, site mapping
Local File Conversion -- Convert PDF, DOCX, XLSX, CSV, HTML, EPUB, PPTX to Markdown
Media -- List + download images / videos / audio files. analyze was removed in v2.0.0 -- use imagine-mcp.understand for vision/audio inference
Anti-bot -- Stealth strategies bypass Cloudflare, Medium, LinkedIn, Twitter
Zero Config -- Built-in local Qwen3 embedding + reranking, no API keys needed. Optional cloud providers (Jina AI, Gemini, OpenAI, Cohere) for higher-quality vectors
Sync -- Cross-machine sync of indexed docs via Google Drive (OAuth Device Code, no browser redirect)
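
The escalation idea behind Content Extract can be sketched in a few lines. This is an illustration only, not the web-core ScrapingAgent API: the `escalate` helper, the `Strategy` type, and the stand-in fetchers are hypothetical, and only the tier names `basic_http` and `tls_spoof` come from the list above.

```python
from typing import Callable, Optional

# A strategy takes a URL and returns page text, or None if it was blocked.
Strategy = Callable[[str], Optional[str]]


def escalate(url: str, strategies: list[tuple[str, Strategy]]) -> tuple[str, str]:
    """Try each strategy in order, cheapest first; return (strategy_name, content)."""
    errors = []
    for name, fetch in strategies:
        try:
            content = fetch(url)
        except Exception as exc:  # a failing tier should not abort the chain
            errors.append(f"{name}: {exc}")
            continue
        if content:  # None or empty means blocked or unusable, so escalate
            return name, content
        errors.append(f"{name}: empty response")
    raise RuntimeError("all strategies failed: " + "; ".join(errors))


# Stand-in fetchers (hypothetical) so the sketch runs without network access.
def basic_http(url: str) -> Optional[str]:
    return None  # pretend an anti-bot wall returned nothing useful


def tls_spoof(url: str) -> Optional[str]:
    raise ConnectionError("handshake rejected")


def headless(url: str) -> Optional[str]:
    return f"<html>rendered {url}</html>"


chain = [("basic_http", basic_http), ("tls_spoof", tls_spoof), ("headless", headless)]
used, html = escalate("https://example.com", chain)
print(used)
```

Ordering the tiers from cheapest to most expensive means the headless browser is only started for pages that the lighter tiers could not fetch.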

Quick install

# Method 1 (default): plugin install via Claude Code
/plugin marketplace add n24q02m/claude-plugins
/plugin install wet-mcp@n24q02m-plugins

# Method 2 (CLI): direct uvx invocation
claude mcp add wet -- uvx wet-mcp

# Method 3 (recommended for HTTP / multi-device / OAuth)
docker run -d --name wet-mcp-http -p 8084:8080 \
  -v wet-data:/data -e MCP_TRANSPORT=http \
  -e PUBLIC_URL=https://wet.example.com \
  n24q02m/wet-mcp:latest

Full setup matrices live at the canonical docs site mcp.n24q02m.com/servers/wet-mcp/setup/ and the paste-to-agent snippets at claude-plugins/plugins/wet-mcp/setup-with-agent.md (per Spec F single source of truth).

Status

2026-05-02 -- Architecture stabilization update

The past few months saw significant churn around credential handling and the daemon-bridge auto-spawn pattern, which caused multi-process races, browser tab spam, and inconsistent setup UX across plugins. As of v<auto>, the architecture is stable: 2 clean modes (stdio + HTTP), no daemon-bridge layer, no auto-spawn from stdio.

Apologies for the instability period. If you encountered issues with prior versions, please update to v<auto>+ and follow the current setup docs -- most prior workarounds are no longer needed.

Related plugins from the same author:

wet-mcp -- Web search + content extraction

mnemo-mcp -- Persistent AI memory

imagine-mcp -- Image/video understanding + generation

better-notion-mcp -- Notion API

better-email-mcp -- Email management

better-telegram-mcp -- Telegram

better-godot-mcp -- Godot Engine

better-code-review-graph -- Code review knowledge graph

All plugins share the same architecture (this spec), so the pattern you learn installing one transfers to the rest.

Documentation

Full docs at mcp.n24q02m.com/servers/wet-mcp/setup/:

Setup -- install methods for Claude Code, Codex, Gemini CLI, Cursor, Windsurf, mcp.json
Modes overview -- stdio / local-relay / remote-relay / remote-oauth
Multi-user setup -- per-JWT-sub credential model

In-repo references (Spec F single source of truth: setup docs live in claude-plugins/plugins/wet-mcp/):

docs/ARCHITECTURE.md -- web-core ScrapingAgent integration, strategy chain, storage layout, LLM provider dispatch
docs/BENCHMARKS.md -- v1.x baseline coverage / latency placeholders + tier-1 fixture metrics

Install with AI agent -- paste this to your AI coding agent:

Install MCP server wet-mcp following the steps at https://raw.githubusercontent.com/n24q02m/claude-plugins/main/plugins/wet-mcp/setup-with-agent.md

Tools

6 MCP tools (3 domain + config + help + config__open_relay). The legacy setup tool was merged into config action dispatch.

| Tool | Description |
| --- | --- |
| `search` | Web (SearXNG metasearch), news, images, academic research (Scholar / arXiv / PubMed / CrossRef / Semantic Scholar / BASE), library docs (HyDE + FTS5), find similar pages. Includes `docs_resolve` (library name -> ranked id), `docs_query` (version-aware + topic + 5000-token cap), `docs_lock_project` (Cabinets project pin via pyproject / package.json / go.mod / Cargo.toml manifest detection). |
| `extract` | URL -> smart chunks dict (`clean_text` + `markdown` + `structured_data` + `code_blocks` + `metadata`) via web-core 5-strategy chain. Batch processing (up to 50 URLs), deep crawling, site mapping, local file conversion (PDF/DOCX/XLSX/PPTX/EPUB), structured extraction (JSON Schema) |
| `media` | `list` (discover URLs from gallery pages), `download` (SSRF-safe). `analyze` was removed in v2.0.0 -- use `imagine-mcp.understand` |
| `config` | `status`, `set`, `cache_clear`, `docs_reindex`, `warmup`, `setup_sync`, `setup_status`, `setup_skip`, `setup_reset`, `setup_complete` |
| `help` | Per-tool documentation: `search`, `extract`, `media`, `config` |
| `config__open_relay` | Re-trigger the zero-config relay setup flow (prints a fresh relay URL for the browser form). Registered via `mcp-core`'s `register_open_relay_tool` so an LLM can restart setup without a manual restart. |
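
The manifest detection that `docs_lock_project` relies on might look roughly like the sketch below. The `detect_manifest` helper, the ecosystem labels, and the lookup order are assumptions made for illustration; only the four manifest file names come from the table above.

```python
import tempfile
from pathlib import Path
from typing import Optional

# Manifest file names from the tools table, mapped to an ecosystem label.
# The labels and the lookup order are assumptions made for this sketch.
MANIFESTS = {
    "pyproject.toml": "python",
    "package.json": "javascript",
    "go.mod": "go",
    "Cargo.toml": "rust",
}


def detect_manifest(project_dir: Path) -> Optional[tuple[str, Path]]:
    """Return (ecosystem, manifest_path) for the first manifest found, else None."""
    for filename, ecosystem in MANIFESTS.items():
        candidate = project_dir / filename
        if candidate.is_file():
            return ecosystem, candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "Cargo.toml").write_text('[package]\nname = "demo"\n')
    found = detect_manifest(root)
    empty = detect_manifest(root / "missing")
    print(found[0] if found else None)
```

Once a manifest is located, its pinned dependency versions could feed the version-aware docs queries; that step is not shown here.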

Media boundary: For vision / audio understanding (image captioning, OCR, audio transcription, video summarization), use imagine-mcp. media.analyze was removed in wet v2.0.0 -- use imagine-mcp.understand instead.

Comparison

How wet-mcp stacks up against direct competitors in each pillar:

| Capability | wet-mcp | Brave Search | Tavily | Firecrawl | Context7 |
| --- | --- | --- | --- | --- | --- |
| Web search | Yes (SearXNG aggregation) | Yes | Yes | No | No |
| Extract URL | Yes (5-strategy chain) | No | Yes (basic) | Yes | No |
| Media list / download | Yes | No | No | No | No |
| Library docs search | Yes (Tier 1 curated + Tier 2 on-demand, version-aware, Cabinets) | No | No | No | Yes |
| Academic research | Yes (6 providers) | No | No | No | No |
| Self-hostable | Yes | No | No | No | Yes |
| Free tier | Yes (open source) | Limited | Limited | Limited | Yes |

Security

SSRF prevention -- URL validation on crawl targets
Graceful fallbacks -- Cloud → Local embedding, multi-tier crawling
Error sanitization -- No credentials in error messages
File conversion sandboxing -- Optional CONVERT_ALLOWED_DIRS restriction
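
As an illustration of the SSRF prevention item, the sketch below shows one common way to validate a crawl target with the standard library. It is not the project's actual validator: `is_safe_url` and its `resolve` flag are hypothetical names. With `resolve=False` (the default here) hostnames are accepted without a DNS lookup, so only literal IP addresses and `localhost` are screened.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_safe_url(url: str, resolve: bool = False) -> bool:
    """Reject URLs that are not http(s) or that point at non-public addresses."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    host = parts.hostname
    try:
        addresses = [ipaddress.ip_address(host)]  # literal IP in the URL
    except ValueError:
        if host == "localhost" or host.endswith(".localhost"):
            return False
        if not resolve:
            return True  # hostname accepted without a DNS lookup
        try:
            infos = socket.getaddrinfo(host, None)
        except socket.gaierror:
            return False
        # Drop any IPv6 zone suffix ("%eth0") before parsing the address.
        addresses = [ipaddress.ip_address(info[4][0].split("%")[0]) for info in infos]
    # is_global is False for private, loopback, link-local, and reserved ranges.
    return all(addr.is_global for addr in addresses)


print(is_safe_url("http://169.254.169.254/latest/meta-data/"))
```

A check like this is only a first line of defence: a hostname can resolve differently between validation and connection, so a production validator would typically also verify the address actually connected to.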

Build from Source

git clone https://github.com/n24q02m/wet-mcp.git
cd wet-mcp
uv sync
uv run wet-mcp

Deploy to Cloudflare

Run your own single-user wet instance serverless on Cloudflare (Containers + D1 + Vectorize + KV).

Prerequisites: a Cloudflare account on the Workers Paid plan and the wrangler CLI.

git clone https://github.com/n24q02m/wet-mcp && cd wet-mcp
wrangler login

Provision resources and apply the D1 schema:

wrangler d1 create wet-docs
wrangler d1 execute wet-docs --file migrations/0001_init_wet.sql --remote
wrangler vectorize create wet-docs-vectors --dimensions 768 --metric cosine
wrangler kv namespace create wet-kv

Paste the returned IDs into wrangler.jsonc.

Push the container image to your Cloudflare managed registry (CF Containers cannot pull from external registries directly), then set <YOUR_ACCOUNT_ID> in wrangler.jsonc:

docker pull ghcr.io/n24q02m/wet-mcp:beta
docker tag ghcr.io/n24q02m/wet-mcp:beta wet-mcp:beta
wrangler containers push wet-mcp:beta   # prints registry.cloudflare.com/<ACCOUNT_ID>/wet-mcp:beta

Set secrets (use SEARXNG_URL with basic-auth userinfo, e.g. https://<user>:<password>@<searxng-host>, or TAVILY_API_KEY if you set SEARCH_BACKEND=tavily):

wrangler secret put CREDENTIAL_SECRET
wrangler secret put JINA_AI_API_KEY
wrangler secret put GOOGLE_VERTEX_EXPRESS_API_KEY
wrangler secret put XAI_API_KEY
wrangler secret put MCP_RELAY_PASSWORD
wrangler secret put MCP_DCR_SERVER_SECRET
wrangler secret put SEARXNG_URL

Run wrangler deploy, then complete setup in the browser relay form at your Worker domain.

Storage maps to Cloudflare via MCP_STORAGE_BACKEND=cf-kv (credentials/tokens, encrypted), DOCS_DB_BACKEND=cf-d1 (docs + BM25 full-text), and Vectorize (embeddings). Web search uses a SearXNG instance (SEARCH_BACKEND=searxng, SEARXNG_URL) or Tavily (SEARCH_BACKEND=tavily); embed/rerank are forced cloud via EMBEDDING_MODELS/RERANK_MODELS.
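
To make the search-backend variables concrete, here is a small sketch of how they could be validated before deployment. The `resolve_search_backend` function, its return shape, and the `searxng` default are assumptions for illustration; the variable names `SEARCH_BACKEND`, `SEARXNG_URL`, and `TAVILY_API_KEY` are the ones described above.

```python
from urllib.parse import urlsplit


def resolve_search_backend(env: dict[str, str]) -> dict[str, str]:
    """Validate the search-related variables and return a normalized view."""
    backend = env.get("SEARCH_BACKEND", "searxng")  # default is an assumption
    if backend == "tavily":
        if not env.get("TAVILY_API_KEY"):
            raise ValueError("SEARCH_BACKEND=tavily requires TAVILY_API_KEY")
        return {"backend": "tavily"}
    if backend == "searxng":
        parts = urlsplit(env.get("SEARXNG_URL", ""))
        if parts.scheme not in ("http", "https") or not parts.hostname:
            raise ValueError("SEARXNG_URL must be an http(s) URL")
        return {
            "backend": "searxng",
            "host": parts.hostname,
            # Basic-auth userinfo, if present, is carried in the URL itself.
            "auth": "basic" if parts.username else "none",
        }
    raise ValueError(f"unknown SEARCH_BACKEND: {backend!r}")


config = resolve_search_backend(
    {"SEARCH_BACKEND": "searxng", "SEARXNG_URL": "https://user:secret@searx.example.com"}
)
print(config["backend"], config["auth"])
```

Returning only the host and an auth flag, rather than the full URL, keeps the password out of anything that might later be logged.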

Trust Model

This plugin implements TC-Local (machine-bound, single trust principal). See mcp-core trust model for full classification.

| Mode | Storage | Encryption | Who can read your data? |
| --- | --- | --- | --- |
| stdio (default) | `~/.wet-mcp/config.json` | AES-GCM, machine-bound key | Only your OS user (file perm 0600) |
| HTTP self-host | Same as stdio | Same | Only you (admin = user) |
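
The 0600 file permission in the table can be illustrated with a short sketch. It assumes a POSIX system and deliberately omits the AES-GCM encryption step; `write_private_json` is a hypothetical helper, not a function from wet-mcp or mcp-core.

```python
import json
import os
import stat
import tempfile


def write_private_json(path: str, payload: dict) -> None:
    """Write JSON so that only the owning OS user can read or modify it."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    fd = os.open(path, flags, 0o600)  # mode applies when the file is created
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(payload, handle)
    os.chmod(path, 0o600)  # also tighten a file that already existed


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "config.json")
    write_private_json(target, {"example": True})
    mode = stat.S_IMODE(os.stat(target).st_mode)
    print(oct(mode))
```

Creating the file with the restrictive mode up front, instead of writing first and restricting afterwards, avoids a window in which other users could read it.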

License

MIT -- See LICENSE.
