
Decompose


Stop prompting. Start decomposing.

Deterministic text classification for AI agents. Decompose turns any text into classified, structured semantic units — instantly. No LLM. No setup. One function call.


Before: your agent reads this

The contractor shall provide all materials per ASTM C150-20. Maximum load
shall not exceed 500 psf per ASCE 7-22. Notice to proceed within 14 calendar
days of contract execution. Retainage of 10% applies to all payments.
For general background, the project is located in Denver, CO...

After: your agent reads this

[
  {
    "text": "The contractor shall provide all materials per ASTM C150-20.",
    "authority": "mandatory",
    "risk": "compliance",
    "type": "requirement",
    "irreducible": true,
    "attention": 8.0,
    "entities": ["ASTM C150-20"]
  },
  {
    "text": "Maximum load shall not exceed 500 psf per ASCE 7-22.",
    "authority": "prohibitive",
    "risk": "safety_critical",
    "type": "constraint",
    "irreducible": true,
    "attention": 10.0,
    "entities": ["ASCE 7-22"]
  }
]

Every unit classified. Every standard extracted. Every risk scored. Your agent knows what matters.


Install

pip install decompose-mcp

Use as MCP Server

Add to your agent's MCP config (Claude Code, Cursor, Windsurf, etc.):

{
  "mcpServers": {
    "decompose": {
      "command": "uvx",
      "args": ["decompose-mcp", "--serve"]
    }
  }
}

Your agent gets two tools:

  • decompose_text — decompose any text
  • decompose_url — fetch a URL and decompose its content
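
Under the hood, each tool is invoked as a standard MCP tools/call request. The argument names shown here (text for decompose_text, url for decompose_url) are illustrative; check the server's tool schema for the exact parameters:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "decompose_text",
    "arguments": {
      "text": "The contractor shall provide all materials per ASTM C150-20."
    }
  }
}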

OpenClaw

Install the skill from ClawHub or configure directly:

{
  "mcpServers": {
    "decompose": {
      "command": "python3",
      "args": ["-m", "decompose", "--serve"]
    }
  }
}

Or install the skill: clawhub install decompose-mcp

Use as CLI

# Pipe text
cat spec.txt | decompose --pretty

# Inline
decompose --text "The contractor shall provide all materials per ASTM C150-20."

# Compact output (smaller JSON)
cat document.md | decompose --compact

Use as Library

from decompose import decompose_text, filter_for_llm

result = decompose_text("The contractor shall provide all materials per ASTM C150-20.")

for unit in result["units"]:
    print(f"[{unit['authority']}] [{unit['risk']}] {unit['text'][:60]}...")

# Pre-filter for LLM context — keep only high-value units
filtered = filter_for_llm(result, max_tokens=4000)
print(f"{filtered['meta']['reduction_pct']}% token reduction")
llm_input = filtered["text"]  # Ready for your LLM

What Each Field Means

  • authority (mandatory, prohibitive, directive, permissive, conditional, informational): is this a hard requirement or background?
  • risk (safety_critical, security, compliance, financial, contractual, advisory, informational): how much does this matter?
  • type (requirement, definition, reference, constraint, narrative, data): what kind of content is this?
  • irreducible (true/false): must this be preserved verbatim?
  • attention (0.0 to 10.0): how much compute should the agent spend here?
  • entities (standards, codes, regulations): what formal references are cited?
  • actionable (true/false): does someone need to do something?
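
As a sketch of how these fields combine in practice (assuming only decompose_text and the fields above; spec_text is any document string):

from decompose import decompose_text

result = decompose_text(spec_text)

# Surface the units an agent must not ignore: hard requirements and
# prohibitions, plus anything flagged safety-critical or compliance.
must_review = [
    u for u in result["units"]
    if u["authority"] in ("mandatory", "prohibitive")
    or u["risk"] in ("safety_critical", "compliance")
]

# Spend compute where it matters: highest attention first.
must_review.sort(key=lambda u: u["attention"], reverse=True)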

What to Build With This

Decompose is not the destination. It's the step before the LLM that most developers skip — not because it's hard, but because nobody showed them it exists. Documents have structure. That structure is classifiable. And classification should happen before reasoning.

Without:  document → chunk → embed → retrieve → LLM → answer  (100% of tokens)
With:     document → decompose → filter/route → LLM → answer  (20-40% of tokens)

Filter: built-in LLM pre-filter

filter_for_llm() keeps mandatory, safety-critical, financial, and compliance units — drops boilerplate before it reaches your LLM or vector store.

from pathlib import Path

from decompose import decompose_text, filter_for_llm

result = decompose_text(Path("contract.md").read_text())
filtered = filter_for_llm(result, max_tokens=4000)

# filtered["text"] = high-value units only, ready for LLM
# filtered["meta"]["reduction_pct"] = how much was dropped (typically 60-80%)

# Or use the units directly for embedding (embed_and_store is your own vector-store helper)
for unit in filtered["units"]:
    embed_and_store(unit["text"], metadata={
        "authority": unit["authority"],
        "risk": unit["risk"],
        "attention": unit["attention"],
    })

Route: risk-based processing

Safety-critical content goes to one chain. Financial content goes to another. Boilerplate gets skipped.

from decompose import decompose_text

result = decompose_text(spec_text)

for unit in result["units"]:
    if unit["risk"] == "safety_critical":
        safety_chain.process(unit)       # Full analysis + human review
    elif unit["risk"] == "financial":
        audit_chain.process(unit)         # Flag for finance team
    elif unit["attention"] < 0.5:
        pass                              # Skip boilerplate
    else:
        general_chain.process(unit)       # Standard LLM analysis

Measure: token cost reduction

from decompose import decompose_text

result = decompose_text(spec_text)
total = len(result["units"])
high = [u for u in result["units"] if u["attention"] >= 1.0]

print(f"{len(high)}/{total} units need LLM analysis")
print(f"{100 - len(high) * 100 // total}% token reduction")

See examples/ for runnable scripts.


Why No LLM?

Decompose runs on pure regex and heuristics. No Ollama, no API key, no GPU, no inference cost.

This is intentional:

  • Fast: <500ms for a 50-page spec
  • Deterministic: Same input always produces same output
  • Offline: Works air-gapped, on a plane, on CI
  • Composable: Your agent's LLM reasons over the structured output — decompose handles the preprocessing
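
For intuition, here is the flavor of heuristic involved. This sketch is illustrative, not Decompose's actual rule set: ordered pattern checks, most restrictive first, are enough to separate prohibitions from requirements:

import re

# Illustrative only: check the most restrictive patterns first,
# so "shall not" is classified before the bare "shall" can match.
PROHIBITIVE = re.compile(r"\b(?:shall not|must not|may not|prohibited)\b", re.I)
MANDATORY = re.compile(r"\b(?:shall|must|required to)\b", re.I)
PERMISSIVE = re.compile(r"\b(?:may|can|optional)\b", re.I)

def classify_authority(sentence: str) -> str:
    if PROHIBITIVE.search(sentence):
        return "prohibitive"
    if MANDATORY.search(sentence):
        return "mandatory"
    if PERMISSIVE.search(sentence):
        return "permissive"
    return "informational"

# classify_authority("Maximum load shall not exceed 500 psf") -> "prohibitive"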

The LLM is what your agent uses. Decompose makes whatever model you're running work better.


Built by Echology

Decompose is built by Echology and extracted from AECai, a document intelligence platform for Architecture, Engineering, and Construction firms. The classification patterns, entity extraction, and irreducibility detection are battle-tested against thousands of real AEC documents — specs, contracts, RFIs, inspection reports, pay applications.

Decompose earned its independence — it started as AECai's text classification module, proved general enough to work across domains (insurance, trading, regulatory), and was released standalone. Free, MIT-licensed.

Blog

License: MIT — Copyright (c) 2025-2026 Echology, Inc.
