Every feature ships with proof. This document is the complete audit trail of verification runs, test output, and deployment evidence.
| Category | Evidence |
|---|---|
| Pre-action checks | Block known mistakes before tool use; tested with real feedback patterns |
| Feedback capture | Up/down signals with context, tags, rubric scores; schema-validated |
| Prevention rules | Auto-promoted from repeated failures; regression-tested |
| Prompt/workflow evals | Feedback-derived eval suites prove whether prompt and workflow behavior improved |
| Filesystem search | ContextFS files searchable without embeddings; regression-tested |
| Social analytics | Multi-platform polling pipeline; regression-tested |
| ThumbGate search | Two-tier search (MCP tool + REST API); regression-tested |
| MCP/API parity | Every MCP tool has a matching REST endpoint β proven by OpenAPI parity tests |
| CI pipeline | All PRs require green CI (tests + CodeQL + GitGuardian + Socket Security) |
| Railway deployment | Auto-deploy on merge, SHA-verified, health-checked |
git clone https://github.com/IgorGanapolsky/thumbgate.git
cd thumbgate && npm ci
npm test # full repository suite
npm run prove:adapters # Adapter compatibility proof
npm run prove:automation # Automation proof harness
npm run eval:feedback # Feedback-derived prompt/workflow eval proof
npm run test:coverage # Coverage report

# Free tier: any LLM invokes search_thumbgate via MCP
# Tool: search_thumbgate { query: "database mock", source: "all" }
# Paid tier: authenticated REST API
curl -H "Authorization: Bearer YOUR_KEY" \
"https://thumbgate-production.up.railway.app/v1/search?q=test+failure"

Scope:
- Added explicit machine-readable buyer actions to the public landing page and the generated docs landing page so AI parsers and buyers can distinguish the install path, Pro checkout path, and Workflow Hardening Sprint intake path.
- Added a dedicated `Service` JSON-LD entity for the Workflow Hardening Sprint offer.
- Tightened regression coverage around landing-page structured data and customer-discovery artifact documentation.
- Kept operator-only runtime state out of tracked outputs.
Commands run in the dedicated clean verification worktree at /Users/ganapolsky_i/.codex/worktrees/verify-revenue-loop-offer-schema-20260429/ThumbGate:
npm ci
npm test
npm run test:coverage
THUMBGATE_PROOF_DIR="$(mktemp -d)/proof" npm run prove:adapters
THUMBGATE_AUTOMATION_PROOF_DIR="$(mktemp -d)/proof-automation" npm run prove:automation
npm run self-heal:check
git diff --check

Observed result:
- `npm ci` exited `0`.
- `npm test` exited `0`.
- `npm run test:coverage` exited `0` with all-files coverage at:
  - `87.16` lines
  - `72.57` branches
  - `88.76` functions
- `THUMBGATE_PROOF_DIR=... npm run prove:adapters` exited `0`: `48` passed, `0` failed.
- `THUMBGATE_AUTOMATION_PROOF_DIR=... npm run prove:automation` exited `0`: `55` passed, `0` failed.
- `npm run self-heal:check` exited `0`: `Overall: HEALTHY` with `6/6 healthy` checks.
- `git diff --check` exited `0`.
- Targeted regression probes for the changed surfaces also passed before the clean-lane run:
  - `node --test tests/public-landing.test.js`
  - `node --test tests/api-server.test.js`
  - `node --test tests/customer-discovery-sprint.test.js`
- No tracked `.thumbgate/*` or `.claude/*` runtime artifacts were added or modified by this change.
Requirements verified:
- Public and generated landing surfaces now advertise the three buyer paths in machine-readable form.
- The Workflow Hardening Sprint offer is represented as a distinct service entity for search/indexing and buyer parsing.
- Regression coverage now protects both the schema additions and the operator handoff artifact documentation that supports outreach execution.
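
To illustrate the distinct service entity described above, a `Service` JSON-LD object might look like the sketch below. Only the `Service` type, the offer name, and the hosted origin come from this document; the provider block, the action shape, and the `#workflow-sprint-intake` anchor are hypothetical placeholders, not the markup shipped on the landing page.

```javascript
// Hypothetical sketch of a schema.org Service entity for the sprint offer.
// The origin is the hosted URL cited in this document; the anchor is invented.
const ORIGIN = 'https://thumbgate-production.up.railway.app';

function buildSprintServiceJsonLd(origin) {
  return {
    '@context': 'https://schema.org',
    '@type': 'Service',
    name: 'Workflow Hardening Sprint',
    provider: { '@type': 'Organization', name: 'ThumbGate' },
    // potentialAction lets a parser find the intake path without scraping copy.
    potentialAction: {
      '@type': 'Action',
      name: 'Workflow Hardening Sprint intake',
      target: `${origin}/#workflow-sprint-intake`, // assumed anchor
    },
  };
}

// Serialized form, as it would sit inside <script type="application/ld+json">.
const jsonLd = JSON.stringify(buildSprintServiceJsonLd(ORIGIN), null, 2);
console.log(jsonLd);
```

A regression test can then parse the page's JSON-LD and assert on `@type` and the action target instead of matching raw HTML strings.
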
April 9, 2026: technical debt audit follow-through for stale-claim cleanup, docs hygiene regression coverage, and protected-system revalidation
Scope:
- Audited active operator-facing docs, launch copy, and the landing page for brittle hardcoded verification counts and stale product-surface metrics.
- Replaced those claims with evergreen proof-backed wording.
- Added `tests/docs-claim-hygiene.test.js` to block future reintroduction of exact-count drift in selected active docs.
- Updated `tests/public-landing.test.js` and `package.json` so the new hygiene check is enforced in `test:workflow`.
- Revalidated protected systems: ContextFS/RAG, orchestration, coverage, proof harnesses, and self-heal health.
Commands run in the dedicated worktree at /Users/ganapolsky_i/workspace/git/igor/worktrees/thumbgate-decision-learning-loop-20260409:
npm test
npm run test:coverage
npm run prove:adapters
npm run prove:automation
npm run self-heal:check
git diff --check

Observed result:
- `npm test` exited `0`.
- `npm run test:coverage` exited `0` with all-files coverage at:
  - `90.23` lines
  - `76.74` branches
  - `93.45` functions
- `npm run prove:adapters` exited `0`: `48` passed, `0` failed.
- `npm run prove:automation` exited `0`: `55` passed, `0` failed.
- `npm run self-heal:check` exited `0`: `Overall: HEALTHY` with `6/6 healthy` checks.
- `git diff --check` exited `0`.
- Repository inventory delta for this pass, excluding `.git` and `node_modules`:
  - files before: `874`
  - files after: `875`
  - lines before: `224174`
  - lines after: `224216`
- The one new file added in this pass is `tests/docs-claim-hygiene.test.js`.
- No tracked RAG/runtime artifacts were added or modified; `.thumbgate/*` and `.claude/*` outputs remained local-only.
- The latest branch-level GitHub Actions failure observed before PR creation was policy-only: `release_sensitive_changes_require_pr`. That blocker indicates the release-sensitive integrity rule requires an open PR; it does not indicate a failing local verification suite.
Requirements verified:
- Active docs and landing surfaces no longer rely on brittle hardcoded verification counts.
- A regression test now protects those claim surfaces from drifting again.
- Protected systems remained operational after the cleanup pass.
- Repo-wide coverage remains below `100%`, so the requested 100% coverage target is still unmet and must not be claimed.
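
A claim-hygiene check of the kind described above can be sketched as a small scanner that flags hardcoded counts in prose. The patterns and the function name `findBrittleClaims` are hypothetical; the actual rules in `tests/docs-claim-hygiene.test.js` are not reproduced in this document.

```javascript
// Hypothetical patterns for "exact count" claims that go stale quickly.
const BRITTLE_CLAIM_PATTERNS = [
  /\b\d{2,}\s+(?:tests?|checks?)\s+(?:passing|passed)\b/i,
  /\b\d{1,3}(?:\.\d+)?%\s+coverage\b/i,
];

// Returns one finding per offending line so a test can print useful failures.
function findBrittleClaims(text) {
  const findings = [];
  text.split('\n').forEach((line, index) => {
    for (const pattern of BRITTLE_CLAIM_PATTERNS) {
      const match = line.match(pattern);
      if (match) {
        findings.push({ line: index + 1, claim: match[0] });
        break; // one finding per line is enough
      }
    }
  });
  return findings;
}

const sample = [
  'Backed by 314 tests passing on every commit.',
  'Every release is verified by the full test suite.',
].join('\n');
console.log(findBrittleClaims(sample));
```

Only the first sample line is flagged; the evergreen wording on the second line passes. A test would run this over the selected docs and fail when the findings array is non-empty.
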
April 6, 2026: technical debt audit hardening for Node 20 coverage, deterministic Pro-gate tests, and CI gate completeness
Scope:
- Hardened `scripts/test-coverage.js` to feature-detect `--test-coverage-include` / `--test-coverage-exclude` before using them.
- Added regression coverage in `tests/test-coverage.test.js` for runtimes with and without those Node coverage flags.
- Refactored `scripts/pro-features.js` so Pro-gated tests can inject license predicates and output sinks instead of depending on operator-local saved license state.
- Hardened `scripts/multi-hop-recall.js`, `scripts/synthetic-dpo.js`, `tests/license.test.js`, `tests/multi-hop-recall.test.js`, and `tests/synthetic-dpo.test.js` so the unlicensed path stays deterministic in CI.
- Added `npm run budget:status` and `npm run test:coverage` to `.github/workflows/ci.yml`, with `tests/deployment.test.js` enforcing that workflow contract.
- Removed the tracked runtime artifact `.claude/context-engine/quality-log.json` and kept it ignored via `.gitignore`.
- Updated `AGENTS.md`, `CLAUDE.md`, `GEMINI.md`, and `docs/TECHNICAL_DEBT_AUDIT.md` with the new prevention rules and audit evidence.
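
The injection approach in the Pro-gate refactor above can be illustrated with a small sketch. The names `createProGate`, `isLicensed`, and `write` are hypothetical; the real `scripts/pro-features.js` API is not shown in this document.

```javascript
// Hypothetical shape: a gate that receives its license check and output sink
// as arguments instead of reading operator-local state.
function createProGate({ isLicensed, write }) {
  return function runProFeature(name, feature) {
    if (!isLicensed()) {
      write(`${name} requires a Pro license.`);
      return { ran: false };
    }
    return { ran: true, value: feature() };
  };
}

// In a test, both dependencies are stubs, so no saved license file is consulted.
const messages = [];
const gate = createProGate({
  isLicensed: () => false,
  write: (message) => messages.push(message),
});
const result = gate('multi-hop-recall', () => 'unreachable');
console.log(result, messages);
```

Because the predicate is injected, the unlicensed path behaves the same on a developer machine with a saved license as it does in CI.
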
Commands run in the dedicated worktree at /Users/ganapolsky_i/workspace/git/igor/audit-worktrees/thumbgate-audit-20260406c:
npm run feedback:stats --silent
npm test
npm run test:coverage
tmp=$(mktemp -d) && THUMBGATE_PROOF_DIR="$tmp/proof" npm run prove:adapters
tmp=$(mktemp -d) && THUMBGATE_AUTOMATION_PROOF_DIR="$tmp/proof-automation" npm run prove:automation
npm run self-heal:check
git diff --check

Observed result:
- `npm run feedback:stats --silent` exited `0` with `total=51`, `positive=3`, `negative=48`, `trend=stable`.
- `npm test` exited `0`.
- `npm run test:coverage` exited `0` with all-files coverage at:
  - `90.26` lines
  - `76.57` branches
  - `93.73` functions
- `THUMBGATE_PROOF_DIR=... npm run prove:adapters` exited `0`: `48` passed, `0` failed.
- `THUMBGATE_AUTOMATION_PROOF_DIR=... npm run prove:automation` exited `0`: `55` passed, `0` failed.
- `npm run self-heal:check` exited `0`: `Overall: HEALTHY` with `6/6 healthy` checks.
- `git diff --check` remained clean after reverting generated drift from `primer.md` and `config/skill-packs/react-testing.json`.
- The pre-fix failure was reproduced before the hardening work: on Node 20, `npm run test:coverage` exited with `/opt/homebrew/Cellar/node@20/20.20.1/bin/node: bad option: --test-coverage-include`.
- No tracked ThumbGate memory artifacts were added or modified by this audit; the only tracked file removed was the generated runtime log `.claude/context-engine/quality-log.json`.
- Revalidated after rebasing the audit commit onto `origin/main` at `05641e599aa60ae69326567c369c7edfc38f39b5`; the rebased branch remained locally green with the same proof counts and a healthy `self-heal:check`.
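
The `bad option` failure above is the kind of error that flag feature detection avoids. The sketch below shows one possible approach, not the code in `scripts/test-coverage.js`: it probes the runtime by spawning it with the flag and only adds the include/exclude arguments when the probe succeeds. The helper names and the probe strategy are assumptions.

```javascript
const { spawnSync } = require('node:child_process');

// Probe: Node exits non-zero with "bad option" when it does not know a flag.
// Assumption: passing the flag with a dummy value to `node -e 0` is harmless.
function nodeSupportsFlag(flag, nodePath = process.execPath) {
  const probe = spawnSync(nodePath, [`${flag}=probe`, '-e', '0'], { encoding: 'utf8' });
  return probe.status === 0;
}

// Build coverage arguments, adding include/exclude globs only when supported.
function buildCoverageArgs({ include, exclude }, supportsFlag = nodeSupportsFlag) {
  const args = ['--test', '--experimental-test-coverage'];
  if (supportsFlag('--test-coverage-include')) {
    include.forEach((glob) => args.push(`--test-coverage-include=${glob}`));
  }
  if (supportsFlag('--test-coverage-exclude')) {
    exclude.forEach((glob) => args.push(`--test-coverage-exclude=${glob}`));
  }
  return args;
}

// Simulate an older runtime by injecting a predicate that rejects every flag.
const legacyArgs = buildCoverageArgs(
  { include: ['scripts/**'], exclude: ['tests/**'] },
  () => false
);
console.log(legacyArgs);
```

Taking the predicate as a parameter lets a regression test cover both the "flags supported" and "flags missing" paths on a single Node version.
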
Scope:
- Added `scripts/memory-firewall.js` as the single ingress decision point for feedback/memory writes, with provider selection for `auto`, `shieldcortex`, `local`, and `off`.
- Added `scripts/shieldcortex-memory-firewall-runner.mjs` so the gateway can use the optional `shieldcortex` package without making it a hard runtime dependency.
- Hardened `scripts/feedback-loop.js` to block secret-bearing feedback before any raw write to `feedback-log.jsonl` or `memory-log.jsonl`, while recording only redacted diagnostics.
- Replaced the stale runtime source label `shieldcortex` in `scripts/context-engine.js` with the truthful live storage labels `jsonl-memory` and `lancedb-vectors`.
- Added regression coverage in `tests/feedback-loop.test.js` and `tests/intelligence.test.js`.
- Documented the optional ingress firewall controls in `README.md` and `.env.example`.
- Added `shieldcortex` as an optional dependency, not a required runtime dependency.
Commands run in the dedicated worktree at /Users/ganapolsky_i/workspace/git/thumbgate/.worktrees/fix-thumbgate-source-labels:
npm ci
node --check scripts/memory-firewall.js
node --check scripts/feedback-loop.js
node - <<'PY'
const { evaluateMemoryIngress } = require('./scripts/memory-firewall');
(async () => {
  const decision = await evaluateMemoryIngress({
    feedbackEvent: {
      signal: 'down',
      context: 'Accidentally pasted anthropic API key sk-ant-api03-abcdefghijklmnopqrstuvwxyz1234567890 into feedback.'
    },
    memoryRecord: {
      title: 'Dangerous memory',
      text: 'anthropic api key sk-ant-api03-abcdefghijklmnopqrstuvwxyz1234567890 leaked'
    },
    options: { provider: 'shieldcortex', mode: 'strict' }
  });
  console.log(JSON.stringify({
    allowed: decision.allowed,
    provider: decision.provider,
    mode: decision.mode,
    degraded: decision.degraded,
    reason: decision.reason,
    threatIndicators: decision.threatIndicators,
    blockedPatterns: decision.blockedPatterns
  }, null, 2));
})();
PY
node --test tests/feedback-loop.test.js tests/intelligence.test.js
npm test >/tmp/mcp_npm_test_fix_thumbgate_source_labels.log 2>&1
npm run test:coverage >/tmp/mcp_test_coverage_fix_thumbgate_source_labels.log 2>&1
THUMBGATE_PROOF_DIR=/tmp/mcp_proof_adapters_fix_thumbgate npm run prove:adapters >/tmp/mcp_prove_adapters_fix_thumbgate.log 2>&1
THUMBGATE_AUTOMATION_PROOF_DIR=/tmp/mcp_proof_automation_fix_thumbgate npm run prove:automation >/tmp/mcp_prove_automation_fix_thumbgate.log 2>&1
npm run self-heal:check >/tmp/mcp_self_heal_check_fix_thumbgate.log 2>&1
git diff --check

Observed result:
- `npm ci` exited `0`: `added 296 packages` and `found 0 vulnerabilities`.
- `node --check scripts/memory-firewall.js` exited `0`.
- `node --check scripts/feedback-loop.js` exited `0`.
- The direct ShieldCortex ingress probe exited `0` and returned:
  - `allowed: false`
  - `provider: "shieldcortex"`
  - `mode: "strict"`
  - `reason: "Blocked: credential leak detected (anthropic api_key)"`
  - `threatIndicators: ["credential_leak"]`
- `node --test tests/feedback-loop.test.js tests/intelligence.test.js` exited `0`: `74` passed, `0` failed.
- `npm test` exited `0` on the patched worktree (`/tmp/mcp_npm_test_fix_thumbgate_source_labels.log`).
- The full-suite rerun included the new ThumbGate/security checks:
  - `evaluateMemoryIngress: ShieldCortex blocks secret-bearing payload when explicitly enabled`
  - `captureFeedback: blocks secret-bearing feedback before any raw memory write`
- `npm run test:coverage` exited `0` with all-files coverage at:
  - `89.71` lines
  - `75.40` branches
  - `93.21` functions
- `THUMBGATE_PROOF_DIR=/tmp/mcp_proof_adapters_fix_thumbgate npm run prove:adapters` exited `0`: `48` passed, `0` failed.
- `THUMBGATE_AUTOMATION_PROOF_DIR=/tmp/mcp_proof_automation_fix_thumbgate npm run prove:automation` exited `0`: `55` passed, `0` failed.
- `npm run self-heal:check` exited `0`: `Overall: HEALTHY` with `4/4 healthy` checks.
- `git diff --check` exited `0`.
Evidence caveat:
- The `prove:adapters` and `prove:automation` commands in this repo are test harnesses (`node --test ...`), not a fresh tracked-artifact publisher.
- The current run proved those contracts through the green harness output and exit codes above.
- The checked-in `proof/compatibility/report.json` and `proof/automation/report.json` files still carry older `generatedAt` timestamps, so they must not be claimed as freshly regenerated evidence for this specific run.
Requirements verified:
- Secret-bearing or hostile feedback can now be blocked before any raw memory promotion write.
- When ShieldCortex is installed, the ingress firewall can use it directly; when it is absent, the gateway falls back to the local secret scanner without breaking runtime operation.
- The runtime memory manifest no longer falsely claims `shieldcortex` as a live memory source.
- The change did not break the ThumbGate feedback loop, adapter contracts, automation contracts, or self-healing checks.
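
The local fallback path described above (scan first, write only if clean, keep only redacted diagnostics) can be sketched as follows. This is an illustrative stand-in, not the implementation of `evaluateMemoryIngress`; the pattern list and the function name `scanBeforeWrite` are hypothetical.

```javascript
// Hypothetical patterns; a real scanner would cover many more credential formats.
const SECRET_PATTERNS = [
  { id: 'anthropic_api_key', regex: /sk-ant-[A-Za-z0-9_-]{20,}/g },
  { id: 'bearer_token', regex: /Bearer\s+[A-Za-z0-9._-]{20,}/g },
];

// Decide before any write; return only redacted text for diagnostics.
function scanBeforeWrite(text) {
  const blockedPatterns = [];
  let redacted = text;
  for (const { id, regex } of SECRET_PATTERNS) {
    const replaced = redacted.replace(regex, `[REDACTED:${id}]`);
    if (replaced !== redacted) {
      blockedPatterns.push(id);
      redacted = replaced;
    }
  }
  return { allowed: blockedPatterns.length === 0, blockedPatterns, redacted };
}

const decision = scanBeforeWrite(
  'Accidentally pasted key sk-ant-api03-abcdefghijklmnopqrstuvwxyz1234567890 into feedback.'
);
console.log(decision);
```

A caller would append to the raw log only when `allowed` is true, and otherwise record `blockedPatterns` and the `redacted` text so that the secret itself never reaches disk.
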
March 21, 2026: Social publish hardening + self-heal reliability fix + archive-WIP retirement decision
Scope:
- Hardened `scripts/self-healing-check.js` so the standard health gate survives large command output and gives `prove:automation` its own isolated proof directory.
- Hardened `scripts/social-pipeline.js` so copied-profile Chrome automation now retries temp-profile cleanup, waits longer for DevTools startup, reports TikTok preflight timeouts as an authenticated-upload-surface failure, and dismisses Instagram's discard-confirmation modal while advancing to the draft editor.
- Confirmed the archived local WIP commit `2063a6e57a37663603245298716c24dd32de0982` remains intentionally unshipped because it deletes `scripts/behavioral-extraction.js`, adds a scratch verifier, and diverges from the hardened social-pipeline lane.
Commands run in the dedicated worktree at /Users/ganapolsky_i/workspace/git/igor/worktrees/thumbgate-social-archive-recovered:
npm ci
node --check scripts/self-healing-check.js
node --test tests/self-healing-check.test.js
node --check scripts/social-pipeline.js
node --test tests/social-pipeline.test.js tests/social-marketing-assets.test.js
for db in "$HOME/Library/Application Support/Google/Chrome"/*/Cookies; do profile=$(basename "$(dirname "$db")"); ig=$(sqlite3 "$db" "select count(*) from cookies where host_key like '%instagram%';" 2>/dev/null || echo err); tt=$(sqlite3 "$db" "select count(*) from cookies where host_key like '%tiktok%';" 2>/dev/null || echo err); echo "$profile instagram=$ig tiktok=$tt"; done
npm run social:publish -- \
--bundle .artifacts/social/live-combined-preflight-proof-20260321c/bundle.json \
--platforms instagram,tiktok \
--no-share \
--cleanup-drafts \
--backend playwright \
--profile-dir Default \
--headless
npm run social:publish -- \
--bundle .artifacts/social/live-combined-preflight-proof-20260321c/bundle.json \
--platforms instagram \
--no-share \
--cleanup-drafts \
--backend playwright \
--profile-dir Default
npm test
npm run test:coverage
tmp=$(mktemp -d) && THUMBGATE_PROOF_DIR="$tmp/proof" npm run prove:adapters
tmp=$(mktemp -d) && THUMBGATE_AUTOMATION_PROOF_DIR="$tmp/proof-automation" npm run prove:automation
npm run self-heal:check
git diff --check
git cherry -v main codex/archive-primary-dirty-20260320
git show --stat --summary --format=medium 2063a6e57a37663603245298716c24dd32de0982

Observed result:
- `npm ci` exited `0`.
- `node --check scripts/self-healing-check.js` exited `0`.
- `node --test tests/self-healing-check.test.js` exited `0`: `15` passed, `0` failed.
- `node --check scripts/social-pipeline.js` exited `0`.
- `node --test tests/social-pipeline.test.js tests/social-marketing-assets.test.js` exited `0`: `19` passed, `0` failed.
- `npm test` exited `0`.
- The first `npm run test:coverage` run exposed a full-suite CLI handshake timeout in `tests/cli.test.js`; widening the helper timeout from `10s` to `20s` removed that flake. The rerun then exited `0` with all-files coverage at `88.01` lines, `75.59` branches, and `92.54` functions.
- `THUMBGATE_PROOF_DIR=... npm run prove:adapters` exited `0`: `48` passed, `0` failed.
- `THUMBGATE_AUTOMATION_PROOF_DIR=... npm run prove:automation` exited `0`: `55` passed, `0` failed.
- `npm run self-heal:check` exited `0`: `Overall: HEALTHY` with `4/4` healthy checks (`budget_status 150ms`, `tests 61973ms`, `prove_adapters 1151ms`, `prove_automation 1159ms`).
- `git diff --check` exited `0`.
- The Chrome cookie scan on this machine showed:
  - `Default instagram=7 tiktok=0`
  - `Profile 1 instagram=0 tiktok=0`
- The prepared bundle at `.artifacts/social/live-combined-preflight-proof-20260321c/` contains exactly `5` slide PNGs, `instagram.txt`, `tiktok.txt`, `tiktok-fallback.mp4`, and attempt-proof subdirectories.
- `sips` verified `.artifacts/social/live-combined-preflight-proof-20260321c/slides/slide-01.png` at `1080x1080`.
- The bundle manifest recorded these immutable hashes:
  - caption SHA-256: `834ec9b32f36d082998cd74a1af3c1ce50fc1ec568f83415c1196d0b2d489e44`
  - TikTok fallback MP4 SHA-256: `1d9ab0a7cb237e750907c88b68eed1d0d909269a858deb90fa65f7a86260d693`
  - slide SHA-256 values: `d2dfd30faefab16a2e5280a35233d761a4b45ee27c611ba1120abc817071bb7c`, `3b87cce4a311d8a77b005ed4c98b98988a60213514b4b576edc7dd49f9e86eac`, `ed1c8bef64842219744f1146693ed02335812182ec94fcafb41aa37c5de6eb9c`, `0d5dcbacdafecff4c73f035286e9e852e16adfb6de55ebc519149e9550a92a71`, `aaf7b3c8a0e97fd9d2fa02b6d65ad0a2e586bcee577eae889954810df3b458fb`.
- The combined headless publish lane halted before any partial post with:
  - `TikTok did not reach an authenticated upload surface: {"error":"Timed out waiting for browser state on https://www.tiktok.com/tiktokstudio/"}`
- The Instagram-only no-share publish lane succeeded on the same prepared bundle:
  - CLI result: `[{"platform":"instagram","mode":"draft-ready","assetCount":5}]`
  - Attempt record: `.artifacts/social/live-combined-preflight-proof-20260321c/publish-attempts/instagram-1774117555400-pccxyr/attempt.json`
  - Attempt screenshots: `instagram-preflight.png`, `instagram-uploaded.png`, `instagram-draft-ready.png`
- The latest local publish-history rows show the repaired sequence truthfully:
  - earlier copied-profile failures (`playwright-core` missing, DevTools startup budget too short, generic TikTok timeout, Instagram discard modal)
  - final successful Instagram draft-ready row for attempt `instagram-1774117555400-pccxyr`
- `git cherry -v main codex/archive-primary-dirty-20260320` showed one unique archive commit, `2063a6e...`.
- `git show --stat --summary --format=medium 2063a6e...` proved the archive commit is not a safe promotion candidate:
  - deletes `scripts/behavioral-extraction.js`
  - adds scratch-only `scripts/gsd-final-verification.js`
  - changes `bin/memory.sh`, `bin/obsidian-sync.sh`, `primer.md`, and adds `docs/OPERATIONAL_LOOPS.md` without a coherent verification lane
- A stronger live browser proof was attempted with `npm run social:publish -- --bundle .artifacts/social/pre-action-checks-proof/bundle.json --platforms instagram --no-share --cleanup-drafts`. The repo-side tab-focus bug was fixed first, but the live attempt still failed outside repo control because Google Chrome returned: `Executing JavaScript through AppleScript is turned off. To turn it on, from the menu bar, go to View > Developer > Allow JavaScript from Apple Events.`
- `npm audit --json` exited `0` with `0` vulnerabilities.
- `git diff --check` exited `0`.
Requirements verified:
- The product now has a low-debt, repo-owned zero-filming social pipeline instead of an external-only posting playbook.
- Instagram and TikTok assets are generated from one canonical local source, so the same content can be repurposed without manual screenshots or duplicate copy.
- TikTok web fallback is explicit and truthful: the automation path generates a `1080x1920` MP4 because the current TikTok desktop surface accepts `video/*` rather than a guaranteed photo-carousel path.
- Scheduler support is implemented and proven in dry-run form, but should be installed only from a durable checkout path because the generated `launchd` plist points at the installing repo path.
- The remaining live-posting blocker is a browser runtime setting in Google Chrome, not missing repo logic.
- No new npm dependencies were added.
Scope:
- Repositioned the public landing page away from generic "memory server" framing and toward "AI workflow control plane" language, while preserving the existing Pre-Action Checks product contract.
- Added a comparison section that clarifies the difference between memory servers, agentic RAG, and the workflow control layer this product actually sells.
- Surfaced semantic cache efficiency metrics in the dashboard/API by reusing existing ContextFS provenance (`contextfs/provenance/packs.jsonl`) rather than introducing a new ledger or duplicate write path.
- Updated the commercial truth copy so the Pro package promises concrete efficiency metrics: semantic cache hit rate and reused context tokens.
- Fixed the required session handoff hook by cleaning `bin/obsidian-sync.sh` and adding a regression test so `./bin/memory.sh` exits cleanly when no Obsidian vault env is configured.
Commands run in the dedicated worktree at /Users/ganapolsky_i/workspace/git/igor/worktrees/thumbgate-llm-efficiency-roi:
npm ci
node --test tests/dashboard.test.js
node --test tests/api-server.test.js
node --test tests/public-landing.test.js
node --test tests/session-handoff.test.js
./bin/memory.sh
npm test
npm run test:coverage
tmp=$(mktemp -d) && THUMBGATE_PROOF_DIR="$tmp" npm run prove:adapters
tmp=$(mktemp -d) && THUMBGATE_AUTOMATION_PROOF_DIR="$tmp" npm run prove:automation
npm run self-heal:check
git diff --check
npm run revenue:status -- --json

Observed result:
- `npm ci` exited `0`.
- `node --test tests/dashboard.test.js` exited `0`: `17` passed, `0` failed.
- `node --test tests/api-server.test.js` exited `0`: `55` passed, `0` failed.
- `node --test tests/public-landing.test.js` exited `0`: `12` passed, `0` failed.
- `node --test tests/session-handoff.test.js` exited `0`: `10` passed, `0` failed.
- `./bin/memory.sh` exited `0` and now cleanly reports `THUMBGATE_OBSIDIAN_VAULT_PATH not set. Skipping sync.` with no shell errors.
- `npm test` exited `0`.
- `npm run test:coverage` exited `0` with all-files coverage at `89.68` lines, `75.72` branches, and `93.16` functions.
- `THUMBGATE_PROOF_DIR=... npm run prove:adapters` exited `0`: `48` passed, `0` failed.
- `THUMBGATE_AUTOMATION_PROOF_DIR=... npm run prove:automation` exited `0`: `55` passed, `0` failed.
- `npm run self-heal:check` exited `0`: `Overall: HEALTHY` with `4/4` healthy checks on a serial rerun (`tests 99171ms`, `prove_adapters 7235ms`, `prove_automation 3873ms`).
- `git diff --check` exited `0`.
- `npm run revenue:status -- --json` exited `0` with `source: hosted-via-railway-env`; public probes returned `/health 200`, `/ 200`, and `/v1/telemetry/ping 204`.
Behavioral proof points:
- The public landing page now explicitly says the product acts as an "AI workflow control plane" and is "not another generic memory server."
- The new category section contrasts `Memory servers`, `Agentic RAG`, and `ThumbGate`, so buyers can place the product correctly before evaluating pricing.
- The Pro offer now exposes concrete efficiency metrics in public copy: semantic cache hit rate and reused context tokens.
- The session handoff path is cleaner than before: `./bin/memory.sh` refreshes `primer.md` without the broken shell-comment noise that previously leaked from `bin/obsidian-sync.sh`.
- `generateDashboard()` now computes efficiency from existing context-pack provenance:
  - `contextPackRequests`
  - `semanticCacheHits`
  - `semanticCacheHitRate`
  - `averageSemanticSimilarity`
  - `estimatedContextCharsReused`
  - `estimatedContextTokensReused`
- `/v1/dashboard` now returns those efficiency metrics alongside funnel and revenue analytics, and regression tests verify the API contract.
- Hosted-first operational truth still reports `bookedRevenueTodayCents: 0`, so this change improves positioning and measurement clarity, not current-day revenue by itself.
No-tech-debt notes:
- No new dependencies were added.
- No new runtime ledger was introduced.
- Efficiency reporting reuses existing ContextFS provenance instead of creating a second telemetry path.
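
The efficiency metrics named above can be derived from a provenance log with a single pass over its lines. The sketch below assumes a record shape (`cacheHit`, `similarity`, `reusedChars`) and a rough 4-characters-per-token heuristic; neither is taken from `contextfs/provenance/packs.jsonl` or `generateDashboard()`, whose real formats are not shown here.

```javascript
// Assumed record shape per JSONL line:
//   { cacheHit: boolean, similarity: number, reusedChars: number }
// Assumed heuristic: roughly 4 characters per token.
const CHARS_PER_TOKEN = 4;

function summarizeEfficiency(jsonl) {
  const records = jsonl
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line));
  const hits = records.filter((record) => record.cacheHit);
  const reusedChars = hits.reduce((sum, record) => sum + record.reusedChars, 0);
  const similaritySum = hits.reduce((sum, record) => sum + record.similarity, 0);
  return {
    contextPackRequests: records.length,
    semanticCacheHits: hits.length,
    semanticCacheHitRate: records.length ? hits.length / records.length : 0,
    averageSemanticSimilarity: hits.length ? similaritySum / hits.length : 0,
    estimatedContextCharsReused: reusedChars,
    estimatedContextTokensReused: Math.round(reusedChars / CHARS_PER_TOKEN),
  };
}

const sampleLog = [
  '{"cacheHit":true,"similarity":0.9,"reusedChars":400}',
  '{"cacheHit":false,"similarity":0,"reusedChars":0}',
  '{"cacheHit":true,"similarity":0.8,"reusedChars":200}',
  '{"cacheHit":false,"similarity":0,"reusedChars":0}',
].join('\n');
console.log(summarizeEfficiency(sampleLog));
```

Reading the existing log rather than writing a second one keeps the dashboard numbers and the provenance records from drifting apart.
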
Scope:
- Removed `--detach` from the Railway deploy commands in `.github/workflows/ci.yml` and `.github/workflows/deploy-railway.yml` so GitHub Actions waits for the Railway build stream instead of queueing a deploy and immediately polling the old live app.
- Increased the health-verification wait budget from `8` attempts to `12` in CI and to `18` in the main deploy workflow so rollout activation has enough time to promote the new build before the SHA check fails.
- Added regression coverage in `tests/deployment.test.js` to lock the workflow contract: no detached Railway deploys, stamped build metadata, and a longer SHA-verification budget.
- Verified the root cause against production: merge commit `ebd5189d290b73c24b6b9cdc9f5181042e225171` eventually reached Railway successfully even though workflow run `23355558359` failed early while `/health` still reported the previous build SHA.
Commands run in the dedicated worktree at /Users/ganapolsky_i/workspace/git/igor/worktrees/thumbgate-railway-verifier-wait:
npm ci
node --test tests/deployment.test.js
npm test
npm run test:coverage
tmp=$(mktemp -d) && THUMBGATE_PROOF_DIR="$tmp/proof" npm run prove:adapters
tmp=$(mktemp -d) && THUMBGATE_AUTOMATION_PROOF_DIR="$tmp/proof-automation" npm run prove:automation
npm run self-heal:check
git diff --check
gh run view 23355558359 --repo IgorGanapolsky/thumbgate --log-failed
curl -sS https://thumbgate-production.up.railway.app/health
sleep 90 && curl -sS https://thumbgate-production.up.railway.app/health

Observed result:
- `npm ci` exited `0`.
- `node --test tests/deployment.test.js` exited `0`: `14` passed, `0` failed.
- `npm test` exited `0`.
- `npm run test:coverage` exited `0` with all-files coverage at `89.68` lines, `75.72` branches, and `93.16` functions.
- `THUMBGATE_PROOF_DIR=... npm run prove:adapters` exited `0`: `48` passed, `0` failed.
- `THUMBGATE_AUTOMATION_PROOF_DIR=... npm run prove:automation` exited `0`: `55` passed, `0` failed.
- `npm run self-heal:check` exited `0`: `Overall: HEALTHY` with `4/4` healthy checks.
- `git diff --check` exited `0`.
- Root-cause evidence from GitHub Actions run `23355558359`:
  - the deploy step stamped `config/build-metadata.json` with `ebd5189d290b73c24b6b9cdc9f5181042e225171`;
  - the Railway deploy queued successfully and returned build logs for deployment `f5e4a9ab-92a5-41b9-9537-8862d529c4c3`;
  - the health verifier then polled the live app too early and saw the still-healthy previous build `93daccdd7f5ac7efa3bf53e75d90b854976cb337` for all `8/8` attempts.
- Live production proof after the failed workflow showed the new build did eventually promote:
  - immediate `/health`: `buildSha: 93daccdd7f5ac7efa3bf53e75d90b854976cb337`
  - after ~90 seconds: `buildSha: ebd5189d290b73c24b6b9cdc9f5181042e225171`
Requirements verified:
- The broken `main` signal was a false-negative workflow verifier, not a failure of the immutable build-metadata implementation or the Railway rollout itself.
- The workflow repair is low debt: it removes premature detachment, increases the rollout wait budget, and locks the contract with tests instead of adding ad hoc manual retries.
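
The false negative above comes down to an attempt budget that ran out before the new build was promoted. The sketch below models that with a generic polling loop; it is not the workflow's actual verifier. The helper names, the 10-second delay, and the simulated promotion after 9 polls are assumptions chosen for illustration, and the SHAs are shortened prefixes of the ones recorded above.

```javascript
// Poll until the live build SHA matches the expected one or the budget runs out.
// `readBuildSha` and `sleep` are injected so the loop can run without a network.
async function waitForBuildSha({ expectedSha, maxAttempts, delayMs, readBuildSha, sleep }) {
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    const liveSha = await readBuildSha();
    if (liveSha === expectedSha) {
      return { ok: true, attempts: attempt };
    }
    if (attempt < maxAttempts) {
      await sleep(delayMs);
    }
  }
  return { ok: false, attempts: maxAttempts };
}

// Simulated rollout: the old build answers for the first N polls, then the new one.
function simulateRollout(oldSha, newSha, pollsBeforePromotion) {
  let polls = 0;
  return async () => {
    polls += 1;
    return polls > pollsBeforePromotion ? newSha : oldSha;
  };
}

async function demo() {
  const noSleep = async () => {}; // skip real waiting in the simulation
  const base = { expectedSha: 'ebd5189', delayMs: 10000, sleep: noSleep };
  const short = await waitForBuildSha({
    ...base, maxAttempts: 8, readBuildSha: simulateRollout('93daccd', 'ebd5189', 9),
  });
  const longer = await waitForBuildSha({
    ...base, maxAttempts: 12, readBuildSha: simulateRollout('93daccd', 'ebd5189', 9),
  });
  return { short, longer };
}

demo().then((result) => console.log(result));
```

In the simulation the 8-attempt budget gives up while the old build is still serving, whereas the 12-attempt budget observes the promoted SHA on its tenth poll.
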
Scope:
- Added `scripts/build-metadata.js` plus tracked `config/build-metadata.json` so deploys stamp an immutable build SHA into the shipped artifact instead of trusting mutable Railway runtime variables.
- Updated `.github/workflows/deploy-railway.yml` to generate build metadata during the deploy workflow and removed the old `THUMBGATE_BUILD_SHA` runtime-variable sync.
- Updated `src/api/server.js` so `/health` reads the stamped build metadata and protected endpoints accept `x-api-key` as an alternate auth header in addition to `Authorization: Bearer ...`.
- Added regression coverage in `tests/api-server.test.js` and `tests/deployment.test.js` for stamped build metadata, alternate auth headers, and the public server-card schema contract.
- Added the missing empty-object `inputSchema` to `get_reliability_rules` in `scripts/tool-registry.js` so Smithery and other directory scanners can enumerate the tool list without schema errors.
Commands run in the dedicated worktree at /Users/ganapolsky_i/workspace/git/igor/worktrees/thumbgate-immutable-buildsha:
npm ci
node --test tests/api-server.test.js tests/deployment.test.js tests/mcp-server.test.js
npm test
npm run test:coverage
tmp=$(mktemp -d) && THUMBGATE_PROOF_DIR="$tmp/proof" npm run prove:adapters
tmp=$(mktemp -d) && THUMBGATE_AUTOMATION_PROOF_DIR="$tmp/proof-automation" npm run prove:automation
npm run self-heal:check
git diff --check

Observed result:
- `npm ci` exited `0`.
- `node --test tests/api-server.test.js tests/deployment.test.js tests/mcp-server.test.js` exited `0`: `86` passed, `0` failed.
- `npm test` exited `0`.
- `npm run test:coverage` exited `0` with all-files coverage at `89.67` lines, `75.73` branches, and `93.14` functions.
- `THUMBGATE_PROOF_DIR=... npm run prove:adapters` exited `0`: `48` passed, `0` failed.
- `THUMBGATE_AUTOMATION_PROOF_DIR=... npm run prove:automation` exited `0`: `55` passed, `0` failed.
- `npm run self-heal:check` exited `0`: `Overall: HEALTHY` with `4/4` healthy checks.
- Local targeted proof confirmed the new behavior directly:
  - `/health` returned the stamped `buildSha` from the metadata file instead of reading a mutable env var.
  - admin-protected endpoints accepted `x-api-key` as an alternate auth header.
  - every public MCP tool entry exposed an `inputSchema` object, including `get_reliability_rules`.
- `git diff --check` exited `0`.
Requirements verified:
- Railway deploy proof can now compare `/health.buildSha` against the actual shipped revision instead of a mutable runtime variable that can drift across deploys.
- Smithery and other public MCP scanners now have a complete `inputSchema` for every exposed tool.
- Header-based API key clients can authenticate without having to reformat credentials into bearer-token syntax.
- The fix is low debt: no new dependencies, no duplicate health endpoint, and no product-runtime feature fork.
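
Accepting either header form, as described above, amounts to a small normalization step before the key is checked. The sketch below is one plausible shape for that step, not the code in `src/api/server.js`; the function name `extractApiKey` and the choice to prefer `x-api-key` when both headers are present are assumptions.

```javascript
// Accept either `x-api-key: <key>` or `Authorization: Bearer <key>`.
// Header names are lowercased, as Node's http module presents them.
function extractApiKey(headers) {
  const direct = headers['x-api-key'];
  if (typeof direct === 'string' && direct.trim() !== '') {
    return direct.trim();
  }
  const authorization = headers.authorization;
  if (typeof authorization === 'string') {
    const match = authorization.match(/^Bearer\s+(\S+)$/i);
    if (match) {
      return match[1];
    }
  }
  return null; // no usable credential found
}

console.log(extractApiKey({ 'x-api-key': 'YOUR_KEY' }));
console.log(extractApiKey({ authorization: 'Bearer YOUR_KEY' }));
console.log(extractApiKey({}));
```

Keeping the extraction separate from the key comparison means both header styles flow through the same validation path afterwards.
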
Scope:
- Added `scripts/revenue-status.js` and repointed `npm run revenue:status` to prefer hosted Railway-backed truth before falling back to the local CFO summary.
- Preserved the old local-only operator path as `npm run revenue:status:local` instead of deleting it, so the change removes a blind spot without breaking existing local workflows.
- Added targeted regression coverage in `tests/revenue-status.test.js` for GitHub variable parsing, public landing signal detection, hosted diagnosis, and the hosted-audit happy path.
- Set Railway production runtime vars `THUMBGATE_PUBLIC_APP_ORIGIN` and `THUMBGATE_BILLING_API_BASE_URL` explicitly to the canonical hosted origin so the deployed app no longer relies on implicit defaults.
- Verified the live public app, live telemetry ingress, live hosted billing summary, and the repo-standard verification suite from a dedicated clean worktree.
Commands run in the dedicated worktree at /Users/ganapolsky_i/workspace/git/igor/worktrees/thumbgate-analytics-revenue-audit:
npm ci
node --check scripts/revenue-status.js
node --test tests/revenue-status.test.js
npm test
npm run test:coverage
tmp=$(mktemp -d) && THUMBGATE_PROOF_DIR="$tmp/proof" npm run prove:adapters
tmp=$(mktemp -d) && THUMBGATE_AUTOMATION_PROOF_DIR="$tmp/proof-automation" npm run prove:automation
npm run self-heal:check
npm run revenue:status -- --json
railway variable set -s thumbgate -e production \
THUMBGATE_PUBLIC_APP_ORIGIN=https://thumbgate-production.up.railway.app \
THUMBGATE_BILLING_API_BASE_URL=https://thumbgate-production.up.railway.app --json
git diff --check

Observed result:
- `npm ci` exited `0`.
- `node --check scripts/revenue-status.js` exited `0`.
- `node --test tests/revenue-status.test.js` exited `0`: `4` passed, `0` failed.
- `npm test` exited `0`.
- `npm run test:coverage` exited `0` with all-files coverage at `89.37`, `75.65`, and `92.99` on the tool's aggregate line.
- `THUMBGATE_PROOF_DIR=... npm run prove:adapters` exited `0`: `48` passed, `0` failed.
- `THUMBGATE_AUTOMATION_PROOF_DIR=... npm run prove:automation` exited `0`: `55` passed, `0` failed.
- `npm run self-heal:check` exited `0`: `Overall: HEALTHY` with `4/4` healthy checks.
- `npm run revenue:status -- --json` exited `0` with `source: hosted-via-railway-env`.
- The live public probes inside `npm run revenue:status -- --json` reported:
  - `/health` returned `200` with `version: 0.7.4`.
  - `/health` deployment metadata matched the canonical hosted origin for both `appOrigin` and `billingApiBaseUrl`.
  - `/` returned `200` and exposed `plausibleScript: true`, `telemetryEndpoint: true`, and `workflowSprintIntake: true`.
  - `/` still exposed `gaEventHook: true` but `gaLoaderScript: false`, which matches the known GA runtime gap rather than a broken page.
  - `/v1/telemetry/ping` accepted the live probe and returned `204`.
- The hosted Railway-backed billing summary reported truthfully:
  - `today`: `13` visitors, `8` page views, `2` checkout starts, `2` unique leads, `2` paid orders, `$20.00` booked historical revenue, and `$0.00` booked today.
  - `30d`: `28` visitors, `19` page views, `9` checkout starts, `6` unique leads, `2` paid orders, and `$20.00` booked revenue.
  - `lifetime`: the same counts as `30d` at verification time.
  - `dataQuality.telemetryCoverage`, `dataQuality.attributionCoverage`, and `dataQuality.amountKnownCoverage` were all `1`.
  - `diagnosis.primaryIssue` was `operator_blind_spot_local_fallback`, not missing analytics or missing revenue data.
- Live runtime presence in Railway reported:
  - `THUMBGATE_FEEDBACK_DIR: true`
  - `THUMBGATE_API_KEY: true`
  - `THUMBGATE_PUBLIC_APP_ORIGIN: true`
  - `THUMBGATE_BILLING_API_BASE_URL: true`
  - `THUMBGATE_GA_MEASUREMENT_ID: false`
  - `THUMBGATE_CHECKOUT_FALLBACK_URL: true`
  - `STRIPE_SECRET_KEY: true`
- Live container inspection over Railway SSH confirmed durable analytics persistence under `/data/feedback`:
  - `telemetry-pings.jsonl` existed and contained `850` lines.
  - `funnel-events.jsonl` existed and contained `6` lines.
- `railway variable set ... --json` succeeded for `THUMBGATE_PUBLIC_APP_ORIGIN` and `THUMBGATE_BILLING_API_BASE_URL`, and Railway redeployed the production service with the explicit canonical values.
- `git diff --check` exited `0`.
Requirements verified:
- Production analytics and tracking are implemented and live; the earlier zeroed local output was a local operator fallback, not evidence that nobody uses the system.
- Production revenue evidence exists and is queryable from the hosted admin surface; the truthful statement for March 20, 2026 is still `$0.00` booked today and `$20.00` booked historically.
- The no-tech-debt path is in place: the repo now has a hosted-first operator audit command with no new dependencies, no duplicate billing logic, and a preserved local fallback.
- The remaining live gap is external configuration, not product logic: GA4 is still missing a Railway `THUMBGATE_GA_MEASUREMENT_ID`, so the page exposes GA hooks but does not load the GA script.
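The origin check reported by the live probes can be sketched as a small comparison. This is a minimal illustration, assuming the `/health` payload exposes `appOrigin` and `billingApiBaseUrl` as plain string fields; `checkDeploymentOrigins` is a hypothetical helper and not the code in `scripts/revenue-status.js`.

```javascript
// Hypothetical helper: the field names come from the probe output above,
// but the payload shape around them is an assumption.
function checkDeploymentOrigins(health, canonicalOrigin) {
  const fields = ['appOrigin', 'billingApiBaseUrl'];
  // Collect every field that does not match the canonical hosted origin.
  const mismatches = fields.filter((field) => health[field] !== canonicalOrigin);
  return { ok: mismatches.length === 0, mismatches };
}

const canonical = 'https://thumbgate-production.up.railway.app';
const originCheck = checkDeploymentOrigins(
  { appOrigin: canonical, billingApiBaseUrl: canonical },
  canonical
);
console.log(originCheck);
```

A check like this fails loudly when either variable silently falls back to an implicit default, which is the situation the explicit Railway variables are meant to prevent.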
Scope:
- Replaced the single-shot Railway health check in `.github/workflows/ci.yml` with bounded retry logic so transient cold-start `502` responses do not fail a healthy deployment.
- Replaced the same brittle single-shot health check in `.github/workflows/deploy-railway.yml` with the same bounded retry logic and response-body logging.
- Hardened post-deploy verification without changing the actual production app contract.
Commands run in the dedicated worktree at /Users/ganapolsky_i/workspace/git/igor/worktrees/thumbgate-fix-prod-analytics:
node --test tests/deployment.test.js tests/deploy-policy.test.js
npm test
npm run test:coverage
tmp=$(mktemp -d) && THUMBGATE_PROOF_DIR="$tmp/proof" npm run prove:adapters
tmp=$(mktemp -d) && THUMBGATE_AUTOMATION_PROOF_DIR="$tmp/proof-automation" npm run prove:automation
npm run self-heal:check
git diff --check
Observed result:
- `node --test tests/deployment.test.js tests/deploy-policy.test.js` exited `0`: `17` passed, `0` failed.
- `npm test` exited `0`.
- `npm run test:coverage` exited `0` with all-files coverage at `89.49%` lines, `75.89%` branches, and `93.11%` functions.
- `THUMBGATE_PROOF_DIR=... npm run prove:adapters` exited `0`: `48` passed, `0` failed.
- `THUMBGATE_AUTOMATION_PROOF_DIR=... npm run prove:automation` exited `0`: `55` passed, `0` failed.
- `npm run self-heal:check` exited `0`: `Overall: HEALTHY` with `4/4` healthy checks.
- `git diff --check` exited `0`.
- Root-cause proof from the failed post-merge deploy run on `main`:
  - Railway variable sync timed out once at `https://backboard.railway.com/graphql/v2`, then succeeded on rerun.
  - The remaining deploy failure was the health verifier receiving a transient `502` from `https://thumbgate-production.up.railway.app/health` after a single 30-second wait.
  - The production app still reported healthy via `/healthz` with durable feedback paths under `/data/feedback`.
Requirements verified:
- Deployment verification now retries through transient warmup responses instead of failing on a single `502`.
- The hardening is isolated to workflow verification logic; no product-runtime behavior changed.
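The bounded retry idea can be sketched in JavaScript as follows. The real logic lives in the GitHub Actions workflows and is not reproduced here; `verifyHealthWithRetry`, `probe`, and `sleep` are hypothetical names, and the attempt count and delay are assumed values.

```javascript
// Blocks the current thread for `ms` milliseconds (fine for a CI script).
function blockingSleep(ms) {
  Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
}

// Sketch of a bounded retry loop. `probe` returns { status, body } and is
// injected so the logic can be exercised without a network call.
function verifyHealthWithRetry(probe, options = {}) {
  const { maxAttempts = 5, delayMs = 1000, sleep = blockingSleep } = options;
  const attempts = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    const { status, body } = probe();
    attempts.push({ attempt, status, body }); // keep the body for logging
    if (status === 200) return { healthy: true, attempts };
    if (attempt < maxAttempts) sleep(delayMs); // no wait after the last try
  }
  return { healthy: false, attempts };
}

// Simulated cold start: two 502 responses, then a healthy 200.
const simulatedStatuses = [502, 502, 200];
const retryOutcome = verifyHealthWithRetry(
  () => ({ status: simulatedStatuses.shift(), body: '' }),
  { maxAttempts: 5, delayMs: 0, sleep: () => {} }
);
console.log(retryOutcome.healthy, retryOutcome.attempts.length);
```

Recording every attempt, including the response body, keeps the failure case debuggable when all attempts are exhausted.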
Scope:
- Hardened `scripts/intent-router.js` so the `strict_reviewer` partner profile deterministically front-loads evidence-producing actions (`construct_context_pack`, `context_provenance`) ahead of `evaluate_context_pack` when `verificationMode` is `evidence_first`.
- Added regression coverage in `tests/intent-router.test.js` for the evidence-first ordering contract without overconstraining the relative order between the two evidence producers.
- Re-ran the full required verification suite after GitHub CI exposed the probabilistic ordering bug.
Commands run in the dedicated worktree at /Users/ganapolsky_i/workspace/git/igor/worktrees/thumbgate-fix-prod-analytics:
node --test tests/intent-router.test.js
npm test
npm run test:coverage
tmp=$(mktemp -d) && THUMBGATE_PROOF_DIR="$tmp/proof" npm run prove:adapters
tmp=$(mktemp -d) && THUMBGATE_AUTOMATION_PROOF_DIR="$tmp/proof-automation" npm run prove:automation
npm run self-heal:check
git diff --check
Observed result:
- `node --test tests/intent-router.test.js` exited `0`: `21` passed, `0` failed.
- `npm test` exited `0`.
- `npm run test:coverage` exited `0` with all-files coverage at `89.49%` lines, `75.89%` branches, and `93.05%` functions.
- `THUMBGATE_PROOF_DIR=... npm run prove:adapters` exited `0`: `48` passed, `0` failed.
- `THUMBGATE_AUTOMATION_PROOF_DIR=... npm run prove:automation` exited `0`: `55` passed, `0` failed.
- `npm run self-heal:check` exited `0`: `Overall: HEALTHY` with `4/4` healthy checks.
- `git diff --check` exited `0`.
Requirements verified:
- The evidence-first reviewer strategy now matches its contract under repeated runs instead of relying on Thompson-sampling luck.
- The original PR failure mode is closed: `evaluate_context_pack` no longer outranks the evidence-producing actions for strict-reviewer incident plans.
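One way to make such an ordering deterministic is a stable partition applied after any sampled ranking. The sketch below is an illustration under that assumption, not the actual code in `scripts/intent-router.js`; `orderEvidenceFirst` is a hypothetical name, while the action names come from the scope notes above.

```javascript
const EVIDENCE_PRODUCERS = new Set(['construct_context_pack', 'context_provenance']);

// Stable partition: evidence producers keep their relative order and move
// ahead of every other action, so evaluate_context_pack cannot outrank them.
function orderEvidenceFirst(actions, verificationMode) {
  if (verificationMode !== 'evidence_first') return [...actions];
  const producers = actions.filter((name) => EVIDENCE_PRODUCERS.has(name));
  const rest = actions.filter((name) => !EVIDENCE_PRODUCERS.has(name));
  return [...producers, ...rest];
}

// A sampled ranking that happened to put the evaluator first.
const sampledPlan = ['evaluate_context_pack', 'context_provenance', 'construct_context_pack'];
const orderedPlan = orderEvidenceFirst(sampledPlan, 'evidence_first');
console.log(orderedPlan);
```

Because the partition is stable, the relative order of the two evidence producers is left to the upstream ranking, which matches the regression test's choice not to constrain it.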
Scope:
- Added a Railway-aware default in `scripts/feedback-loop.js` so hosted deployments automatically persist telemetry under `RAILWAY_VOLUME_MOUNT_PATH/feedback` when `THUMBGATE_FEEDBACK_DIR` is not explicitly set.
- Fixed `scripts/billing.js` so hosted Stripe checkout session creation omits `customer_email` unless a real email is present, instead of passing `null` and triggering live Stripe API failures.
- Added targeted regression coverage for the Railway volume fallback and the hosted checkout payload contract.
- Provisioned a real Railway production volume mounted at `/data`, set `THUMBGATE_FEEDBACK_DIR=/data/feedback`, and redeployed production so funnel and memory logs survive restarts.
- Verified the live hosted `/checkout/pro` route now creates a real Stripe Checkout Session redirect and that live attribution events persist to the durable telemetry ledger.
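The Railway-aware default can be sketched as a simple precedence chain. This is an illustration only: `resolveFeedbackDir` and the `.thumbgate` local default are assumptions, while the two environment variable names come from the scope notes above.

```javascript
const path = require('path');

// Precedence: explicit override, then Railway volume, then a local default.
function resolveFeedbackDir(env, localDefault = '.thumbgate') {
  if (env.THUMBGATE_FEEDBACK_DIR) return env.THUMBGATE_FEEDBACK_DIR;
  if (env.RAILWAY_VOLUME_MOUNT_PATH) {
    // Hosted containers are Linux, so join with POSIX separators.
    return path.posix.join(env.RAILWAY_VOLUME_MOUNT_PATH, 'feedback');
  }
  return localDefault;
}

console.log(resolveFeedbackDir({ RAILWAY_VOLUME_MOUNT_PATH: '/data' }));
console.log(resolveFeedbackDir({ THUMBGATE_FEEDBACK_DIR: '/custom', RAILWAY_VOLUME_MOUNT_PATH: '/data' }));
```

Keeping the explicit variable first means an operator can still redirect storage without unmounting the volume.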
Commands run in the dedicated worktree at /Users/ganapolsky_i/workspace/git/igor/worktrees/thumbgate-fix-prod-analytics:
npm ci
node --test tests/billing.test.js tests/feedback-loop.test.js
npm test
npm run test:coverage
tmp=$(mktemp -d) && THUMBGATE_PROOF_DIR="$tmp/proof" npm run prove:adapters
tmp=$(mktemp -d) && THUMBGATE_AUTOMATION_PROOF_DIR="$tmp/proof-automation" npm run prove:automation
npm run self-heal:check
git diff --check
railway volume add -m /data --json
railway variable set THUMBGATE_FEEDBACK_DIR=/data/feedback RAILWAY_RUN_UID=0 --json
railway up -d -m "fix(billing): omit null stripe customer_email and default Railway feedback volume"
python3 - <<'PY'
import json, urllib.request
req = urllib.request.Request(
    'https://thumbgate-production.up.railway.app/checkout/pro',
    headers={'User-Agent': 'codex'},
    method='GET'
)
opener = urllib.request.build_opener(urllib.request.HTTPRedirectHandler)
try:
    opener.open(req)
except urllib.error.HTTPError as exc:
    print(json.dumps({
        'status': exc.code,
        'location': exc.headers.get('Location')
    }, indent=2))
PY
railway run -- python3 - <<'PY'
import json, os, urllib.request
req = urllib.request.Request(
    'https://thumbgate-production.up.railway.app/v1/billing/summary',
    headers={
        'Authorization': f"Bearer {os.environ['THUMBGATE_API_KEY']}",
        'User-Agent': 'codex'
    }
)
with urllib.request.urlopen(req) as resp:
    data = json.load(resp)
print(json.dumps({
    'status': 'ok',
    'paidOrders': data['revenue']['paidOrders'],
    'bookedRevenueCents': data['revenue']['bookedRevenueCents'],
    'bookedRevenueTodayCents': data['revenue']['bookedRevenueTodayCents'],
    'paidOrdersToday': data['revenue']['paidOrdersToday'],
    'funnelTotalEvents': data['funnel']['totalEvents'],
    'acquisitionBySource': data['funnel']['acquisitionBySource'],
    'acquisitionByCampaign': data['funnel']['acquisitionByCampaign'],
    'acquisitionByCommunity': data['funnel']['acquisitionByCommunity'],
    'acquisitionByPostId': data['funnel']['acquisitionByPostId'],
    'acquisitionByCommentId': data['funnel']['acquisitionByCommentId'],
    'acquisitionByCampaignVariant': data['funnel']['acquisitionByCampaignVariant'],
    'acquisitionByOfferCode': data['funnel']['acquisitionByOfferCode']
}, indent=2))
PY
Observed result:
- `npm ci` exited `0`.
- `node --test tests/billing.test.js tests/feedback-loop.test.js` exited `0`: `32` passed, `0` failed.
- `npm test` exited `0`.
- `npm run test:coverage` exited `0` with all-files coverage at `89.46%` lines, `75.83%` branches, and `93.05%` functions.
- `THUMBGATE_PROOF_DIR=... npm run prove:adapters` exited `0`: `48` passed, `0` failed.
- `THUMBGATE_AUTOMATION_PROOF_DIR=... npm run prove:automation` exited `0`: `55` passed, `0` failed.
- `npm run self-heal:check` exited `0`: `Overall: HEALTHY` with `4/4` healthy checks.
- `git diff --check` exited `0`.
- Railway production volume `cd9d854e-4925-4c53-9b41-8f8840ebc889` was created and mounted at `/data`.
- Railway production variables now include:
  - `THUMBGATE_FEEDBACK_DIR=/data/feedback`
  - `RAILWAY_RUN_UID=0`
  - `RAILWAY_VOLUME_MOUNT_PATH=/data`
- Railway deployment `a8c3e0cb-9d0a-4018-8f1a-60984ea44929` succeeded for the exact code fix.
- Live `GET /healthz` reports:
  - `feedbackLogPath: /data/feedback/feedback-log.jsonl`
  - `memoryLogPath: /data/feedback/memory-log.jsonl`
- Live `GET /checkout/pro` now returns `302` with `Location: https://checkout.stripe.com/c/pay/cs_live_...`, proving hosted checkout session creation is working again instead of falling back after a Stripe API failure.
- Live billing summary after a fresh attributed visit reports:
  - `paidOrders: 2`
  - `bookedRevenueCents: 2000`
  - `bookedRevenueTodayCents: 0`
  - `paidOrdersToday: 0`
  - `funnelTotalEvents: 1`
  - `acquisitionBySource.ai_search: 1`
  - `acquisitionByCampaign.prod_checkout_fix: 1`
  - `acquisitionByCommunity.ClaudeCode: 1`
  - `acquisitionByPostId.prod-checkout-fix: 1`
  - `acquisitionByCommentId.proof-final: 1`
  - `acquisitionByCampaignVariant.durable_volume: 1`
  - `acquisitionByOfferCode.OPS-FINAL: 1`
Requirements verified:
- Production analytics are now durable across Railway restarts because telemetry writes to the mounted volume instead of ephemeral container storage.
- Hosted Stripe checkout no longer fails on missing buyer email; the backend now omits `customer_email` rather than sending `null`.
- Live attribution analytics are now persisted and queryable from the admin billing summary.
- The MCP has verified historical booked revenue (`$20.00`), but booked revenue for March 19, 2026 remains truthfully `0`; the fix restores the purchase path and analytics instead of fabricating a same-day sale.
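The checkout payload contract can be illustrated with a small builder that only adds `customer_email` when a non-empty email exists. This is a sketch under assumptions: `buildCheckoutSessionParams`, the `subscription` mode, and the sample price ID and URLs are hypothetical, and the function only builds a plain object without calling Stripe.

```javascript
// Hypothetical payload builder; only the customer_email rule is taken from
// the fix described above. Everything else is illustrative.
function buildCheckoutSessionParams({ priceId, successUrl, cancelUrl, email }) {
  const params = {
    mode: 'subscription', // assumption: Pro is a recurring plan
    line_items: [{ price: priceId, quantity: 1 }],
    success_url: successUrl,
    cancel_url: cancelUrl,
  };
  const trimmed = typeof email === 'string' ? email.trim() : '';
  // Omit the key entirely instead of sending null or an empty string.
  if (trimmed) params.customer_email = trimmed;
  return params;
}

const anonymousParams = buildCheckoutSessionParams({
  priceId: 'price_example',
  successUrl: 'https://example.com/success',
  cancelUrl: 'https://example.com/cancel',
  email: null,
});
console.log('customer_email' in anonymousParams);
```

Omitting the key, rather than setting it to `null`, is the behavior the regression coverage is described as protecting.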
Scope:
- Added live Stripe revenue reconciliation to `scripts/billing.js` so historical successful charges tied to the current product are included in the billing summary without fabricating same-day revenue.
- Switched the admin billing summary surfaces in `scripts/operational-summary.js` and `src/api/server.js` to the live reconciliation path.
- Replaced buyer-facing Gumroad links on active repo and runtime surfaces with the hosted `/checkout/pro` route, while changing the fallback checkout URL default to the direct Stripe payment link.
- Updated the live Railway production environment so `/checkout/pro` now falls back to Stripe instead of Gumroad.
- Deployed the exact worktree diff to Railway and verified the hosted billing summary now reports the reconciled Stripe revenue truth surface.
Commands run in the dedicated worktree at /Users/ganapolsky_i/workspace/git/igor/thumbgate-revenue-proof:
node --test tests/billing.test.js tests/api-server.test.js tests/cli.test.js tests/version-metadata.test.js tests/recall-limit.test.js tests/public-landing.test.js
npm test
npm run test:coverage
npm run prove:adapters
npm run prove:automation
npm run self-heal:check
git diff --check
railway variable set THUMBGATE_CHECKOUT_FALLBACK_URL=https://buy.stripe.com/aFa4gz1M84r419v7mb3sI05
railway up -d -m "revenue proof analytics + stripe checkout fallback"
railway run node - <<'NODE'
const https = require('https');
const options = {
  hostname: 'thumbgate-production.up.railway.app',
  path: '/v1/billing/summary',
  headers: {
    authorization: `Bearer ${process.env.THUMBGATE_API_KEY}`,
    'user-agent': 'codex'
  }
};
const req = https.request(options, (res) => {
  let body = '';
  res.on('data', (chunk) => body += chunk);
  res.on('end', () => {
    console.log(JSON.stringify({ statusCode: res.statusCode, body: body ? JSON.parse(body) : null }, null, 2));
  });
});
req.on('error', (err) => { console.error(err); process.exit(1); });
req.end();
NODE
Observed result:
- Targeted monetization/runtime regression pack exited `0`: `116` passed, `0` failed.
- `npm test` exited `0` in the dedicated worktree after the final checkout and analytics edits.
- `npm run test:coverage` exited `0` with all-files coverage at `89.27%` lines, `75.79%` branches, and `93.01%` functions.
- `npm run prove:adapters` exited `0`: `48` passed, `0` failed.
- `npm run prove:automation` exited `0`: `55` passed, `0` failed.
- `npm run self-heal:check` exited `0`: `Overall: HEALTHY` with `4/4` healthy checks.
- `git diff --check` exited `0`.
- Railway env-only redeploy `32717506-102b-4316-88d6-eddb6fdf7150` succeeded after setting `THUMBGATE_CHECKOUT_FALLBACK_URL` to the Stripe payment link.
- Production `GET /checkout/pro` now returns `302` to `https://buy.stripe.com/aFa4gz1M84r419v7mb3sI05...` instead of the old Gumroad URL.
- Railway code deployment `a5fbff33-c410-46bf-b795-ced4163495ac` succeeded for the exact worktree diff.
- The live admin billing summary now returns `200` and reports:
  - `paidOrders: 2`
  - `bookedRevenueCents: 2000`
  - `bookedRevenueTodayCents: 0`
  - `paidOrdersToday: 0`
  - `processorReconciledOrders: 2`
  - `processorReconciledRevenueCents: 2000`
  - `coverage.providerCoverage.stripe: booked_revenue+processor_reconciled`
Requirements verified:
- Historical product revenue is now proven through live Stripe reconciliation instead of being hidden behind a false-zero billing summary.
- The production checkout fallback no longer leaks buyers to Gumroad; the hosted `/checkout/pro` route now falls back to Stripe.
- The repo truth surface now matches live production: the MCP has made money historically, but it is not making booked money on March 19, 2026.
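The split between historical and same-day revenue can be sketched as a fold over charge records. This is a simplified illustration, not the reconciliation in `scripts/billing.js`: `summarizeCharges` is a hypothetical name, the sample charges are invented, and "today" is assumed to mean the UTC calendar day. The `status`, `amount`, and `created` fields follow the shape of Stripe charge objects.

```javascript
// Summarize succeeded charges, counting "today" separately from history.
function summarizeCharges(charges, now = new Date()) {
  const today = now.toISOString().slice(0, 10); // UTC calendar day
  const summary = {
    paidOrders: 0,
    bookedRevenueCents: 0,
    paidOrdersToday: 0,
    bookedRevenueTodayCents: 0,
  };
  for (const charge of charges) {
    if (charge.status !== 'succeeded') continue; // failed charges never count
    summary.paidOrders += 1;
    summary.bookedRevenueCents += charge.amount;
    const day = new Date(charge.created * 1000).toISOString().slice(0, 10);
    if (day === today) {
      summary.paidOrdersToday += 1;
      summary.bookedRevenueTodayCents += charge.amount;
    }
  }
  return summary;
}

// Invented sample data: two older successful charges and one failed attempt.
const sampleCharges = [
  { status: 'succeeded', amount: 1000, created: Date.UTC(2026, 1, 1) / 1000 },
  { status: 'succeeded', amount: 1000, created: Date.UTC(2026, 1, 15) / 1000 },
  { status: 'failed', amount: 1000, created: Date.UTC(2026, 2, 19) / 1000 },
];
const chargeSummary = summarizeCharges(sampleCharges, new Date(Date.UTC(2026, 2, 19, 12)));
console.log(chargeSummary);
```

Keeping the same-day counters separate is what lets a summary report historical revenue without implying a sale on the verification date.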
Scope:
- Added `scripts/internal-agent-bootstrap.js` as a real runtime bootstrap module for internal coding-agent threads, not a marketing-only stub.
- Added a first-class `bootstrap_internal_agent` surface across the MCP tool registry, API server, canonical OpenAPI spec, and Gemini function declarations.
- Added worktree-backed sandbox preparation so bootstrap can create or reuse an isolated git worktree lane for execution.
- Added reviewer-lane planning in the bootstrap result so coding workflows can expose an optional evaluator/reviewer path without making multi-agent orchestration the default.
- Replaced dead placeholder MCP behavior for the tested tool surfaces in `adapters/mcp/server-stdio.js` with real dispatch and payload handling.
- Hardened stdio transport behavior so framed and newline-delimited MCP initialization both work, while malformed ndjson still returns the legacy ndjson error envelope expected by the CLI contract.
- Preserved recall-limit commercial behavior while keeping real recall output and codegraph evidence intact after the adapter rewrite.
Commands run in the dedicated worktree at /Users/ganapolsky_i/workspace/git/igor/thumbgate-open-swe-plan:
npm ci
node --test tests/internal-agent-bootstrap.test.js tests/mcp-server.test.js tests/openapi-parity.test.js tests/prove-adapters.test.js tests/cli.test.js tests/recall-limit.test.js
npm test
npm run test:coverage
env THUMBGATE_PROOF_DIR='/var/folders/yw/2qhx3yzj0psf87rdxh8lqlmm0000gp/T/tmp.1Y2VqtnO1F/proof' npm run prove:adapters
env THUMBGATE_AUTOMATION_PROOF_DIR='/var/folders/yw/2qhx3yzj0psf87rdxh8lqlmm0000gp/T/tmp.2gfdnB5cPh/proof-automation' npm run prove:automation
npm run self-heal:check
git status --short
git diff --stat
Observed result:
- `npm ci` exited `0`; `150` packages installed, `151` audited, `0` vulnerabilities. Log: `/var/folders/yw/2qhx3yzj0psf87rdxh8lqlmm0000gp/T/tmp.xKAQXDEA64/npm-ci.log`
- Focused Open SWE regression pack exited `0`: `111` passed, `0` failed. Log: `/var/folders/yw/2qhx3yzj0psf87rdxh8lqlmm0000gp/T/tmp.QAW02kTBU1/open-swe-regression.log`
- `npm test` exited `0` end-to-end in the clean worktree. Log: `/var/folders/yw/2qhx3yzj0psf87rdxh8lqlmm0000gp/T/tmp.gkSlebXonX/npm-test.log`
- `npm run test:coverage` exited `0` with all-files coverage at `86.97%` lines, `75.15%` branches, and `92.49%` functions. Log: `/var/folders/yw/2qhx3yzj0psf87rdxh8lqlmm0000gp/T/tmp.omsAOIgi5P/test-coverage.log`
- `env THUMBGATE_PROOF_DIR=... npm run prove:adapters` exited `0`: `48` passed, `0` failed. Log: `/var/folders/yw/2qhx3yzj0psf87rdxh8lqlmm0000gp/T/tmp.1Y2VqtnO1F/prove-adapters.log`
- `env THUMBGATE_AUTOMATION_PROOF_DIR=... npm run prove:automation` exited `0`: `55` passed, `0` failed. Log: `/var/folders/yw/2qhx3yzj0psf87rdxh8lqlmm0000gp/T/tmp.2gfdnB5cPh/prove-automation.log`
- `npm run self-heal:check` exited `0`: `Overall: HEALTHY` with `4/4` healthy checks. Log: `/var/folders/yw/2qhx3yzj0psf87rdxh8lqlmm0000gp/T/tmp.gqN6I7Gfkp/self-heal-check.log`
- Current tracked diff after the implementation:
  - `16` changed paths in the worktree
  - `14` tracked files changed with `1050` insertions and `20` deletions
  - `2` new files: `scripts/internal-agent-bootstrap.js` and `tests/internal-agent-bootstrap.test.js`
- Post-sync verification after merging `origin/main` into `codex/open-swe-adoption-plan` for PR `#258` also passed:
  - `npm ci` exited `0`. Log: `/var/folders/yw/2qhx3yzj0psf87rdxh8lqlmm0000gp/T/tmp.l6cgkKKJVv/npm-ci-merge.log`
  - `npm test` exited `0`. Log: `/var/folders/yw/2qhx3yzj0psf87rdxh8lqlmm0000gp/T/tmp.RRXVoXMSMg/npm-test-merge.log`
  - `npm run test:coverage` exited `0` with all-files coverage at `89.57%` lines, `75.72%` branches, and `93.10%` functions. Log: `/var/folders/yw/2qhx3yzj0psf87rdxh8lqlmm0000gp/T/tmp.S5vGznS7oW/test-coverage-merge.log`
  - `env THUMBGATE_PROOF_DIR=... npm run prove:adapters` exited `0`: `48` passed, `0` failed. Log: `/var/folders/yw/2qhx3yzj0psf87rdxh8lqlmm0000gp/T/tmp.0QvYuLywFK/prove-adapters-merge.log`
  - `env THUMBGATE_AUTOMATION_PROOF_DIR=... npm run prove:automation` exited `0`: `55` passed, `0` failed. Log: `/var/folders/yw/2qhx3yzj0psf87rdxh8lqlmm0000gp/T/tmp.LIxnkdcu80/prove-automation-merge.log`
  - `npm run self-heal:check` exited `0`: `Overall: HEALTHY` with `4/4` healthy checks. Log: `/var/folders/yw/2qhx3yzj0psf87rdxh8lqlmm0000gp/T/tmp.Xr81UMZGaf/self-heal-check-merge.log`
Requirements verified:
- `bootstrap_internal_agent` is now reachable through the API route, MCP tool surface, canonical OpenAPI document, ChatGPT adapter OpenAPI mirror, and Gemini declarations.
- Bootstrap can normalize GitHub/Slack/Linear-style invocations, build startup context, create or reuse a git worktree sandbox, and emit a reviewer-lane recommendation for coding tasks.
- The MCP stdio server still accepts both `Content-Length` framed requests and newline-delimited JSON requests after the adapter rewrite.
- Malformed ndjson input still returns the expected ndjson error envelope, which keeps `tests/cli.test.js` green instead of silently changing the transport contract.
- Recall still returns actual results, includes codegraph evidence, and appends the post-limit upgrade nudge after five calls.
- Adapter proof coverage increased to `48` passing checks because the new bootstrap surface is exercised by both `api.internal_agent.bootstrap` and `mcp.tools.call.bootstrap_internal_agent`.
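Accepting both framings on one stream can be sketched as a parser that checks for a `Content-Length` header first and otherwise treats input as newline-delimited JSON. This is a minimal illustration, not the code in `adapters/mcp/server-stdio.js`: `parseStdioChunk` is a hypothetical name, and the error-envelope handling for malformed ndjson is deliberately left out (a malformed line simply throws here).

```javascript
// Parses as many complete messages as the chunk contains and returns the
// unconsumed remainder so the caller can prepend it to the next chunk.
function parseStdioChunk(input) {
  const messages = [];
  let rest = input;
  while (rest.length > 0) {
    const header = /^Content-Length: (\d+)\r\n\r\n/i.exec(rest);
    if (header) {
      const length = Number(header[1]); // Content-Length counts bytes
      const body = Buffer.from(rest.slice(header[0].length), 'utf8');
      if (body.length < length) break; // framed body not complete yet
      messages.push(JSON.parse(body.subarray(0, length).toString('utf8')));
      rest = body.subarray(length).toString('utf8');
      continue;
    }
    if (/^content-length:/i.test(rest)) break; // header not complete yet
    const newline = rest.indexOf('\n');
    if (newline === -1) break; // ndjson line not complete yet
    const line = rest.slice(0, newline).trim();
    rest = rest.slice(newline + 1);
    if (line) messages.push(JSON.parse(line));
  }
  return { messages, rest };
}

const framedRequest = 'Content-Length: 8\r\n\r\n{"id":1}';
const ndjsonRequest = '{"id":2}\n';
const parsedChunk = parseStdioChunk(framedRequest + ndjsonRequest);
console.log(parsedChunk.messages);
```

Returning the remainder instead of buffering internally keeps the function pure, which makes the partial-input cases easy to test.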
Scope:
- Added `scripts/workflow-runs.js` as a dedicated local ledger for proof-backed workflow runs, reviewed runs, paid team runs, and named pilot agreements.
- Added a first-class `north-star` CLI command and a `🎯 North Star` section in the dashboard so the repo now reports the stated product metric directly instead of only adjacent revenue and telemetry proxies.
- Added `scripts/verify-run.js` so `npm run verify:full` records a proof-backed workflow run after the full suite passes.
- Fixed billing and dashboard truth surfaces to use the active feedback directory discovery logic instead of being split between `.thumbgate/` and legacy `.claude/memory/feedback/` defaults.
- Added safe reconciliation logic for historical paid-provider events so legacy paid funnel events become honest `paidOrders` without fabricating booked revenue.
- Wired hosted deployment examples and secret sync flows for durable runtime feedback storage and optional analytics/search-console variables: `THUMBGATE_FEEDBACK_DIR`, `THUMBGATE_GA_MEASUREMENT_ID`, and `THUMBGATE_GOOGLE_SITE_VERIFICATION`.
- Added regression coverage for workflow-run persistence, North Star CLI output, dashboard reporting, direct local telemetry persistence, and billing reconciliation when the revenue ledger is absent.
Commands run in the dedicated worktree at /Users/ganapolsky_i/workspace/git/igor/thumbgate-northstar-20260318-135757:
npm ci
node --test tests/workflow-runs.test.js tests/billing.test.js tests/dashboard.test.js tests/cli.test.js
node --test tests/billing.test.js tests/dashboard.test.js
npm run verify:full
node bin/cli.js north-star
node bin/cli.js dashboard
env _TEST_FUNNEL_LEDGER_PATH='/Users/ganapolsky_i/workspace/git/igor/thumbgate/.claude/memory/feedback/funnel-events.jsonl' \
_TEST_REVENUE_LEDGER_PATH='/tmp/thumbgate-empty-revenue-events.jsonl' \
_TEST_API_KEYS_PATH='/tmp/thumbgate-empty-api-keys.jsonl' \
node -e "const { getBillingSummary } = require('./scripts/billing'); const summary = getBillingSummary(); console.log(JSON.stringify({ paidProviderEvents: summary.revenue.paidProviderEvents, paidOrders: summary.revenue.paidOrders, bookedRevenueCents: summary.revenue.bookedRevenueCents, derivedPaidOrders: summary.revenue.derivedPaidOrders, unreconciledPaidEvents: summary.revenue.unreconciledPaidEvents }, null, 2));"
Observed result:
- `npm ci` completed with `0` vulnerabilities.
- `node --test tests/workflow-runs.test.js tests/billing.test.js tests/dashboard.test.js tests/cli.test.js` completed successfully for the workflow-run, billing, dashboard, and CLI ledger surface.
- `node --test tests/billing.test.js tests/dashboard.test.js`: `27` passed, `0` failed.
- `npm run verify:full` exited `0` and completed the standard full suite:
  - `npm test`
  - `npm run test:coverage`
  - `npm run prove:adapters`
  - `npm run prove:automation`
  - `npm run self-heal:check`
- `npm run test:coverage` passed with all-files coverage at `89.79%` lines, `76.04%` branches, and `93.43%` functions.
- `npm run prove:adapters`: `46` passed, `0` failed.
- `npm run prove:automation`: `55` passed, `0` failed.
- `npm run self-heal:check`: `Overall: HEALTHY` with `4/4` healthy checks.
- `node bin/cli.js north-star` now reports the explicit product metric:
  - `Weekly proof-backed workflow runs : 1`
  - `Weekly teams on proof-backed runs : 1`
  - `Reviewed workflow runs : 1`
  - `Paid orders : 4`
  - `Booked revenue : $0.00`
  - `North Star status : tracking`
- `node bin/cli.js dashboard` now includes the dedicated `🎯 North Star` section and reports:
  - `Weekly Proof Runs: 1`
  - `Weekly Teams : 1`
  - `Reviewed Runs : 1`
  - `Paid Team Runs : 0`
  - `Named Pilots : 0`
  - `Status : tracking`
  - `Customer Proof : missing`
- Historical revenue-truth proof against the real legacy funnel ledger now reconciles paid-stage events correctly without claiming revenue that is not provable:
  - `paidProviderEvents: 23`
  - `paidOrders: 23`
  - `bookedRevenueCents: 0`
  - `derivedPaidOrders: 23`
  - `unreconciledPaidEvents: 0`
Requirements verified:
- The repo now tracks its documented North Star directly: weekly active proof-backed workflow runs.
- Full-suite verification automatically writes a proof-backed workflow-run record after successful completion.
- Dashboard and CLI truth surfaces agree on the same North Star state instead of only showing indirect commercial or telemetry proxies.
- Historical paid-provider events are no longer stranded as unreconciled paid-stage funnel noise when the revenue ledger is missing.
- Hosted deployment tooling now has first-class support for durable runtime feedback storage and optional GA/Search Console wiring without introducing tracked runtime state.
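A weekly North Star rollup over a run ledger can be sketched as a filter plus a distinct count. This is an illustration only: `summarizeWeeklyRuns` and the record fields `timestamp`, `teamId`, `proofBacked`, and `reviewed` are assumptions, not the schema used by `scripts/workflow-runs.js`.

```javascript
const WEEK_MS = 7 * 24 * 60 * 60 * 1000;

// Counts proof-backed runs from the trailing seven days and the distinct
// teams behind them. All field names are hypothetical.
function summarizeWeeklyRuns(runs, now = Date.now()) {
  const recent = runs.filter((run) => {
    const age = now - Date.parse(run.timestamp);
    return run.proofBacked && age >= 0 && age < WEEK_MS;
  });
  return {
    weeklyProofBackedRuns: recent.length,
    weeklyTeams: new Set(recent.map((run) => run.teamId)).size,
    reviewedRuns: recent.filter((run) => run.reviewed).length,
  };
}

const sampleRuns = [
  { timestamp: '2026-03-17T10:00:00Z', teamId: 'core', proofBacked: true, reviewed: true },
  { timestamp: '2026-03-01T10:00:00Z', teamId: 'core', proofBacked: true, reviewed: false },
];
const weeklySummary = summarizeWeeklyRuns(sampleRuns, Date.parse('2026-03-18T12:00:00Z'));
console.log(weeklySummary);
```

Deriving the CLI and dashboard output from one function like this is one way to keep the two surfaces in agreement.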
Scope:
- Replaced the public Workflow Hardening Sprint `mailto:` dependency with a hosted sprint-intake form on the landing page, including structured CTA tracking and success/failure handling.
- Added `scripts/workflow-sprint-intake.js` as the single owner for sprint-intake lead capture, writing contactable workflow leads to the active local feedback runtime as `workflow-sprint-leads.jsonl`.
- Added `POST /v1/intake/workflow-sprint` to the hosted API and wired the landing form to it.
- Strengthened public machine-readable positioning with `Organization`, `SoftwareApplication`, `BuyAction`, and `CommunicateAction` schema on the public landing page.
- Routed active outreach and social assets to the hosted sprint-intake path instead of stale email-first or legacy-growth messaging.
- Integrated workflow-sprint lead counts into the admin billing/CFO summary so pipeline capture is visible in the same truth surface as booked revenue, while explicitly keeping leads separate from revenue claims.
- Corrected operator scripts so `pulse.js` and `money-watcher.js` key off booked revenue and paid orders instead of unreconciled paid-stage funnel events.
- Hardened `tests/delegation-runtime.test.js` temp-dir cleanup so clean-worktree coverage runs no longer fail with transient `ENOTEMPTY` teardown errors.
Commands run in the dedicated clean verification worktree at /tmp/thumbgate-verify-first-dollar-20260317 on exact branch head ba83de2:
npm ci
npm test
npm run test:coverage
env THUMBGATE_PROOF_DIR=/tmp/thumbgate-verify-first-dollar-ba83de2/proof-adapters npm run prove:adapters
env THUMBGATE_AUTOMATION_PROOF_DIR=/tmp/thumbgate-verify-first-dollar-ba83de2/proof-automation npm run prove:automation
npm run self-heal:check
git status --short
Additional targeted GTM/commercial regressions run in the implementation worktree:
node --test tests/public-landing.test.js tests/api-server.test.js tests/workflow-hardening-sprint.test.js tests/social-marketing-assets.test.js tests/version-metadata.test.js tests/commercial-signals.test.js tests/billing.test.js tests/cli.test.js
Observed result:
- `npm ci` completed with `0` vulnerabilities.
- `npm test` passed end-to-end on exact branch head `ba83de2`.
- `npm run test:coverage` passed with `1108` passed, `0` failed, `1` skipped.
- All-files coverage on the verified tree: `90.18%` lines, `76.29%` branches, `93.55%` functions.
- `env THUMBGATE_PROOF_DIR=/tmp/thumbgate-verify-first-dollar-ba83de2/proof-adapters npm run prove:adapters`: `46` passed, `0` failed.
- `env THUMBGATE_AUTOMATION_PROOF_DIR=/tmp/thumbgate-verify-first-dollar-ba83de2/proof-automation npm run prove:automation`: `55` passed, `0` failed.
- `npm run self-heal:check`: `Overall: HEALTHY` with `4/4` healthy checks.
- `git status --short` remained empty after the full clean-worktree suite.
- Targeted GTM/commercial regression pack passed with `98` tests passed, `0` failed.
Requirements verified:
- The public sprint offer now has a direct hosted intake path for qualified workflow demand instead of forcing an email handoff.
- Sprint-intake leads are captured as structured local runtime records and exposed in the admin billing/CFO summary without being misrepresented as revenue.
- Public positioning, outreach assets, billing truth surfaces, and operator scripts now agree on the same commercial story: Workflow Hardening Sprint for pipeline, Pro for self-serve $19/mo access.
- Clean-worktree verification is stable again after hardening the delegation test teardown.
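Structured lead capture to a JSONL file can be sketched as a validate-then-serialize step. This is an illustration under assumptions: `toLeadRecord`, `toJsonlLine`, and the record fields are hypothetical, and the email pattern is a deliberately loose sanity check rather than the validation in `scripts/workflow-sprint-intake.js`.

```javascript
// Normalizes one intake submission into a lead record, rejecting entries
// that cannot be contacted. Field names are hypothetical.
function toLeadRecord(input, now = new Date()) {
  const email = String(input.email || '').trim().toLowerCase();
  if (!/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email)) {
    throw new Error('A contactable email is required');
  }
  return {
    kind: 'workflow_sprint_lead', // a lead, never a revenue event
    email,
    workflow: String(input.workflow || '').trim(),
    capturedAt: now.toISOString(),
  };
}

// One JSON object per line, so the ledger can be appended to safely.
const toJsonlLine = (record) => `${JSON.stringify(record)}\n`;

const sampleLead = toLeadRecord({ email: ' Ops@Example.com ', workflow: 'deploy review' }, new Date(0));
process.stdout.write(toJsonlLine(sampleLead));
```

In a real handler the line would be appended with something like `fs.appendFileSync`; tagging each record with an explicit `kind` is one way to keep leads from being counted as revenue downstream.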
Scope:
- Fixed the merged Databricks analytics export so its default output root now uses `getFeedbackPaths()` instead of a legacy `.claude` fallback, keeping implicit bundle writes inside the same safe data boundary used by the API and MCP adapters.
- Normalized Databricks bundle-relative paths to POSIX separators before embedding them in `manifest.json` and `load_databricks.sql`, preventing Windows-hosted exports from generating backslash-separated paths that Databricks SQL cannot read.
- Added regression coverage for:
  - default export-path selection when `.thumbgate/` is present
  - API default export path behavior
  - MCP default export path behavior
  - bundle-relative path normalization
Commands run in the dedicated worktree at /Users/ganapolsky_i/workspace/git/igor/thumbgate-databricks-followup:
npm ci
node --test tests/databricks-export.test.js tests/api-server.test.js tests/mcp-server.test.js
npm test
npm run test:coverage
env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:adapters
env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:automation
npm run self-heal:check
Observed result:
- `npm ci` completed with `0` vulnerabilities.
- Targeted Databricks regressions passed: `51` tests passed, `0` failed.
- `npm test` passed end-to-end on the follow-up branch after the post-merge fixes were applied.
- `npm run test:coverage` passed with `1041` tests, `1040` passed, `0` failed, `1` skipped.
- All-files coverage on the follow-up branch: `83.47%` lines, `69.70%` branches, `86.40%` functions.
- `env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:adapters`: `46` passed, `0` failed.
- `env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:automation`: `47` passed, `0` failed.
- `npm run self-heal:check`: `Overall: HEALTHY` with `4/4` healthy checks.
Requirements verified:
- The Databricks export no longer escapes the safe feedback root when no explicit `outputPath` is provided.
- The Databricks SQL bootstrap always uses forward-slash bundle-relative paths, including on Windows-originated exports.
- API and MCP default exports now inherit the same root-selection behavior as the shared ThumbGate feedback pipeline.
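The separator normalization can be sketched with Node's `path` module. This is an illustration, not the export script's code: `toBundleRelativePosix` is a hypothetical helper, and `path.win32` is passed in only so the Windows case can be demonstrated on any platform.

```javascript
const path = require('path');

// Returns filePath relative to bundleRoot using forward slashes, whatever
// separator the host platform's path module produces.
function toBundleRelativePosix(bundleRoot, filePath, pathModule = path) {
  const relative = pathModule.relative(bundleRoot, filePath);
  return relative.split(pathModule.sep).join('/');
}

const windowsRelative = toBundleRelativePosix(
  'C:\\exports\\bundle',
  'C:\\exports\\bundle\\tables\\feedback_events.jsonl',
  path.win32
);
console.log(windowsRelative); // tables/feedback_events.jsonl
```

Normalizing at the point where paths are embedded in `manifest.json` and `load_databricks.sql` keeps filesystem access on the host's native separators while the bundle stays portable.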
Scope:
- Added `scripts/export-databricks-bundle.js` to export the local ThumbGate control plane into a Databricks-ready analytics bundle instead of coupling the runtime system to an external warehouse.
- Export now emits `feedback_events.jsonl`, `memory_records.jsonl`, `feedback_sequences.jsonl`, `feedback_attributions.jsonl`, `proof_reports.jsonl`, `manifest.json`, and a bootstrap `load_databricks.sql` template with catalog/schema placeholders.
- Added the bundle export to every primary surface:
  - CLI: `npx thumbgate export-databricks`
  - HTTP API: `POST /v1/analytics/databricks/export`
  - MCP: `export_databricks_bundle`
- Updated policy and adapter metadata so intent planning, OpenAPI parity, and Gemini function declarations expose the new analytics-plane export consistently.
- Kept the smart-learning review fix on the same branch and verified it still passes after the Databricks export surface was added.
Commands run in the dedicated worktree at /Users/ganapolsky_i/workspace/git/igor/thumbgate-smart-learning-fix:
npm test
npm run test:coverage
env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:adapters
env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:automation
npm run self-heal:check
Observed result:
- Targeted Databricks/API/MCP/OpenAPI/CLI regressions passed: `101` tests passed, `0` failed.
- `npm test` passed end-to-end on the worktree after the analytics export surface and smart-learning fix were combined.
- `npm run test:coverage` passed with `1024` tests, `1023` passed, `0` failed, `1` skipped.
- All-files coverage on the verified tree: `83.44%` lines, `69.92%` branches, `86.33%` functions.
- `env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:adapters`: `46` passed, `0` failed.
- `env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:automation`: `43` passed, `0` failed.
- `npm run self-heal:check`: `Overall: HEALTHY` with `4/4` healthy checks.
Requirements verified:
- The Databricks export is reachable and consistent across CLI, HTTP API, MCP, ChatGPT OpenAPI, and Gemini declarations.
- The bundle contains local ThumbGate memory, attribution, sequence, and proof-report tables without mutating the control-plane storage model.
- The generated SQL bootstrap keeps external warehouse details parameterized rather than hard-coding catalog/schema paths into the product.
- Codegraph-aware intent planning, recall, and proof flows still pass after the analytics export path was introduced.
Scope:
- Added `scripts/failure-diagnostics.js` with a narrow failure taxonomy for `invalid_invocation`, `tool_output_misread`, `intent_plan_misalignment`, `guardrail_triggered`, and `system_failure`.
- Compiled diagnosis constraints from workflow contract rules, gate policies, session constraints, approval checkpoints, and MCP tool schemas.
- Added the `diagnose_failure` MCP tool and made it profile-aware so locked/read-only profiles diagnose disallowed tool calls correctly instead of pretending the full tool catalog is available.
- Threaded diagnoses into the verification loop, self-healing health checks, dashboard aggregation, analytics, and prevention-rule generation through a shared `diagnostic-log.jsonl` path.
- Removed false-positive fallback diagnoses so vague or unsupported negative signals no longer inflate root-cause metrics.
- Updated `README.md` so the MCP tool inventory and profile counts match the shipped product surface.
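
As a rough illustration of the taxonomy and the "no fabricated fallback" rule described in this scope, the sketch below classifies a failure signal only when concrete evidence is present. Only the five category names come from this document; `diagnoseFailure`, the signal fields (`toolName`, `guardrailId`, `exitCode`), and the mapping rules are hypothetical and do not describe the actual `scripts/failure-diagnostics.js`.

```javascript
// The five categories are taken from the scope list above.
const FAILURE_CATEGORIES = Object.freeze([
  'invalid_invocation',
  'tool_output_misread',
  'intent_plan_misalignment',
  'guardrail_triggered',
  'system_failure',
]);

// Hypothetical classifier: field names and rules are assumptions for illustration.
function diagnoseFailure(signal, allowedTools) {
  // A call to a tool outside the active profile is a policy-backed invalid invocation.
  if (signal.toolName && !allowedTools.includes(signal.toolName)) {
    return {
      category: 'invalid_invocation',
      evidence: `tool "${signal.toolName}" is not in the active profile`,
    };
  }
  if (signal.guardrailId) {
    return { category: 'guardrail_triggered', evidence: `guardrail ${signal.guardrailId}` };
  }
  if (signal.exitCode !== undefined && signal.exitCode !== 0) {
    return { category: 'system_failure', evidence: `exit code ${signal.exitCode}` };
  }
  // No concrete evidence: return null instead of guessing a category.
  return null;
}
```

Returning `null` for vague signals is one way to keep unsupported negative feedback out of root-cause counts.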
Commands run in the dedicated worktree at /Users/ganapolsky_i/workspace/git/igor/thumbgate/.claude/worktrees/agent-agentrx:
npm ci
npm test
npm run test:coverage
npm run prove:adapters
npm run prove:automation
npm run self-heal:check

Observed result:
- `npm ci` completed with `0` vulnerabilities.
- `npm test` passed end-to-end on the post-fix tree after the review-found diagnostic gaps were closed.
- `npm run test:coverage` passed with `1018` tests, `1017` passed, `0` failed, `1` skipped.
- All-files coverage on the post-fix tree: `83.43%` lines, `69.93%` branches, `86.36%` functions.
- `npm run prove:adapters`: `46` passed, `0` failed.
- `npm run prove:automation`: `43` passed, `0` failed.
- `npm run self-heal:check`: `Overall: HEALTHY` with `4/4` healthy checks.
Evidence artifacts verified:
- `proof/compatibility/report.json`
- `proof/compatibility/report.md`
- `proof/automation/report.json`
- `proof/automation/report.md`
Requirements verified:
- `diagnose_failure` no longer fabricates `tool_output_misread` for vague or unclassified failures with no real evidence.
- `diagnose_failure` now respects MCP profile allowlists and emits policy-backed invalid-invocation diagnoses for disallowed tools.
- Failed verification runs persist diagnoses into the shared analytics path instead of dying inside transient return payloads.
- `self-heal:check` persists unhealthy-check diagnoses into the same shared analytics path when run via CLI.
- Dashboard and prevention-rule outputs now include persisted verification and self-heal diagnoses, not only diagnoses attached during feedback capture.
- The README tool inventory now matches the shipped MCP surface: essential profile remains `5` tools, full profile is `12` tools including `diagnose_failure`.
Scope:
- Removed accidental tracked `.claude/worktrees/agent-*` gitlinks from the repository index so disposable worktree lanes stop polluting `main`.
- Removed tracked live `.thumbgate/*` runtime artifacts from version control and aligned `.gitignore` with the repo policy that ThumbGate memory/state is local operational data.
- Persisted the runtime-state hygiene rule in `AGENTS.md`, `CLAUDE.md`, and `GEMINI.md`.
- Archived unique orphan branches before deletion and removed clean redundant worktrees/branches with no active PR or verification role.
Commands run:
git fetch --all --prune
git worktree add /Users/ganapolsky_i/workspace/git/igor/thumbgate-pr-hygiene-20260313 -b chore/pr-hygiene-20260313 origin/main
npm ci
env THUMBGATE_API_KEY=ci-secret npm test
env THUMBGATE_API_KEY=ci-secret npm run test:coverage
env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:adapters
env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:automation
env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run self-heal:check
npm audit --json
git diff --check

Observed result:
- GitHub open PRs: `0`.
- `main` CI was already green on `bbfa45576d3ea7136e544e68662253079646feeb`.
- `npm ci` completed with `0` vulnerabilities.
- `env THUMBGATE_API_KEY=ci-secret npm test` passed end-to-end.
- `env THUMBGATE_API_KEY=ci-secret npm run test:coverage` passed with `971` passed, `0` failed, `1` skipped and all-files coverage at `82.59%` lines, `68.77%` branches, `85.37%` functions.
- `env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:adapters`: `38` passed, `0` failed.
- `env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:automation`: `37` passed, `0` failed.
- `env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run self-heal:check`: `Overall: HEALTHY` with `4/4` checks healthy.
- `npm audit --json` reported `0` open vulnerabilities.
- `git diff --check` passed with no whitespace or patch-format defects.
Cleanup evidence:
- Tracked branch count: `22 -> 18`.
- Worktree count: `18 -> 7`.
- Archived before deletion:
  - `archive/20260313/chore-stripe-incident-response`
  - `archive/20260313/docs-update-product-tiers`
  - `archive/20260313/feat-deep-document-infrastructure`
  - `archive/20260313/feat-fix-verification-failures`
  - `archive/20260313/feat-free-tier-limits`
  - `archive/20260313/feat-step-feedback-export`
  - `archive/20260313/pr-190-readonly`
  - `archive/20260313/worktree-agent-a6591335`
  - `archive/20260313/worktree-agent-a7dc457b`
- Removed clean redundant worktrees/branches:
  - `chore/pr-cleanup-20260312`
  - `feat/context-hub-preflight`
  - `feat/local-provider-abstraction`
  - `worktree-agent-ade17c3c`
  - detached verification worktree `/Users/ganapolsky_i/workspace/git/igor/thumbgate-techdebt-audit`
  - stale `main` worktree `/Users/ganapolsky_i/workspace/git/igor/thumbgate-partner-aware-orchestration`
- Repository hygiene change size: `42` tracked runtime artifacts removed from source control, `1286` tracked lines deleted.
Requirements verified:
- Disposable worktree lanes are no longer a versioned part of the product repository.
- ThumbGate runtime state now matches the documented local-only operating model instead of creating tracked churn in every session.
- Unique orphan branches were preserved before deletion, while clean redundant lanes were removed outright.
- The verification suite still passes after moving runtime state out of version control.
Scope:
- Fixed the free-tier gate loading regression in `scripts/gates-engine.js` so core default gates always load and free-tier capping applies only to auto-promoted add-on gates.
- Removed dead duplicate `/healthz` routing in `src/api/server.js`.
- Removed the legacy in-memory recall limiter in `adapters/mcp/server-stdio.js`, switched recall usage to the shared rate-limiter, and kept the free-tier upgrade nudge without dropping recall results.
- Hardened `tests/recall-limit.test.js` so CI-provided secrets like `THUMBGATE_API_KEY` cannot bypass the free-tier assertions.
- Added exact feedback-memory deduplication in `scripts/contextfs.js` so repeated identical lessons no longer create duplicate ContextFS entries.
- Hardened CI to install and verify the `workers/` package, aligned Stripe worker code with the current SDK API version, and removed the repo-local `wrangler` dependency because the current npm advisories did not leave a clean vendored release line.
- Deleted six duplicate ThumbGate memory entries that were already storing the same lessons.
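
The exact-deduplication behavior in this scope can be pictured with a small in-memory store that rejects a lesson only when the same string has already been recorded. This is a sketch under assumptions: `createLessonStore` and its entry shape are hypothetical and are not the interface of `scripts/contextfs.js`.

```javascript
// Hypothetical store: name and entry shape are illustrative assumptions.
function createLessonStore() {
  const seen = new Set();
  const entries = [];
  return {
    // Returns true when the lesson was stored, false when it was an exact duplicate.
    add(lesson) {
      if (seen.has(lesson)) return false; // exact string match only
      seen.add(lesson);
      entries.push({ id: entries.length + 1, lesson });
      return true;
    },
    list() {
      return entries.slice();
    },
  };
}

const store = createLessonStore();
store.add('Mock the database in unit tests');
store.add('Mock the database in unit tests'); // ignored as an exact duplicate
```

Because the match is exact, a lesson that differs by even one character is still stored as a new entry; near-duplicate detection would need a separate normalization step.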
Baseline snapshot before changes:
Commands run in dedicated baseline worktree at 57a7498e42578270a2dc1421c1bfd8d06f07dded:
git worktree add /Users/ganapolsky_i/workspace/git/igor/thumbgate-audit-baseline 57a7498e42578270a2dc1421c1bfd8d06f07dded
npm ci
npm --prefix workers ci
node --test tests/contextfs.test.js tests/intent-router.test.js tests/verification-loop.test.js tests/mcp-server.test.js
npm --prefix workers audit --json
npm run test:coverage

Observed baseline result:
- Core RAG/orchestration snapshot passed: `57` tests passed, `0` failed across `tests/contextfs.test.js`, `tests/intent-router.test.js`, `tests/verification-loop.test.js`, and `tests/mcp-server.test.js`.
- `npm --prefix workers audit --json` reported `4` moderate vulnerabilities in the worker dependency chain (`esbuild`, `wrangler`, `miniflare`, `undici`).
- `npm run test:coverage` exited non-zero on the pre-audit tree with `957` passed, `4` failed, `1` skipped.
- Baseline coverage summary still emitted: `82.07%` lines, `68.96%` branches, `85.52%` functions.
- The failing baseline regressions were:
  - `tests/gates-engine.test.js`: protected-branch and `.env` gate expectations failed.
  - `tests/recall-limit.test.js`: sixth recall call never emitted the upgrade nudge.
Commands run on the audit branch:
npm ci
npm --prefix workers ci
npm run test:gates
node --test tests/contextfs.test.js
THUMBGATE_API_KEY=ci-secret node --test tests/recall-limit.test.js
THUMBGATE_API_KEY=ci-secret npm run test:api
node --test tests/mcp-server.test.js tests/api-server.test.js
THUMBGATE_API_KEY=ci-secret npm test
THUMBGATE_API_KEY=ci-secret npm run test:coverage
npm run test:workers
env THUMBGATE_PROOF_DIR="$(mktemp -d)" THUMBGATE_API_KEY=ci-secret npm run prove:adapters
env THUMBGATE_PROOF_DIR="$(mktemp -d)" THUMBGATE_API_KEY=ci-secret npm run prove:automation
env THUMBGATE_PROOF_DIR="$(mktemp -d)" THUMBGATE_API_KEY=ci-secret npm run prove:workflow-contract
env THUMBGATE_PROOF_DIR="$(mktemp -d)" THUMBGATE_API_KEY=ci-secret npm run prove:autoresearch
THUMBGATE_API_KEY=ci-secret npm run self-heal:check
npm --prefix workers audit --json
wrangler deploy --dry-run

Observed result:
- `npm test` passed end-to-end after the audit changes.
- `npm run test:coverage` passed with `968` passed, `0` failed, `1` skipped.
- Current coverage summary on the final audit head: `82.42%` lines, `68.76%` branches, `85.10%` functions.
- `npm run test:gates`, `node --test tests/contextfs.test.js`, `THUMBGATE_API_KEY=ci-secret node --test tests/recall-limit.test.js`, and `node --test tests/mcp-server.test.js tests/api-server.test.js` all passed.
- `THUMBGATE_API_KEY=ci-secret npm run test:api` passed, proving the recall-limit regression is fixed under the same hosted-key environment GitHub Actions uses.
- `npm run test:workers` passed after the worker package gained a dedicated type-check test script.
- `env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:adapters`: `38` passed, `0` failed.
- `env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:automation`: `37` passed, `0` failed.
- `env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:workflow-contract`: `6` passed, `0` failed.
- `env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:autoresearch`: `Phase 9 proof: 5 passed, 0 failed`.
- `THUMBGATE_API_KEY=ci-secret npm run self-heal:check`: `Overall: HEALTHY` with `4/4` checks healthy.
- `npm --prefix workers ci`, `npm run test:workers`, and `npm --prefix workers audit --json` all passed with `0` vulnerabilities after removing the direct `wrangler` dependency from the repo-local worker package.
- `wrangler deploy --dry-run` passed from `workers/` via the globally installed Wrangler CLI (`4.63.0`).
Requirements verified:
- Free-tier users keep the default safety gates (`force-push`, `protected-branch-push`, `.env` edits) while still capping auto-promoted add-on gates.
- Recall requests now share the real rate-limiter state and still return useful content after the free tier is exhausted.
- Recall-limit verification no longer depends on CI secrets or shared test-state, so the free-tier upgrade nudge is exercised deterministically in GitHub Actions.
- Exact duplicate feedback-memory lessons no longer create duplicate ContextFS records, and the repository's duplicate tracked memory entries were removed.
- The worker package is now covered by CI install and test steps instead of being outside the main pipeline.
- The worker package no longer vendors a vulnerable Wrangler release in-repo; deploys and `wrangler types` continue to use the globally installed CLI already required by `workers/README.md`.
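
The first requirement above (core gates always load, only auto-promoted add-on gates are capped on the free tier) can be sketched as follows. The function name, the gate identifiers used in the example, and the cap value are hypothetical; the real logic lives in `scripts/gates-engine.js` and may differ.

```javascript
// Hypothetical sketch of tier-aware gate loading; names and cap are assumptions.
function loadGates(coreGates, promotedGates, tier, freeTierCap) {
  // Core safety gates are never subject to the cap.
  const addOns = tier === 'free' ? promotedGates.slice(0, freeTierCap) : promotedGates;
  return [...coreGates, ...addOns];
}

const core = ['force-push', 'protected-branch-push', 'env-edit'];
const promoted = ['promoted-a', 'promoted-b', 'promoted-c'];
const freeGates = loadGates(core, promoted, 'free', 1);
const paidGates = loadGates(core, promoted, 'pro', 1);
```

Applying the cap to the add-on list before merging is what keeps a low cap from ever dropping a core gate.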
Scope:
- Added `config/partner-routing.json` and `scripts/partner-orchestration.js` to define reusable partner profiles, aliases, token-budget rules, and reward coefficients.
- Threaded optional `partnerProfile` through the HTTP API, MCP adapter, and OpenAPI surfaces so intent planning can return a partner-specific strategy summary.
- Updated the intent router and verification loop to adapt action ranking, token budgets, retry behavior, and Thompson updates for `partner_<profile>` reliability learning.
- Extended the automation proof harness and regression suite to verify partner-aware planning and emitted strategy metadata.
Commands run:
npm ci
node --test tests/intent-router.test.js tests/verification-loop.test.js tests/thompson-sampling.test.js tests/async-job-runner.test.js
node --test tests/api-server.test.js tests/mcp-server.test.js tests/prove-automation.test.js
npm test
npm run test:coverage
env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:adapters
env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:automation
env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run self-heal:check

Observed result:
- Both targeted regression commands passed with `0` failures across partner orchestration, API, MCP, and automation-proof coverage.
- `npm test` passed end-to-end after adding partner-aware orchestration.
- `npm run test:coverage` passed with all-files coverage at `82.52%` lines, `68.69%` branches, and `85.19%` functions.
- `env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:adapters`: `38` passed, `0` failed.
- `env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:automation`: `37` passed, `0` failed.
- `env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run self-heal:check`: `Overall: HEALTHY` with `4/4` checks healthy.
Evidence artifacts:
- Targeted `node --test` output covering `tests/intent-router.test.js`, `tests/verification-loop.test.js`, `tests/thompson-sampling.test.js`, `tests/async-job-runner.test.js`, `tests/api-server.test.js`, `tests/mcp-server.test.js`, and `tests/prove-automation.test.js`.
- Ephemeral adapter and automation proof reports emitted under temporary `THUMBGATE_PROOF_DIR` directories so verification did not leave tracked proof churn in the repository.
Requirements verified:
- `partnerProfile` is accepted by the public API and MCP `plan_intent` and `list_intents` surfaces and reaches the runtime planner.
- Intent plans now emit partner strategy metadata and adapt token budgets plus action ranking for strict, fast, silent-blocker, tool-limited, and balanced counterparts.
- Verification updates now learn partner-specific reliability in Thompson sampling under `partner_<profile>` categories without weakening the existing hard gate model.
- The automation proof harness now checks for `intent.partner_strategy`, so the new orchestration behavior is covered by proof, not only by unit tests.
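
To make the `partner_<profile>` reliability learning concrete, here is a minimal sketch of the Beta-Bernoulli posterior update that Thompson sampling typically relies on, keyed by a `partner_<profile>` category. It shows only the update and the posterior mean, not the sampling step. The function names, the uniform `Beta(1, 1)` prior, and the data shape are assumptions, not the project's actual implementation.

```javascript
// Hypothetical sketch: one Beta(alpha, beta) posterior per partner category.
function updateReliability(posteriors, profile, verified) {
  const key = `partner_${profile}`;
  const current = posteriors[key] || { alpha: 1, beta: 1 }; // assumed uniform prior
  const next = verified
    ? { alpha: current.alpha + 1, beta: current.beta } // verification passed
    : { alpha: current.alpha, beta: current.beta + 1 }; // verification failed
  return { ...posteriors, [key]: next };
}

// Expected reliability under the current posterior.
function posteriorMean({ alpha, beta }) {
  return alpha / (alpha + beta);
}

let posteriors = {};
posteriors = updateReliability(posteriors, 'strict', true);
posteriors = updateReliability(posteriors, 'strict', true);
posteriors = updateReliability(posteriors, 'strict', false);
const strictMean = posteriorMean(posteriors.partner_strict); // 3 / (3 + 2)
```

Keeping these counts in a separate category per partner leaves any hard gate decisions untouched, since the posterior only informs ranking.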
Scope:
- Replaced stale low-dollar self-serve subscription language on live-facing surfaces with the actual public offer: Pro (`$19/mo`).
- Removed unsupported scarcity and adoption framing from CLI and landing-page copy.
- Added `docs/COMMERCIAL_TRUTH.md` as the source of truth for pricing, traction, and proof claims.
Commands run:
node --test tests/version-metadata.test.js tests/api-server.test.js tests/cli.test.js

Requirements verified:
- Live-facing copy no longer presents a public recurring subscription as the current self-serve offer.
- Live-facing copy no longer treats repo metrics or hardcoded scarcity as customer proof.
- Pricing and traction claims now point back to a single source of truth.
Scope:
- Added a shared operational billing summary in `scripts/billing.js` that merges the funnel ledger with the local key store.
- Added admin-only `GET /v1/billing/summary` plus the repo-local `node bin/cli.js cfo` command so API, CLI, watcher, and strategist surfaces share the same summary shape.
- Replaced fake paid-line revenue guessing in operator scripts with the new billing summary proxy.
Commands run:
node --test tests/billing.test.js tests/api-server.test.js tests/cli.test.js tests/openapi-parity.test.js
npm test
npm run test:coverage
env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:adapters
env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:automation
env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run self-heal:check

Observed result:
- Targeted regression coverage passed: `63` tests passed, `0` failed across billing, API server, CLI, and OpenAPI parity.
- `npm test` passed end-to-end after adding the CFO control plane.
- `npm run test:coverage` passed with all-files coverage at `82.18%` lines, `68.13%` branches, and `84.90%` functions.
- `env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:adapters`: `38` passed, `0` failed.
- `env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:automation`: `35` passed, `0` failed.
- `env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run self-heal:check`: `Overall: HEALTHY` with `4/4` checks healthy.
Evidence artifacts:
- Command output from the targeted regression run is the primary proof for the new CFO control plane.
- Ephemeral `THUMBGATE_PROOF_DIR` directories were used for adapter and automation proof runs to avoid tracked proof churn.
Requirements verified:
- Billing funnel telemetry, active keys, disabled keys, customer usage, and source attribution now resolve from one shared summary shape instead of ad hoc paid-line counting.
- `GET /v1/billing/summary` is admin-only and rejects provisioned billing keys.
- `node bin/cli.js cfo` returns the same machine-readable summary shape as the API surface, while reading the local ledger and key store in the current checkout.
- This surface is an operational billing proxy with ledger-backed `bookedRevenueCents` for providers that emit known amounts; it still does not claim invoice truth.
Status:
- Historical pricing experiment notes only.
- Superseded by `docs/COMMERCIAL_TRUTH.md` for current public pricing and proof language.
Scope:
- Version sync across `package.json`, `mcpize.yaml`, and `server.json` to `v0.7.1`.
- Historical pricing experiment: tested a low-dollar founder offer and urgency hooks before the current commercial-truth correction.
- Discovery optimization: Added high-ROI GitHub topics and updated `SKILL.md` auto-indexing keywords.
- Launch content package: Created `docs/marketing/LAUNCH_CONTENT.md` with Reddit, HN, and Discord assets.
- CLI `pro` command was, at that time, updated to reflect the same historical pricing experiment.
Commands run:
npm test
npm run test:proof
npm run test:coverage
npm run prove:adapters
npm run prove:automation
node bin/cli.js help
node bin/cli.js stats
gh repo view --json repositoryTopics

Observed results:
- `npm test`: 100% pass across all 329 tests.
- `npm run test:proof`: all proof gates PASS.
- `npm run prove:adapters`: `{ "passed": 24, "failed": 0 }`.
- `node bin/cli.js stats`: Successfully triggered Revenue-at-Risk analyzer showing operational loss metrics.
- `gh repo view`: Verified topics including `agentic-feedback-studio`, `veto-layer`, and `zero-config`.
Evidence artifacts:
- `public/index.html` points checkout and fallback flow at the canonical Railway hosted app.
- `docs/marketing/LAUNCH_CONTENT.md` exists and contains high-intent hooks.
- `SKILL.md` updated with `agent-memory` and `claude-code` keywords.
Requirements verified:
- Pricing and fallback routing align with the current hosted billing funnel.
- Repository is optimized for auto-discovery by AI search and MCP directories.
- Technical integrity is maintained with a 100% test pass rate.
Commands:
node --test tests/deployment.test.js
npm test
npm run test:coverage
env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:adapters
env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:automation
env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run self-heal:check

Observed result:
- Targeted deployment verification passed: `9` tests passed, `0` failed in `tests/deployment.test.js`.
- `npm test` passed end-to-end on the narrowed hotfix diff with only the Railway deploy regression coverage added.
- `npm run test:coverage` passed with overall coverage at `82.97%` lines, `69.36%` branches, and `86.81%` functions.
- `env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:adapters`: `24 passed`, `0 failed`.
- `env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:automation`: `14 passed`, `0 failed`.
- `env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run self-heal:check`: `HEALTHY` with `4/4` checks healthy.
Evidence artifacts:
- Focused deployment regression output from `node --test tests/deployment.test.js`.
- Ephemeral machine-readable proof reports emitted under temporary `THUMBGATE_PROOF_DIR` directories during the adapter and automation proof runs.
Requirements verified:
- The CI deploy workflow now refuses to enter the Railway deploy path unless explicit repo configuration is present for token, project, environment, and health-check inputs.
- The workflow no longer depends on the previously hard-coded Cloud Run health URL when validating a Railway deploy.
- The hotfix is scoped to deploy-gate behavior plus regression coverage; no unrelated runtime or proof harness changes were required to keep the branch green.
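
The deploy-gate requirement above (refuse the Railway path unless token, project, environment, and health-check inputs are all configured) amounts to a presence check before deployment. The sketch below illustrates that idea only; the input names and function names are hypothetical, since this document does not list the workflow's actual variable names.

```javascript
// Hypothetical input names, chosen for illustration only.
const REQUIRED_DEPLOY_INPUTS = [
  'RAILWAY_TOKEN',
  'RAILWAY_PROJECT_ID',
  'RAILWAY_ENVIRONMENT',
  'HEALTHCHECK_URL',
];

// Lists every required input that is absent or blank.
function missingDeployInputs(env, required = REQUIRED_DEPLOY_INPUTS) {
  return required.filter(
    (name) => typeof env[name] !== 'string' || env[name].trim() === ''
  );
}

// Deploy only when nothing is missing; otherwise report what blocked it.
function shouldDeploy(env) {
  const missing = missingDeployInputs(env);
  return { deploy: missing.length === 0, missing };
}
```

Reporting the missing names, instead of a bare boolean, makes a skipped deploy easier to diagnose from CI logs.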
Commands:
node --test --experimental-test-coverage --test-concurrency=1 tests/cli.test.js
node --test --test-concurrency=1 tests/prove-adapters.test.js
npm test
npm run test:coverage
npm run prove:adapters
env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:automation
npm run self-heal:check

Observed result:
- Targeted CLI coverage verification passed: `22` tests passed, `0` failed in `tests/cli.test.js`.
- Targeted adapter proof verification passed: `38` tests passed, `0` failed in `tests/prove-adapters.test.js`.
- `npm test` passed end-to-end after hardening the subprocess handshake budget used by the CLI and adapter proof harnesses.
- `npm run test:coverage` passed with `720` tests passed, `0` failed, and `1` skipped.
- Coverage summary: `83.17%` lines, `69.34%` branches, `86.86%` functions.
- `npm run prove:adapters`: `24 passed`, `0 failed`.
- `env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:automation`: `14 passed`, `0 failed`.
- `npm run self-heal:check`: `HEALTHY` with `4/4` checks healthy.
Evidence artifacts:
- `proof/compatibility/report.json`
- `proof/compatibility/report.md`
- `proof/automation/report.json`
- `proof/automation/report.md`
Requirements verified:
- The CLI `serve` handshake test no longer flakes under full-suite coverage because the helper tolerates realistic subprocess startup latency and surfaces child process spawn errors explicitly.
- The adapter proof harness no longer times out its MCP stdio checks under heavy test load because its subprocess handshake budget matches observed startup behavior.
- Fatal adapter-proof errors now identify the exact MCP or adapter stage that failed instead of attributing late-stage transport failures to the preceding API step.
Commands:
npm ci
node --test tests/adapters.test.js tests/install-mcp.test.js tests/cli.test.js
node --test tests/prove-adapters.test.js tests/prove-lancedb.test.js
npm test
npm run prove:adapters
npm run prove:automation
node scripts/prove-lancedb.js
npm run self-heal:check
npm run test:coverage

Observed result:
- `npm ci` completed successfully with `0 vulnerabilities`.
- Targeted launcher verification passed: `39` tests passed, `0` failed across `tests/adapters.test.js`, `tests/install-mcp.test.js`, and `tests/cli.test.js`.
- Targeted proof cleanup verification passed: `39` tests passed, `0` failed across `tests/prove-adapters.test.js` and `tests/prove-lancedb.test.js`.
- `npm test` passed end-to-end after hardening MCP launcher generation and retry-based cleanup in the proof scripts.
- `npm run prove:adapters`: `24 passed`, `0 failed`.
- `npm run prove:automation`: `14 passed`, `0 failed`.
- `node scripts/prove-lancedb.js`: `5 passed`, `0 failed`, `0 warned`.
- `npm run self-heal:check`: `HEALTHY` with `4/4` checks healthy.
- `npm run test:coverage` passed with overall coverage at `83.16%` lines, `69.30%` branches, and `86.86%` functions (`719` passed, `0` failed, `1` skipped).
Evidence artifacts:
- `proof/compatibility/report.json`
- `proof/compatibility/report.md`
- `proof/automation/report.json`
- `proof/automation/report.md`
- `proof/lancedb-report.json`
- `proof/lancedb-report.md`
Requirements verified:
- Source checkouts now install canonical MCP entries that launch the local stdio server directly via `node adapters/mcp/server-stdio.js`.
- Portable docs and adapter examples now use the version-pinned launcher `npx -y [email protected] serve` instead of an unpinned `npx` call that can be shadowed by stale local installs.
- Re-running the MCP installer upgrades stale config entries instead of treating them as already configured.
- Adapter and LanceDB proof cleanup now uses retry-capable recursive removal so ephemeral filesystem contention no longer flakes CI.
- Transient `.thumbgate` reminder/A2UI/test-run files are now ignored as local runtime state and do not pollute git hygiene during verification.
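
For the retry-capable recursive removal mentioned above, Node's built-in `fs.rmSync` accepts `maxRetries` and `retryDelay` options that retry on transient errors such as `EBUSY` or `ENOTEMPTY`. The sketch below shows one way to use them; the helper name and the specific retry values are assumptions and may not match what the proof scripts do.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: retry counts and delay are illustrative values.
function removeProofDir(dir) {
  fs.rmSync(dir, {
    recursive: true, // remove nested files and directories
    force: true, // do not throw if the path is already gone
    maxRetries: 5, // retry on transient contention errors
    retryDelay: 100, // milliseconds of linear backoff between retries
  });
}

// Usage: create a throwaway proof directory, then clean it up.
const proofDir = fs.mkdtempSync(path.join(os.tmpdir(), 'proof-'));
fs.writeFileSync(path.join(proofDir, 'report.json'), '{}');
removeProofDir(proofDir);
```

With `force: true`, calling the helper a second time on the same path is a no-op, which keeps cleanup safe to repeat.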
Commands:
npm ci
node --test tests/api-server.test.js tests/version-metadata.test.js
npm test
npm run test:coverage
npm run prove:adapters
npm run prove:automation
npm run self-heal:check

Observed result:
- `npm ci` completed successfully with `0 vulnerabilities`.
- Targeted landing-page verification passed: `25` tests passed, `0` failed across `tests/api-server.test.js` and `tests/version-metadata.test.js`.
- `npm test` passed end-to-end after the public messaging and GTM doc changes.
- `npm run test:coverage` passed with a serialized Node test runner (`--test-concurrency=1`) so suites that rewrite `process.env` do not race each other during coverage.
- The ADK consolidation path stayed hermetic under test:
  - first-run anchor-only consolidation no longer exits early
  - `ADK_FAKE_CONSOLIDATION=true` is honored only under `NODE_ENV=test`
  - the anchor-memory test opts into deterministic consolidation instead of a live Gemini path
- Coverage summary: `83.20%` lines, `69.28%` branches, `86.78%` functions.
- `npm run prove:adapters`: `24 passed`, `0 failed`.
- `npm run prove:automation`: `14 passed`, `0 failed`.
- `npm run self-heal:check`: `HEALTHY` with `4/4` checks healthy.
Evidence artifacts:
- Targeted landing/API verification was exercised directly by the commands above.
- `proof/compatibility/report.json`
- `proof/compatibility/report.md`
- `proof/automation/report.json`
- `proof/automation/report.md`

The command output above is the primary evidence for this run. The tracked proof artifacts listed here were refreshed locally by the proof commands and serve as machine-readable corroboration.
Requirements verified:
- Public-facing GTM surfaces now lead with one workflow outcome instead of generic agent infrastructure.
- The landing page preserves `SoftwareApplication` and `FAQPage` JSON-LD while adding buyer-facing FAQ and comparison content.
- The GTM plan link referenced by the landing page now resolves to `docs/GO_TO_MARKET_REVENUE_WEDGE_2026-03.md`.
- The ADK consolidator and spike/anchor coverage path is deterministic again and no longer blocks the proof gate.
Commands:
npm ci
npm test
npm run test:coverage
npm run prove:workflow-contract
npm run prove:adapters
npm run prove:automation
npm run self-heal:check

Observed result:
- Clean install completed with `0 vulnerabilities`.
- `npm test` passed end-to-end, including the new `test:workflow` contract gate.
- `npm run test:coverage` passed after hardening `tests/adk-consolidator.test.js` to use explicit deterministic consolidation in test mode instead of relying on a live Gemini key.
- Coverage summary: `83.39%` lines, `67.58%` branches, `86.63%` functions.
- `npm run prove:workflow-contract`: `4 passed`, `0 failed`.
- `npm run prove:adapters`: `21 passed`, `0 failed`.
- `npm run prove:automation`: `14 passed`, `0 failed`.
- `npm run self-heal:check`: `HEALTHY` with `4/4` checks healthy.
Evidence artifacts:
- `proof/workflow-contract/report.json`
- `proof/workflow-contract/report.md`
- `proof/compatibility/report.json`
- `proof/compatibility/report.md`
- `proof/automation/report.json`
- `proof/automation/report.md`
Requirements verified:
- Repo-owned `WORKFLOW.md` contract exists and encodes scope, hard stops, proof commands, and done criteria.
- Agent intake is bounded by `.github/ISSUE_TEMPLATE/ready-for-agent.yml`.
- PR handoff now requires proof-first structure via `.github/pull_request_template.md`.
- CI runs machine validation for the workflow contract and uploads workflow-proof artifacts.

- Proof report: `proof/attribution-report.md`
- Machine evidence: `proof/attribution-report.json`
- Requirements: ATTR-01 (`recordAction` + `attributeFeedback`), ATTR-02 (pre-tool guard), ATTR-03 (test coverage)
Command:
node scripts/prove-rlaif.js

Observed result:
- Summary: `4 passed`, `0 failed`
- Evidence artifacts:
  - `proof/rlaif-report.json`
  - `proof/rlaif-report.md`
- Requirements verified:
  - DPO-01: `selfAudit()` returns score float in [0,1] with 6 constraints; `selfAuditAndLog()` writes `self-score-log.jsonl`
  - DPO-02: `dpoOptimizer.run()` writes `dpo-model.json` with `generated` + `pairs_processed` fields
  - DPO-03: `extractMetaPolicyRules()` extracts rules from seeded negative entries; `meta-policy-rules.json` written
  - DPO-04: `node --test` all 3 RLAIF test files: 24 passing tests, 0 failures; delta from Phase 4 baseline (93): +24 RLAIF tests = 117 total
Command:
npm test

Result summary:
- `test:schema`: 7 passed, 0 failed
- `test:loop`: 10 passed, 0 failed
- `test:dpo`: 6 passed, 0 failed
- `test:api`: 52 passed, 0 failed
- `test:proof`: 2 passed, 0 failed
Command:
npm run prove:adapters

Observed result:
- Summary: `21 passed`, `0 failed`
- Evidence artifacts:
  - `proof/compatibility/report.json`
  - `proof/compatibility/report.md`
- Verified checks include:
  - API auth and feedback/context/intents routes
  - Rubric-based gating for positive feedback (`422` when guardrails/disagreement fail)
  - Rubric-aware context evaluation payloads
  - API auth config hardening (`THUMBGATE_API_KEY` required unless insecure mode enabled)
  - Context namespace traversal rejection on API + MCP surfaces
  - Intent router checkpoint flow (`checkpoint_required` for unapproved high-risk intents)
  - MCP initialize/list/call flow (including `plan_intent` and rubric-gated `capture_feedback`)
  - MCP locked-profile write denial
  - OpenAPI parity for ChatGPT adapter
  - Gemini declaration validity
  - Subagent profile and MCP policy consistency
Command:
npm run prove:automation

Observed result:
- Summary: `14 passed`, `0 failed`
- Evidence artifacts:
  - `proof/automation/report.json`
  - `proof/automation/report.md`
- Verified checks include:
  - rubric-pass positive promotion
  - rubric-gated positive rejection for guardrail/disagreement violations
  - rubric failure dimensions in prevention rules
  - rubric metadata in DPO output
  - API + MCP rubric gate behavior
  - intent checkpoint enforcement
  - rubric-aware context evaluation
  - semantic-cache hit behavior for similar context queries
  - self-healing helper execution health checks
Commands:
npm run self-heal:check
node scripts/self-healing-check.js --json > proof/automation/self-healing-health.json
node scripts/self-heal.js --reason=manual > proof/automation/self-heal-run.json

Observed result:
- Health status: `healthy` (4/4 checks healthy: budget, tests, adapter proof, automation proof)
- Self-heal execution: `healthy: true`, no failing fix steps
- Evidence artifacts:
  - `proof/automation/self-healing-health.json`
  - `proof/automation/self-heal-run.json`
Command sequence:
- Start API with `THUMBGATE_API_KEY=test-key` on port `8791`
- `GET /healthz` with bearer token
- `GET /v1/feedback/stats` without token (expect 401)
- `POST /v1/feedback/capture` with valid payload
- `GET /v1/feedback/summary`
Observed results:
- Health endpoint responded with status `ok`
- Unauthorized stats call returned `401`
- Capture endpoint returned `accepted: true` and produced memory record
- Summary endpoint returned markdown summary payload

- Unauthorized API request returns `401` (default auth required).
- API initialization fails fast if `THUMBGATE_API_KEY` is missing and insecure mode is not explicitly enabled.
- API rejects external output paths outside feedback root.
- MCP `prevention_rules` blocks external `outputPath`.
- MCP `export_dpo_pairs` blocks external `memoryLogPath`.
- MCP allowlists enforce profile-scoped tool access (`default`, `readonly`, `locked`).
- Rubric anti-hacking gate blocks unsafe positive memory promotion when guardrails fail or judges disagree.
GitHub API checks:
- `allow_auto_merge: true`
- `delete_branch_on_merge: true`
- `main` branch protection retains:
  - required approvals: `1`
  - required check contexts: `["test"]`
  - required linear history: `true`
  - required conversation resolution: `true`
Workflow syntax validation command:
for f in .github/workflows/*.yml; do ruby -e 'require "yaml"; YAML.load_file(ARGV[0]); puts "OK #{ARGV[0]}"' "$f"; done

Observed result:
- All workflow files parsed successfully (`OK` for each).
Command:
npm run budget:status

Observed result:
- Month: `2026-03`
- Tracked spend: `0`
- Budget: `10`
- Remaining: `10`
Command:
npm run diagrams:paperbanana

Observed blocker:
- PaperBanana call reached Gemini endpoint and failed with `400 INVALID_ARGUMENT` (`API_KEY_INVALID`).
- This proves the integration path is wired, but the provided key is not currently valid for generation.
Current status:
- Diagram pipeline is implemented and budget-guarded.
- Final diagram artifacts require a valid Gemini/Google API key.
- Failed generation attempts do not increase budget ledger spend.
Scope:
- Added MCP stdio transport compatibility for both `Content-Length` framed JSON-RPC and newline-delimited JSON requests.
- Fixed CLI `serve` bootstrap to explicitly start the stdio listener when loaded via `require()`.
- Removed duplicate/dead `serve` switch branch collision with `start-api`.
- Hardened proof/test reliability for external repo discovery and proof test determinism.
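
Supporting both framings means the reader has to decide, per message, whether the input starts with a `Content-Length` header or is a single JSON line. The sketch below shows one way to do that over a `Buffer`; it is an illustration under assumptions, with a hypothetical `readMessage` helper, and is not the code in `adapters/mcp/server-stdio.js`.

```javascript
// Hypothetical reader: returns { message, rest } or null when more bytes are needed.
function readMessage(buf) {
  const prefix = buf.subarray(0, 15).toString('ascii').toLowerCase();
  if (prefix === 'content-length:') {
    const headerEnd = buf.indexOf('\r\n\r\n');
    if (headerEnd === -1) return null; // header block not complete yet
    const header = buf.subarray(0, headerEnd).toString('ascii');
    const match = /content-length:\s*(\d+)/i.exec(header);
    if (!match) throw new Error('invalid Content-Length header');
    const length = Number(match[1]); // length is counted in bytes
    const bodyStart = headerEnd + 4;
    if (buf.length < bodyStart + length) return null; // body not complete yet
    const body = buf.subarray(bodyStart, bodyStart + length).toString('utf8');
    return { message: JSON.parse(body), rest: buf.subarray(bodyStart + length) };
  }
  // Otherwise treat the input as newline-delimited JSON.
  const newline = buf.indexOf('\n');
  if (newline === -1) return null; // line not complete yet
  const line = buf.subarray(0, newline).toString('utf8').trim();
  return { message: JSON.parse(line), rest: buf.subarray(newline + 1) };
}

const request = JSON.stringify({ jsonrpc: '2.0', id: 1, method: 'initialize' });
const framed = Buffer.from(`Content-Length: ${Buffer.byteLength(request)}\r\n\r\n${request}`);
const delimited = Buffer.from(`${request}\n`);
const fromFramed = readMessage(framed);
const fromDelimited = readMessage(delimited);
```

Working on bytes, not on decoded strings, matters because `Content-Length` counts bytes, so multi-byte UTF-8 characters would otherwise misalign the frame boundary.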
Commands run:
node --test tests/cli.test.js tests/prove-adapters.test.js
npm run test:proof
npm test
npm run prove:adapters
npm run prove:automation

Observed results:
- `tests/cli.test.js`: pass (includes framed + newline `initialize` handshake coverage)
- `tests/prove-adapters.test.js`: pass with adapter proof checks increased to `>=21`
- `npm run test:proof`: pass (`75` pass, `0` fail)
- `npm test`: pass (all scripted test phases complete)
- `npm run prove:adapters`: `{ "passed": 21, "failed": 0 }`
- `npm run prove:automation`: `{ "passed": 14, "failed": 0 }`
Artifacts updated:
- `proof/compatibility/report.json`
- `proof/compatibility/report.md`
- `proof/automation/report.json`
- `proof/automation/report.md`

Implemented a staged internal analytics pipeline so ThumbGate can materialize raw -> staging -> semantic -> lineage instead of recomputing business metrics as an ad hoc live join.
Scope:
- Added `scripts/agentic-data-pipeline.js` for snapshot materialization, lineage, reconciliation, and managed job generation.
- Updated `scripts/semantic-layer.js` to read from the shared staged pipeline contract.
- Extended `scripts/schedule-manager.js`, `scripts/verify-run.js`, and `scripts/self-healing-check.js` so the lane is automated and proof-backed.
Commands run:
npm run test:data-pipeline
npm run test:semantic-layer
npm run prove:data-pipeline
node --test tests/data-pipeline.test.js tests/semantic-layer.test.js tests/schedule-manager.test.js tests/prove-data-pipeline.test.js tests/verify-run.test.js tests/self-healing-check.test.js
Observed results:
- `npm run test:data-pipeline`: `5/5` passing
- `npm run test:semantic-layer`: `5/5` passing
- `npm run prove:data-pipeline`: `6 passed, 0 failed`
- Focused regression bundle: `42/42` passing
Behavioral proof points:
- Identical reruns now hold a stable snapshot id and downgrade lineage to `noop`.
- Reconciliation warns on unreconciled paid events and attribution drift instead of silently flattening them.
- Semantic metrics now expose `attributionCoverageRate`, `unreconciledPaidEvents`, and `pipelineWarnings`.
- Managed schedule specs can materialize the pipeline through the interruptible async-job runner without bespoke shell glue.
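One way a rerun can hold a stable snapshot id is to derive the id from the content itself. The sketch below assumes a SHA-256 hash over canonicalized JSON; `snapshotId`, `materialize`, and the `materialized` lineage label are hypothetical names, and only the `noop` value comes from the proof points above.

```javascript
const crypto = require('node:crypto');

// Sort object keys recursively so key order does not change the hash.
function canonicalize(value) {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value && typeof value === 'object') {
    return Object.keys(value).sort().reduce((acc, key) => {
      acc[key] = canonicalize(value[key]);
      return acc;
    }, {});
  }
  return value;
}

// Hypothetical: a content-derived snapshot id (assumption: truncated SHA-256).
function snapshotId(rows) {
  const json = JSON.stringify(canonicalize(rows));
  return crypto.createHash('sha256').update(json).digest('hex').slice(0, 16);
}

// Hypothetical: identical input yields the same id, so lineage becomes "noop".
function materialize(previousSnapshotId, rows) {
  const id = snapshotId(rows);
  return { snapshotId: id, lineage: id === previousSnapshotId ? 'noop' : 'materialized' };
}
```

Under this assumption, a rerun over unchanged raw data costs one hash comparison and writes nothing new downstream.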
Artifacts updated:
- `proof/data-pipeline-report.json`
- `proof/data-pipeline-report.md`
Scope:
- Harness Score added to the dashboard so operators can see correction coverage, enforcement coverage, diagnostic coverage, repeat-failure pressure, and top next harness fixes in one place.
- `search_lessons` upgraded to expose lifecycle state, linked corrective actions, linked prevention rules, linked auto-promoted checks, and next harness recommendations for each lesson.
- MCP/CLI/tool metadata updated so the new harness-improvement surface is discoverable to agents.
Commands run:
npm ci
node --test tests/dashboard.test.js tests/lesson-search.test.js tests/api-server.test.js tests/cli.test.js tests/mcp-server.test.js tests/mcp-tools-gates.test.js tests/positioning-contract.test.js
npm test
npm run test:coverage
npm run prove:adapters
npm run prove:automation
npm run self-heal:check
Observed results:
- Focused harness suite: `152 passed`, `0 failed`.
- `npm test`: pass. Final test aggregate completed cleanly with `558 pass`, `0 fail`.
- `npm run test:coverage`: pass with:
  - statements: `88.50%`
  - branches: `74.32%`
  - functions: `92.54%`
- `npm run prove:adapters`: `48/48` passed.
- `npm run prove:automation`: `55/55` passed.
- `npm run self-heal:check`: `Overall: HEALTHY` with `4/4` healthy checks.
Behavioral proof points:
- Dashboard empty-state now reports a bootstrapping Harness Score instead of omitting harness-health entirely.
- Repeated negative lessons now surface concrete next actions such as `pre_action_gate`, `prevention_rule`, `verification_harness`, and `diagnostic_capture`.
- Fully enforced lessons with linked rules and gates report zero open harness recommendations.
- API and CLI lesson-search output now includes lifecycle state and harness recommendations, not just memory text.
Files changed:
- `README.md`
- `scripts/dashboard.js`
- `scripts/lesson-search.js`
- `scripts/tool-registry.js`
- `tests/api-server.test.js`
- `tests/cli.test.js`
- `tests/dashboard.test.js`
- `tests/lesson-search.test.js`
Scope:
- Added `search_thumbgate` as a read-only MCP tool for raw ThumbGate search across feedback logs, ContextFS memory, and prevention rules.
- Added authenticated `GET /v1/search` and `POST /v1/search` API routes with OpenAPI parity.
- Restored reusable ContextFS pack templates for bug investigation, session resume, sales-call prep, and competitor scans.
- Preserved `search_lessons` as the canonical promoted-lesson search surface while salvaging the broader raw-search lane.
Commands run:
npm ci
node --test --test-concurrency=1 tests/pack-templates.test.js tests/thumbgate-search.test.js tests/openapi-parity.test.js tests/commerce-quality.test.js tests/profile-router.test.js tests/intent-router.test.js
npm test
npm run test:coverage
npm run prove:adapters
npm run prove:automation
npm run self-heal:check
Observed results:
- Targeted ThumbGate search suite: `75/75` passing.
- `npm test`: exit `0`.
- `npm run test:coverage`: exit `0` with `all files | 88.44 | 74.11 | 92.50`.
- `npm run prove:adapters`: exit `0` with `48/48` passing.
- `npm run prove:automation`: exit `0` with `55/55` passing.
- `npm run self-heal:check`: `Overall: HEALTHY` with `4/4 healthy`.
Behavioral proof points:
- `search_thumbgate` is registered as read-only in the MCP tool registry and returns merged or source-filtered ThumbGate search results.
- `/v1/search` is present in the API root JSON listing and in both canonical and ChatGPT OpenAPI specs.
- `search_lessons` call semantics remain unchanged while `search_thumbgate` adds broader retrieval over raw ThumbGate state.
- ContextFS pack templates are exported, enumerable, and validated by dedicated tests.
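The "merged or source-filtered" behavior can be pictured with a small in-memory sketch. Everything here is illustrative: the source names `feedback`, `contextfs`, and `rules`, the sample records, and the substring matching are assumptions, and only the `query` and `source: "all"` parameters mirror the tool call shown earlier in this document.

```javascript
// Hypothetical in-memory stand-ins for the three searchable sources.
const SOURCES = {
  feedback: ['Database mock returned stale rows'],
  contextfs: ['Session notes: replaced database mock with fixture'],
  rules: ['Never mock the billing ledger in integration tests'],
};

// Sketch only: case-insensitive substring match, merged across sources
// unless a single source is requested.
function searchThumbgate({ query, source = 'all' }) {
  const names = source === 'all' ? Object.keys(SOURCES) : [source];
  const needle = query.toLowerCase();
  const results = [];
  for (const name of names) {
    for (const text of SOURCES[name] || []) {
      if (text.toLowerCase().includes(needle)) results.push({ source: name, text });
    }
  }
  return results;
}
```

Tagging each hit with its source keeps merged results traceable, which is what lets a read-only tool cover raw state without changing `search_lessons` semantics.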
Scope:
- Reduced the root README from a long narrative sales page into a shorter operator-facing overview.
- Added an explicit `Tech Stack` section covering runtime, interfaces, storage, retrieval, enforcement, billing, and hosting.
- Preserved the repo contract requirements for `WORKFLOW.md`, the `ready-for-agent` intake template, `Commercial Truth`, and the free/self-hosted `search_lessons` surface.
Commands run:
wc -l README.md
node --test tests/positioning-contract.test.js tests/version-metadata.test.js tests/prove-workflow-contract.test.js
npm test
npm run test:coverage
npm run prove:adapters
npm run prove:automation
npm run self-heal:check
Observed results:
- `README.md` line count reduced from `506` to `201`.
- Targeted contract/version/docs checks passed `23/23`.
- `npm test` exited `0`.
- `npm run test:coverage` exited `0` with `all files | 88.43 | 74.12 | 92.48`.
- `npm run prove:adapters` exited `0` with `48/48` checks passing.
- `npm run prove:automation` exited `0` with `55/55` checks passing.
- `npm run self-heal:check` reported `Overall: HEALTHY` and `4/4 healthy`.
Behavioral proof points:
- The root README now leads with the shipped product behavior instead of a long narrative sales page.
- The public docs now expose the actual technology stack directly in the README instead of forcing buyers to infer it from `package.json`.
- Required operator-contract links and free/self-hosted lesson-search messaging remain covered by automated tests.
Scope:
- Added a first-class lesson search surface so any MCP-compatible free or self-hosted agent can search promoted lessons and inspect the corrective action linked to each result.
- Exposed the feature through MCP (`search_lessons`), HTTP (`GET /v1/lessons/search`), and CLI (`npx thumbgate lessons` / `search-lessons`).
- Linked each lesson result to its source feedback, matching prevention rules, and matching auto-promoted checks.
- Updated public docs so the essential profile now advertises lesson search as a free/self-hosted MCP surface.
Commands run:
npm ci
node --test tests/lesson-search.test.js tests/test-suite-parity.test.js
npm test
npm run test:coverage
npm run prove:adapters
npm run prove:automation
npm run self-heal:check
Observed results:
- `node --test tests/lesson-search.test.js tests/test-suite-parity.test.js`: pass (`4/4`).
- `npm test`: pass.
- `npm run test:coverage`: pass with Node coverage summary:
  - line coverage: `88.34%`
  - branch coverage: `74.23%`
  - function coverage: `92.40%`
- `npm run prove:adapters`: pass (`48/48`).
- `npm run prove:automation`: pass (`55/55`).
- `npm run self-heal:check`: `Overall: HEALTHY` with `4/4` healthy checks.
  - `budget_status`: healthy (`567ms`)
  - `tests`: healthy (`295323ms`)
  - `prove_adapters`: healthy (`200474ms`)
  - `prove_automation`: healthy (`119678ms`)
Behavioral proof points:
- `search_lessons` is available in the `default`, `essential`, `readonly`, `dispatch`, and `locked` MCP profiles.
- Empty queries list recent lessons; text queries rank lessons by query overlap plus recency.
- Search responses expose `correctiveActions` derived from lesson content plus linked prevention rules and auto-promoted checks.
- `GET /v1/lessons/search` and the ChatGPT adapter OpenAPI both include the new search route.
- The CLI `lessons` command prints lesson summaries together with linked corrective actions.
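The "query overlap plus recency" ranking can be sketched as follows. The scoring formula is an assumption made for illustration (token overlap ratio plus a small decaying recency bonus); the `text` and `createdAt` fields and the `0.25` weight are hypothetical, not taken from `scripts/lesson-search.js`.

```javascript
// Lowercase word tokens; punctuation is treated as a separator.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Sketch: empty queries list lessons newest-first; text queries score by
// the share of query tokens found in the lesson, plus a recency bonus.
function rankLessons(lessons, query, now = Date.now()) {
  const queryTokens = new Set(tokenize(query || ''));
  if (queryTokens.size === 0) {
    return [...lessons].sort((a, b) => b.createdAt - a.createdAt);
  }
  const DAY_MS = 24 * 60 * 60 * 1000;
  return lessons
    .map((lesson) => {
      const tokens = new Set(tokenize(lesson.text));
      let overlap = 0;
      for (const token of queryTokens) if (tokens.has(token)) overlap += 1;
      const ageDays = Math.max(0, (now - lesson.createdAt) / DAY_MS);
      const recency = 1 / (1 + ageDays); // assumed decay: 1 today, 0.5 after one day
      return { lesson, overlap, score: overlap / queryTokens.size + 0.25 * recency };
    })
    .filter((entry) => entry.overlap > 0)
    .sort((a, b) => b.score - a.score)
    .map((entry) => entry.lesson);
}
```

Keeping the recency weight below the value of a single matched token means freshness only breaks ties between lessons that match the query about equally well.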
Artifacts updated:
- `README.md`
- `adapters/chatgpt/openapi.yaml`
- `adapters/mcp/server-stdio.js`
- `bin/cli.js`
- `config/mcp-allowlists.json`
- `openapi/openapi.yaml`
- `package.json`
- `scripts/dispatch-brief.js`
- `scripts/intent-router.js`
- `scripts/lesson-search.js`
- `scripts/tool-registry.js`
- `src/api/server.js`
- `tests/api-server.test.js`
- `tests/cli.test.js`
- `tests/intent-router.test.js`
- `tests/lesson-search.test.js`
- `tests/mcp-server.test.js`
- `tests/openapi-parity.test.js`
- `tests/positioning-contract.test.js`
- `tests/profile-router.test.js`
Scope:
- Removed the duplicate Railway deploy job from `.github/workflows/ci.yml` so `main` no longer triggers two concurrent deploy lanes.
- Kept `.github/workflows/deploy-railway.yml` as the single authoritative Railway deploy workflow.
- Preserved the dedicated deploy workflow's `18`-attempt SHA verification budget from `main` instead of reintroducing a stale forked verifier contract.
- Added workflow regression coverage so CI stays test-only and the dedicated deploy workflow keeps the Railway-specific logic.
Problem verified before the fix:
- PR `#287` merged as commit `df5f93d`, but Railway kept serving the previous build SHA `93daccd` for the full `8 x 10s` verification window.
- Failed deploy run `23354231413` died in `Verify deployment health`, not in `railway up`.
- The same merge SHA still passed `CI`, `CodeQL`, and `Publish to NPM`, which isolated the issue to deployment orchestration rather than application correctness.
Commands run:
node --test tests/deployment.test.js
npm test
npm run test:coverage
THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:adapters
THUMBGATE_AUTOMATION_PROOF_DIR="$(mktemp -d)" npm run prove:automation
npm run self-heal:check
git diff --check
Observed results:
- `node --test tests/deployment.test.js` exited `0`: `15/15` pass.
- `npm test` exited `0`.
- `npm run test:coverage` exited `0` with all-files coverage at `89.69%` statements, `75.76%` branches, and `93.14%` functions.
- `npm run prove:adapters` exited `0`: `48/48` pass.
- `npm run prove:automation` exited `0`: `55/55` pass.
- `npm run self-heal:check` exited `0`: `Overall: HEALTHY` with `4/4` healthy checks.
- `git diff --check` exited `0`.
Artifacts updated:
- `.github/workflows/ci.yml`
- `docs/VERIFICATION_EVIDENCE.md`
- `tests/deployment.test.js`
Scope:
- Fixed the public `/.well-known/mcp/server-card.json` route to include full MCP `inputSchema` metadata for every tool, instead of only name and description.
- Added an HTTP-level regression test proving the server card exposes tool schemas for directory scanners.
Problem verified before the fix:
- The public Smithery page for `thumbgate-loop/thumbgate-v2` was live, but showed `No capabilities found` and `No deployments found`.
- Production already exposed unauthenticated metadata endpoints:
  - `GET https://thumbgate-production.up.railway.app/.well-known/mcp/server-card.json` -> `200`
  - `GET https://thumbgate-production.up.railway.app/mcp` -> `200`
  - `POST https://thumbgate-production.up.railway.app/mcp` with `initialize` -> `200`
  - `POST https://thumbgate-production.up.railway.app/mcp` with `tools/list` -> `200`
- The bug was that the live server-card route stripped `inputSchema`, which made the static server card materially weaker than `tools/list`.
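A regression check for this class of bug could look like the sketch below. The card shape (`{ tools: [...] }`), the sample tool entry, and both function names are assumptions for illustration; only the `inputSchema` field and the `search_lessons` tool name appear in this document.

```javascript
// Hypothetical tool registry entry; the schema contents are illustrative.
const TOOLS = [
  {
    name: 'search_lessons',
    description: 'Search promoted lessons',
    inputSchema: { type: 'object', properties: { query: { type: 'string' } } },
  },
];

// Sketch: carry inputSchema through to the card instead of dropping it.
function buildServerCard(tools) {
  return {
    tools: tools.map(({ name, description, inputSchema }) => ({ name, description, inputSchema })),
  };
}

// Sketch: list the tools a directory scanner would see without a schema.
function toolsMissingSchema(card) {
  return card.tools
    .filter((tool) => !tool.inputSchema || typeof tool.inputSchema !== 'object')
    .map((tool) => tool.name);
}
```

A test asserting that `toolsMissingSchema` returns an empty list for the served card would have caught the stripped-schema route before a directory listing did.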
Commands run:
npm ci
npm --prefix workers ci
node --test tests/api-server.test.js
npm test
npm run test:coverage
env THUMBGATE_PROOF_DIR="$(mktemp -d)/proof" npm run prove:adapters
env THUMBGATE_AUTOMATION_PROOF_DIR="$(mktemp -d)/proof-automation" npm run prove:automation
npm run self-heal:check
git diff --check
Observed results:
- `node --test tests/api-server.test.js`: `54/54` passing
- `npm test`: exit `0`
- `npm run test:coverage`: exit `0` with `89.58%` lines, `75.61%` branches, `93.07%` functions
- `npm run prove:adapters`: `48/48` passing
- `npm run prove:automation`: `55/55` passing
- `npm run self-heal:check`: `Overall: HEALTHY` with `4/4` healthy checks
- `git diff --check`: exit `0`
Artifacts updated:
- `src/api/server.js`
- `tests/api-server.test.js`
Scope:
- Exposed `buildSha` on `GET /health` from `THUMBGATE_BUILD_SHA`.
- Updated the Railway deploy workflow to set `THUMBGATE_BUILD_SHA` for each deploy and wait until the live `/health` payload reports the exact `GITHUB_SHA`.
- Closed the observed blind spot where a healthy old revision could satisfy the deploy job before the new revision was actually serving traffic.
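The wait-for-revision step can be sketched as a polling loop that compares `buildSha` rather than only the HTTP status. This is an illustration, not the workflow's actual script: `fetchHealth` is an injected, hypothetical probe, and the default budget of 18 attempts with a 10-second delay is an assumption borrowed from figures mentioned elsewhere in this document.

```javascript
// True only when the health payload reports the exact expected revision.
function healthMatches(payload, expectedSha) {
  return Boolean(payload) && payload.buildSha === expectedSha;
}

// Sketch: poll until the live service reports the expected build SHA.
// fetchHealth is a hypothetical async function returning the parsed
// /health JSON; a thrown error counts as "not ready yet".
async function waitForBuildSha(fetchHealth, expectedSha, { attempts = 18, delayMs = 10000 } = {}) {
  for (let attempt = 1; attempt <= attempts; attempt += 1) {
    let payload = null;
    try {
      payload = await fetchHealth();
    } catch {
      payload = null;
    }
    if (healthMatches(payload, expectedSha)) return { ok: true, attempt };
    if (attempt < attempts) await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  return { ok: false, attempt: attempts };
}
```

A healthy old revision fails `healthMatches` on every attempt, so the job can only succeed once the new revision is actually serving traffic.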
Problem verified before the fix:
- PR `#285` merged cleanly and GitHub marked `Deploy to Railway` successful.
- The live public endpoint still served the pre-fix server-card shape immediately after that success signal.
- Railway runtime proof showed a new deployment existed, but the GitHub workflow only checked for HTTP `200`, not revision identity.
Commands run:
node --test tests/api-server.test.js
npm test
npm run test:coverage
env THUMBGATE_PROOF_DIR="$(mktemp -d)/proof" npm run prove:adapters
env THUMBGATE_AUTOMATION_PROOF_DIR="$(mktemp -d)/proof-automation" npm run prove:automation
npm run self-heal:check
git diff --check
Observed results:
- `node --test tests/api-server.test.js`: `54/54` passing
- `npm test`: exit `0`
- `npm run test:coverage`: exit `0` with `89.58%` lines, `75.59%` branches, `93.07%` functions
- `npm run prove:adapters`: `48/48` passing
- `npm run prove:automation`: `55/55` passing
- `npm run self-heal:check`: `Overall: HEALTHY` with `4/4` healthy checks
- `git diff --check`: exit `0`
Artifacts updated:
- `.github/workflows/deploy-railway.yml`
- `src/api/server.js`
- `tests/api-server.test.js`
Scope:
- Added a least-privilege `dispatch` MCP profile for remote review, recall, planning, diagnostics, and metrics.
- Blocked handoff and write workflows when `THUMBGATE_MCP_PROFILE=dispatch`.
- Added a `dispatch` CLI brief so paired-device operators can get a phone-safe operational snapshot without opening write-capable surfaces.
- Updated docs so Dispatch usage routes code and memory mutations back into a dedicated worktree with the `default` profile.
Commands run:
npm ci
node --test tests/mcp-policy.test.js tests/agent-readiness.test.js tests/delegation-runtime.test.js tests/dispatch-brief.test.js tests/cli.test.js
npm run test:cli
npm test
npm run test:coverage
THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:adapters
THUMBGATE_AUTOMATION_PROOF_DIR="$(mktemp -d)" npm run prove:automation
npm run self-heal:check
Observed results:
- Targeted Dispatch lane tests: `55/55` pass, `0` fail.
- `npm run test:cli`: `82/82` pass, `0` fail.
- `npm test`: exit `0`.
- `npm run test:coverage`: exit `0`.
  - all files: `89.63%` statements, `75.35%` branches, `93.03%` functions.
- `npm run prove:adapters`: `48/48` pass, `0` fail.
- `npm run prove:automation`: `55/55` pass, `0` fail.
- `npm run self-heal:check`: `Overall: HEALTHY`, `4/4` healthy.
Behavioral proof points:
- `dispatch` profile exposes metrics, diagnostics, recall, rule inspection, and planning tools while denying `capture_feedback` and `start_handoff`.
- Permission readiness reports `dispatch` as `writeCapable: false` with explicit guidance to switch back to `default` in a dedicated worktree before edits.
- Delegation runtime treats `dispatch` as a single-agent review profile and rejects handoff starts with a `dispatch_profile` block reason.
- `dispatch --json` emits a remote brief with allowed tasks, blocked tasks, key metrics, and prompt templates for phone-safe usage.
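The profile gating above amounts to an allowlist lookup before a tool call runs. The sketch below is illustrative: the allowlist contents and the `authorize` function are assumptions (the real lists presumably live in `config/mcp-allowlists.json`), while the tool names and the `dispatch_profile` block reason come from the proof points above.

```javascript
// Hypothetical allowlists; the shipped lists are larger.
const ALLOWLISTS = {
  default: ['search_lessons', 'capture_feedback', 'start_handoff'],
  dispatch: ['search_lessons'],
};

// Sketch: decide whether a tool call is allowed under the active profile.
function authorize(profileName, toolName) {
  const allowlist = ALLOWLISTS[profileName];
  if (!allowlist) return { allowed: false, reason: 'unknown_profile' };
  if (!allowlist.includes(toolName)) {
    return { allowed: false, reason: `${profileName}_profile` };
  }
  return { allowed: true, reason: null };
}

// The active profile would come from the environment, defaulting to "default".
const activeProfile = process.env.THUMBGATE_MCP_PROFILE || 'default';
```

Denying by omission means a newly added write tool stays blocked for `dispatch` until someone explicitly allows it.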
Artifacts updated:
- `docs/guides/dispatch-ops.md`
- `docs/guides/mcp-use-integration.md`
- `docs/PLUGIN_DISTRIBUTION.md`
- `docs/marketing/mcp-directories.md`
Scope:
- Repo-wide technical debt sweep from a dedicated worktree rooted at `origin/main`.
- PR manager merge gate hardening for pending CI and required-review blockers.
- Python trainer cleanup plus new CI smoke coverage for the tracked Python script.
- Direct dependency drift reduction for `@google/genai`.
Commands run:
npm ci
node --test tests/pr-manager.test.js tests/train-from-feedback.test.js
python3 -m py_compile scripts/train_from_feedback.py
npm test
npm run test:coverage
npm run prove:adapters
npm run prove:automation
npm run self-heal:check
Observed results:
- `npm ci`: clean install, `0 vulnerabilities`.
- `node --test tests/pr-manager.test.js tests/train-from-feedback.test.js`: `12` passed, `0` failed.
- `python3 -m py_compile scripts/train_from_feedback.py`: exit `0`.
- `npm test`: exit `0`.
- `npm run test:coverage`: exit `0`, `all files | 89.60 | 75.65 | 93.07`.
- `npm run prove:adapters`: `48/48` passed.
- `npm run prove:automation`: `55/55` passed.
- `npm run self-heal:check`: `Overall: HEALTHY`, `4/4 healthy`.
Behavioral proof points:
- Autonomous PR merges now stop on pending checks instead of treating them as ready.
- `REVIEW_REQUIRED` is now treated as an explicit blocker for autonomous merging.
- The tracked Python trainer is CI-smoke-tested so syntax regressions fail fast.
- The Python trainer no longer repeats category initialization logic or carries stale repository guidance.
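The merge-gate behavior described above can be sketched as a pure function that lists blockers. The PR object shape and the `mergeBlockers` name are hypothetical; the `status`, `conclusion`, and `reviewDecision` values follow GitHub's check-run and review-decision vocabulary. As a simplifying assumption, this sketch treats any conclusion other than `success` as a failure.

```javascript
// Sketch: an autonomous merge proceeds only when this list is empty.
function mergeBlockers(pr) {
  const blockers = [];
  for (const check of pr.checks) {
    if (check.status !== 'completed') {
      // Pending or queued checks block instead of counting as ready.
      blockers.push(`pending_check:${check.name}`);
    } else if (check.conclusion !== 'success') {
      blockers.push(`failed_check:${check.name}`);
    }
  }
  if (pr.reviewDecision === 'REVIEW_REQUIRED') blockers.push('review_required');
  return blockers;
}
```

Returning named blockers rather than a boolean lets the PR manager log why it stopped, which makes a skipped merge auditable.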
Scope:
- PR-management reliability when operating from a branch without an attached PR.
- Default test-gate completeness for repository test files.
- Removal of stale tracked test-output artifacts.
- Fresh technical-debt audit snapshot and verification evidence.
Baseline before changes:
- Tracked files: `573`
- Tracked lines: `115434`
- Coverage baseline from a separate clean `origin/main` worktree:
  - lines: `89.50%`
  - branches: `75.64%`
  - functions: `92.90%`
- `main` GitHub CI status on `fb78e8ae1a36dbdb92dd93867a278c60c92a41c0`: passing
Audit findings fixed:
- `npm run pr:manage` failed with `no pull requests found for branch ...` on clean worktree branches.
- `npm test` omitted `23` repository test files despite those tests passing independently.
- There was no regression guard to stop future `npm test` drift from the actual `tests/**/*.test.js` inventory.
- `test_output.txt` was a checked-in command transcript with no code or documentation references.
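A parity guard of the kind described can be sketched as a comparison between the test files referenced by package scripts and the files actually on disk. This is an illustration under assumptions: `findOmittedTests` is a hypothetical name, and it assumes scripts reference test files by literal `tests/...test.js` paths rather than globs.

```javascript
// Sketch: return test files that no package script references.
// scripts:   the "scripts" object from package.json
// inventory: repo-relative paths matching tests/**/*.test.js
function findOmittedTests(scripts, inventory) {
  const referenced = new Set();
  for (const command of Object.values(scripts)) {
    for (const match of command.matchAll(/tests\/[\w./-]+\.test\.js/g)) {
      referenced.add(match[0]);
    }
  }
  return inventory.filter((file) => !referenced.has(file)).sort();
}
```

A test that asserts this list is empty fails as soon as a new test file lands without being wired into a script, which is the drift the audit found.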
Targeted proof commands:
node --test tests/contextfs.test.js tests/feedback-to-memory.test.js tests/vector-store.test.js
node --test tests/mcp-server.test.js tests/intent-router.test.js tests/async-job-runner.test.js
node --test tests/pr-manager.test.js tests/test-suite-parity.test.js
npm run test:ops
npm run pr:manage
Observed targeted results:
- Local memory/RAG proof batch: `27/27` passing.
- Orchestration proof batch: `53/53` passing.
- PR-manager + parity guard batch: `9/9` passing.
- `npm run test:ops`: `171/171` passing.
- `npm run pr:manage`: clean noop with `[PR Manager] No open pull requests found.`
Full verification commands:
npm ci
npm --prefix workers ci
npm test
npm run test:coverage
env THUMBGATE_PROOF_DIR="$(mktemp -d)/proof" npm run prove:adapters
env THUMBGATE_AUTOMATION_PROOF_DIR="$(mktemp -d)/proof-automation" npm run prove:automation
npm run self-heal:check
npm audit --json
npm --prefix workers audit --json
git diff --check
Observed final results:
- `npm ci`: exit `0`
- `npm --prefix workers ci`: exit `0`
- `npm test`: exit `0`
- `npm run test:coverage`: exit `0`
  - lines: `89.57%`
  - branches: `75.48%`
  - functions: `93.06%`
- `npm run prove:adapters`: `48/48` passing
- `npm run prove:automation`: `55/55` passing
- `npm run self-heal:check`: `Overall: HEALTHY` with `4/4 healthy`
- `npm audit --json`: `0` vulnerabilities
- `npm --prefix workers audit --json`: `0` vulnerabilities
- `git diff --check`: exit `0`
Artifacts updated:
- `docs/TECHNICAL_DEBT_AUDIT.md`
Scope:
- Hosted-first operator truth for `north-star` and `dashboard`.
- Admin-only workflow sprint state advancement from `new -> qualified -> named_pilot -> proof_backed_run -> paid_team`.
- Pricing-decision Sprint CTA at the same moment buyers currently choose Pro.
- OpenAPI parity for the new sprint advancement route and dashboard window parameters.
Key files changed:
- `scripts/operational-dashboard.js`
- `scripts/dashboard.js`
- `src/api/server.js`
- `scripts/workflow-sprint-intake.js`
- `scripts/workflow-runs.js`
- `bin/cli.js`
- `public/index.html`
- `openapi/openapi.yaml`
- `adapters/chatgpt/openapi.yaml`
Targeted proof commands:
node --test tests/workflow-runs.test.js tests/workflow-sprint-intake.test.js tests/public-landing.test.js
node --test --test-concurrency=1 tests/api-server.test.js tests/openapi-parity.test.js tests/telemetry-analytics.test.js
node --test tests/cli.test.js tests/revenue-status.test.js
Targeted proof results:
- Workflow + landing batch: `17` tests passed, `0` failed.
- API + OpenAPI + telemetry batch: `69` tests passed, `0` failed.
- CLI + revenue-status batch: `41` tests passed, `0` failed.
Behavioral proof points:
- `POST /v1/intake/workflow-sprint/advance` is admin-only and rejects non-static billing keys with `403`.
- Sprint lead advancement appends immutable lead snapshots, creates workflow-run evidence for `named_pilot`, `proof_backed_run`, and `paid_team`, and preserves deduplicated North Star counts.
- `GET /v1/dashboard` now accepts `window`, `timezone`, and `now`, and its revenue/traffic numbers follow the live billing-summary path for that window.
- `north-star` now prefers the hosted operational dashboard when `THUMBGATE_BILLING_API_BASE_URL`, `THUMBGATE_API_KEY`, and `THUMBGATE_METRICS_SOURCE=hosted` are configured.
- The pricing section now includes `data-cta-id="pricing_sprint"` pointing directly to `#workflow-sprint-intake`.
- Canonical OpenAPI and ChatGPT adapter specs stay byte-aligned after adding `/v1/intake/workflow-sprint/advance`.
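The sprint state advancement can be pictured as an append-only state machine. In the sketch below the state names and the evidence-bearing states come from this section; the strict one-step-at-a-time rule, the snapshot fields, and the `advanceLead` name are assumptions made for illustration.

```javascript
const SPRINT_STATES = ['new', 'qualified', 'named_pilot', 'proof_backed_run', 'paid_team'];
const EVIDENCE_STATES = new Set(['named_pilot', 'proof_backed_run', 'paid_team']);

// Sketch: append a frozen snapshot for the next state; never mutate history.
// Assumption: a lead may only move one step forward at a time.
function advanceLead(ledger, leadId, nextState, at = new Date().toISOString()) {
  const history = ledger.filter((entry) => entry.leadId === leadId);
  const current = history.length ? history[history.length - 1].state : null;
  const currentIndex = current === null ? -1 : SPRINT_STATES.indexOf(current);
  const nextIndex = SPRINT_STATES.indexOf(nextState);
  if (nextIndex === -1) throw new Error(`unknown state: ${nextState}`);
  if (nextIndex !== currentIndex + 1) {
    throw new Error(`invalid transition: ${current} -> ${nextState}`);
  }
  const snapshot = Object.freeze({
    leadId,
    state: nextState,
    at,
    createsWorkflowRun: EVIDENCE_STATES.has(nextState), // evidence-bearing states
  });
  return [...ledger, snapshot];
}
```

Because every advancement returns a new ledger with one extra frozen entry, counting distinct `leadId` values stays deduplicated no matter how many snapshots a lead accumulates.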
Full verification protocol:
npm ci
npm test
npm run test:coverage
env THUMBGATE_PROOF_DIR="$(mktemp -d)/proof" npm run prove:adapters
env THUMBGATE_AUTOMATION_PROOF_DIR="$(mktemp -d)/proof-automation" npm run prove:automation
npm run self-heal:check
git diff --check
Observed results:
- `npm ci`: exit `0`; audit reported `0` vulnerabilities.
- `npm test`: exit `0`.
- `npm run test:coverage`: exit `0` with aggregate coverage:
  - line coverage: `89.53%`
  - branch coverage: `75.73%`
  - function coverage: `93.02%`
- `npm run prove:adapters`: exit `0` with `48` passed, `0` failed.
- `npm run prove:automation`: exit `0` with `55` passed, `0` failed.
- `npm run self-heal:check`: `Overall: HEALTHY` with `4/4` healthy checks.
- `git diff --check`: exit `0`.
Low-debt implementation notes:
- No new dependencies were added.
- Hosted metrics reuse `getBillingSummaryLive()` plus the existing dashboard generator rather than creating a second analytics stack.
- Sprint state transitions reuse the existing append-only lead ledger and workflow-run ledger rather than introducing a new database path.
Scope:
- Read-time GitHub Marketplace amount reconciliation for legacy paid revenue rows that were previously persisted with `amountKnown: false`.
- Explicit dry-run/write repair command for the local gitignored revenue ledger.
- Marketplace pricing metadata capture on new webhook writes so future repairs are auditable.
Commands run:
node --test tests/billing.test.js
node --test tests/github-billing.test.js
node --test tests/cli.test.js
npx thumbgate repair-github-marketplace
npx thumbgate repair-github-marketplace --write
Observed results:
- `tests/billing.test.js` passes the new backfill coverage:
  - summary books revenue from a legacy GitHub Marketplace row at read time when configured pricing is available
  - `repairGithubMarketplaceRevenueLedger({ write: true })` rewrites the local ledger with amount, currency, interval, and repair metadata
- `tests/github-billing.test.js` confirms new Marketplace writes now persist billing cycle, unit count, price model, and pricing source metadata
- `tests/cli.test.js` confirms `repair-github-marketplace` supports both preview mode and `--write`
Behavioral proof points:
- Legacy GitHub Marketplace rows no longer stay stranded as permanent `amountKnown: false` entries when a trusted plan-price mapping exists.
- The billing summary can surface booked revenue truth immediately from reconciled legacy Marketplace rows before a write-back is applied.
- The explicit repair command materializes that truth into the local `.thumbgate` or legacy feedback ledger without fabricating prices.
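The read-time reconciliation and the preview/write split can be sketched as below. This is not the shipped `repairGithubMarketplaceRevenueLedger`: the row fields other than `amountKnown`, the `planId` key, the price table, and the repair metadata shape are all hypothetical placeholders.

```javascript
// Hypothetical trusted plan-price mapping (values are placeholders).
const PLAN_PRICES = {
  'plan-a': { amount: 1000, currency: 'usd', interval: 'month' },
};

// Sketch: fill in the amount only when a trusted mapping exists.
// Rows without a mapping are returned untouched, so no price is invented.
function reconcileRow(row, planPrices) {
  if (row.amountKnown !== false) return row;
  const price = planPrices[row.planId];
  if (!price) return row;
  return { ...row, ...price, amountKnown: true, repair: { source: 'plan_price_mapping' } };
}

// Sketch: preview reports what would change; write returns the repaired rows.
function repairLedger(rows, planPrices, { write = false } = {}) {
  const repaired = rows.map((row) => reconcileRow(row, planPrices));
  const changed = repaired.filter((row, index) => row !== rows[index]).length;
  return { changed, rows: write ? repaired : rows };
}
```

The same `reconcileRow` step can serve both the read path and the repair command, which keeps the summary and the written ledger from disagreeing.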
Scope:
- Tightened the landing page around the Workflow Hardening Sprint as the front-line commercial motion.
- Added a current sprint brief for one workflow, one owner, and one proof review.
- Aligned README, pitch, Anthropic partner strategy, outreach targets, cold outreach, LinkedIn, Reddit, and X assets to the same workflow-hardening story.
- Added regression coverage so the public and sales surfaces do not drift back to generic AI-employee or infrastructure-first language.
Commands run:
npm ci
node --test tests/public-landing.test.js tests/api-server.test.js tests/social-marketing-assets.test.js tests/version-metadata.test.js tests/anthropic-partner-strategy.test.js tests/workflow-hardening-sprint.test.js
npm test
npm run test:coverage
THUMBGATE_PROOF_DIR=/tmp/thumbgate-workflow-hardening-20260317T133407/proof/compatibility npm run prove:adapters
THUMBGATE_AUTOMATION_PROOF_DIR=/tmp/thumbgate-workflow-hardening-20260317T133407/proof/automation npm run prove:automation
npm run self-heal:check
Observed results:
- Targeted GTM regression suite: `58` pass, `0` fail.
- `npm test`: pass.
- `npm run test:coverage`: pass with Node test runner coverage summary:
  - line coverage: `84.39%`
  - branch coverage: `70.73%`
  - function coverage: `87.26%`
- `npm run prove:adapters`: pass with `46` passed, `0` failed.
- `npm run prove:automation`: pass with `55` passed, `0` failed.
- `npm run self-heal:check`: `Overall: HEALTHY` with `4/4` healthy checks.
- Proof artifacts for adapter and automation verification were redirected to `/tmp/thumbgate-workflow-hardening-20260317T133407/proof` so the clean worktree did not churn tracked `proof/` artifacts.
Behavioral proof points:
- `public/index.html` now sells the Workflow Hardening Sprint first, keeps Pro truthful and secondary, exposes a proof CTA, and adds Sprint FAQ/schema support without fake partner claims.
- `docs/WORKFLOW_HARDENING_SPRINT.md` now defines the actual service offer, qualification rules, deliverables, contact path, and proof-pack requirement.
- `docs/PITCH.md`, `docs/ANTHROPIC_MARKETPLACE_STRATEGY.md`, `docs/OUTREACH_TARGETS.md`, and `docs/marketing/cold-outreach-sequence.md` now align on the same 30-day revenue motion: founder-led outbound, one workflow, one owner, one proof review.
- `docs/marketing/social-posts.md`, `docs/marketing/linkedin-ai-reliability-post.md`, `docs/marketing/reddit-posts.md`, and `docs/marketing/x-launch-thread.md` now frame the product as workflow hardening instead of generic AI-employee hype.
- `tests/public-landing.test.js`, `tests/api-server.test.js`, `tests/social-marketing-assets.test.js`, `tests/version-metadata.test.js`, `tests/anthropic-partner-strategy.test.js`, and `tests/workflow-hardening-sprint.test.js` now guard the new commercial story against future drift.
Artifacts updated:
- `README.md`
- `docs/WORKFLOW_HARDENING_SPRINT.md`
- `docs/PITCH.md`
- `docs/ANTHROPIC_MARKETPLACE_STRATEGY.md`
- `docs/OUTREACH_TARGETS.md`
- `docs/marketing/cold-outreach-sequence.md`
- `docs/marketing/social-posts.md`
- `docs/marketing/linkedin-ai-reliability-post.md`
- `docs/marketing/reddit-posts.md`
- `docs/marketing/x-launch-thread.md`
- `public/index.html`
Scope:
- Fixed `scripts/self-healing-check.js` so proof-bearing health checks run with an isolated temporary `THUMBGATE_PROOF_DIR`.
- Prevented `self-heal:check` from failing on clean merge commits due to shared tracked `proof/` artifacts instead of real behavioral regressions.
- Added regression coverage to prove the health checker both injects and cleans temporary proof directories.
Commands run:
git diff --check
node --test tests/self-healing-check.test.js
npm ci
npm test
npm run test:coverage
env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:adapters
env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:automation
npm run self-heal:check
Observed results:
- `git diff --check`: completed cleanly.
- `node --test tests/self-healing-check.test.js`: `14` passed, `0` failed.
- `npm ci`: completed successfully; `audited 151 packages` and `found 0 vulnerabilities`.
- `npm test`: passed.
- `npm run test:coverage`: `1100` tests, `1099` passed, `0` failed, `1` skipped; coverage `84.40%` lines, `70.77%` branches, `87.18%` functions.
- `env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:adapters`: `46` passed, `0` failed.
- `env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:automation`: `55` passed, `0` failed.
- `npm run self-heal:check`: `Overall: HEALTHY` with `4/4` healthy checks.
Behavioral proof points:
- `DEFAULT_CHECKS` now marks both `prove_adapters` and `prove_automation` for proof-directory isolation.
- `collectHealthReport` provisions a temp `THUMBGATE_PROOF_DIR` per proof check and removes it after execution.
- The repaired `self-heal:check` now stays healthy under the same heavy `tests + prove_*` workload that failed on merge commit `9b5f5a1`.
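The provision-then-clean pattern for proof directories can be sketched with Node's standard library. `withIsolatedProofDir` is a hypothetical wrapper, not the code in `scripts/self-healing-check.js`; it assumes the check is a synchronous callback that receives the environment to run under.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Sketch: run one check against a fresh temp proof dir, then remove it,
// so tracked proof/ artifacts are never touched by the health check.
function withIsolatedProofDir(runCheck) {
  const proofDir = fs.mkdtempSync(path.join(os.tmpdir(), 'thumbgate-proof-'));
  try {
    return runCheck({ ...process.env, THUMBGATE_PROOF_DIR: proofDir });
  } finally {
    // Cleanup runs even when the check throws.
    fs.rmSync(proofDir, { recursive: true, force: true });
  }
}
```

A caller would pass the returned environment to whatever spawns the proof command, for example as the `env` option of `child_process.spawnSync`.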
Artifacts updated:
- `docs/VERIFICATION_EVIDENCE.md`
Scope:
- Tighten the public category from generic memory phrasing to an AI reliability system for one sharp agent.
- Add optional GA4 and Google Search Console support alongside the existing Plausible + first-party telemetry stack.
- Auto-record SEO landing views from organic and AI-search referrers.
- Surface instrumentation readiness directly in the dashboard so traffic, funnel, revenue, and attribution gaps are explicit.
Commands run in the implementation worktree:
npm ci
npm test
npm run test:coverage
env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:adapters
env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:automation
npm run self-heal:check
Observed results:
- `npm ci`: passed, `150` packages installed, `0` vulnerabilities.
- `npm test`: passed on `feat/growth-observability`.
- `npm run test:coverage`: passed with overall coverage `84.37%` lines / `70.58%` branches / `87.17%` functions.
- `npm run prove:adapters`: passed, `46/46`.
- `npm run prove:automation`: passed, `55/55`.
- `npm run self-heal:check`: `Overall: HEALTHY`, `4/4` healthy.
- `git diff --check`: clean before commit.
Behavioral proof points:
- The landing page keeps Plausible and first-party telemetry, and now injects GA4 and Search Console only when explicit env vars are set.
- Search and AI-search referrers now produce `seo_landing_view` telemetry instead of hiding in generic landing-page traffic.
- The dashboard now reports whether traffic analytics, SEO verification, buyer-loss capture, and revenue attribution are configured and actually receiving events.
- Public and active product copy now lead with AI reliability without orchestration tax instead of drifting back toward generic memory-layer framing.
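Referrer-based classification of landing views might look like the sketch below. Only the `seo_landing_view` event name comes from this section; the host lists, the `landing_view` fallback event, the channel labels, and the `classifyLandingView` name are assumptions for illustration.

```javascript
// Hypothetical referrer host lists; a real deployment would maintain its own.
const SEARCH_HOSTS = ['google.com', 'bing.com', 'duckduckgo.com'];
const AI_SEARCH_HOSTS = ['perplexity.ai', 'chatgpt.com'];

function hostMatches(host, candidates) {
  return candidates.some((candidate) => host === candidate || host.endsWith(`.${candidate}`));
}

// Sketch: map a Referer header value to a telemetry event and channel.
function classifyLandingView(referrer) {
  let host;
  try {
    host = new URL(referrer).hostname.replace(/^www\./, '');
  } catch {
    // Missing or malformed referrer: treat as direct traffic.
    return { event: 'landing_view', channel: 'direct' };
  }
  if (hostMatches(host, AI_SEARCH_HOSTS)) return { event: 'seo_landing_view', channel: 'ai_search' };
  if (hostMatches(host, SEARCH_HOSTS)) return { event: 'seo_landing_view', channel: 'organic_search' };
  return { event: 'landing_view', channel: 'referral' };
}
```

Recording the channel alongside the event keeps organic and AI-search arrivals separable in the dashboard instead of folding them into one landing-page count.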
Scope:
- Repositioned the public landing page around Claude workflow hardening, code modernization, and consultancy/platform-team use cases while keeping the no-orchestration-tax core message intact.
- Added a proof-forward hero CTA and explicit proof-pack link to `VERIFICATION_EVIDENCE.md`.
- Rewrote `docs/ANTHROPIC_MARKETPLACE_STRATEGY.md` as the current Anthropic partner strategy for Claude workflow hardening with packaged offers, buyer story, proof-pack rules, and claim hygiene.
- Updated `docs/marketing/x-launch-thread.md` to a role-based workflow-hardening thread aligned with the public landing message.
- Added regression coverage for the new partner strategy, landing copy, API root rendering, social-marketing messaging, and version-metadata expectations.
Commands run:
npm ci
node --test tests/public-landing.test.js tests/api-server.test.js tests/anthropic-partner-strategy.test.js
npm test
npm run test:coverage
npm run prove:adapters
npm run prove:automation
npm run self-heal:check
Observed results:
- Targeted partner/landing/API tests: pass (`43` pass, `0` fail).
- `npm test`: pass.
- `npm run test:coverage`: pass with overall coverage:
  - line coverage: `84.35%`
  - branch coverage: `70.74%`
  - function coverage: `87.14%`
- `npm run prove:adapters`: pass with `46` pass, `0` fail.
- `npm run prove:automation`: pass with `55` pass, `0` fail.
- `npm run self-heal:check`: `Overall: HEALTHY` with `4/4` healthy checks.
Behavioral proof points:
- `public/index.html` now sells the product as Claude workflow hardening with seven concrete buyer/use-case cards, three packaged offers, and a proof-pack CTA instead of generic continuity-only framing.
- `public/index.html` preserves `SoftwareApplication` and `FAQPage` JSON-LD while adding consultancy/code-modernization FAQ coverage and keeping the no-orchestration-tax contract intact.
- `docs/ANTHROPIC_MARKETPLACE_STRATEGY.md` is now a current-state partner strategy doc, not a stale historical note, and explicitly forbids false partner-membership claims while linking commercial truth and proof.
- `docs/marketing/x-launch-thread.md` now aligns the social message with workflow hardening and code modernization instead of generic "AI employee" hype.
- `tests/public-landing.test.js`, `tests/api-server.test.js`, `tests/anthropic-partner-strategy.test.js`, `tests/social-marketing-assets.test.js`, and `tests/version-metadata.test.js` enforce the new GTM messaging and claim-hygiene contracts.
Scope:
- Repositioned the active social launch copy from a generic memory tool toward an AI reliability system for coding agents.
- Added a canonical operator kit for LinkedIn, X, and Reddit under `docs/marketing/`.
- Added local/private SVG source assets for a six-slide LinkedIn carousel and an X summary card under `docs/marketing/assets/`.
- Added a regression test to keep the new positioning and asset inventory from drifting.
Commands run:
node --test tests/social-marketing-assets.test.js
npm run test:workflow
git diff --check
Observed results:
- tests/social-marketing-assets.test.js: pass
- npm run test:workflow: pass
- git diff --check: clean
Behavioral proof points:
- docs/marketing/social-posts.md is now the canonical social launch kit and points to current LinkedIn, X, and Reddit assets instead of older memory-first launch copy.
- docs/marketing/linkedin-ai-reliability-post.md contains the current long-form founder post plus the six-slide carousel script and first-comment CTA.
- docs/marketing/x-launch-thread.md contains the current nine-post thread focused on reliability, not just memory.
- docs/marketing/reddit-posts.md contains the current r/ClaudeCode post plus a showcase-safe r/ClaudeAI variant.
- docs/marketing/assets/ contains local/private export-ready SVG assets for LinkedIn and X, avoiding shared-workspace dependency for final posting assets.
Scope:
- Repositioned the public landing page and package metadata around reliability without orchestration or subagent handoff overhead.
- Added explicit FAQ and hero copy that keeps one sharp agent as the primary product story.
- Tightened the continuity guide so it clearly frames the Gateway as the downstream reliability layer, not another planner or swarm.
- Added a positioning contract test so README, package metadata, guide copy, and landing-page assertions cannot drift back to generic memory-layer messaging.
Commands run:
node --test tests/public-landing.test.js tests/positioning-contract.test.js
npm test
npm run test:coverage
env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:adapters
env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:automation
npm run self-heal:check
npm run test:workflow
git diff --check
Observed results:
- tests/public-landing.test.js: pass
- tests/positioning-contract.test.js: pass
- npm test: pass
- npm run test:coverage: pass
  - 1094 tests, 1093 passed, 0 failed, 1 skipped
  - coverage 84.39% lines, 70.80% branches, 87.14% functions
- npm run prove:adapters: pass, 46/46
- npm run prove:automation: pass, 55/55
- npm run self-heal:check: Overall: HEALTHY, 4/4 healthy
- npm run test:workflow: pass
- git diff --check: clean
Behavioral proof points:
- public/index.html now promises "Keep one sharp agent" and explicitly says the Gateway works without another orchestration layer or subagent handoff tax.
- public/index.html FAQ now answers whether subagents or orchestration are required and states that the product is meant to keep one sharp agent on task.
- README.md now leads with "Local-first reliability layer for AI coding agents" instead of generic context-and-memory phrasing.
- package.json now carries reliability-over-orchestration positioning into npm and marketplace metadata.
- docs/guides/continuity-tools-integration.md now documents the recommended split: continuity upstream, one base agent doing the work, Gateway downstream as the reliability layer.
- docs/marketing/LAUNCH_CONTENT.md now aligns older launch variants with the reliability-without-orchestration story instead of stale persistent-memory-first copy.
- tests/positioning-contract.test.js now guards the launch-content variants as well, so active GTM docs cannot silently drift back to memory-layer messaging.
Scope:
- Added a repo-root Cursor marketplace manifest at .cursor-plugin/marketplace.json.
- Added a dedicated Cursor plugin bundle in plugins/cursor-marketplace/ with .cursor-plugin/plugin.json, .mcp.json, README, and committed logo asset.
- Switched the Cursor launcher to the portable published package entrypoint npx -y [email protected] serve instead of any checkout-local absolute path.
- Removed the stale .mcp.json.plugin legacy config file so the repo has one canonical Cursor packaging path.
- Extended scripts/sync-version.js so Cursor manifests and all pinned launcher docs stay version-synced on future releases.
- Added regression coverage for the repo-level marketplace contract, manifest/version consistency, and MCP launcher safety.
Commands run in the dedicated worktree at /private/tmp/thumbgate-cursor-marketplace-20260317T074440Z:
npm ci
npm --prefix workers ci
node scripts/sync-version.js --check
node --test tests/adapters.test.js tests/version-metadata.test.js tests/cursor-plugin.test.js
npm test
npm run test:coverage
env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:adapters
env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:automation
npm run self-heal:check
git diff --check
Observed result:
- npm ci completed with 0 vulnerabilities.
- npm --prefix workers ci completed with 0 vulnerabilities.
- node scripts/sync-version.js --check: All 16 targets in sync at v0.7.1.
- Targeted Cursor packaging regressions passed: 18 tests passed, 0 failed.
- npm test passed end-to-end on the Cursor marketplace branch.
- npm run test:coverage passed with all-files coverage of 83.92% lines, 70.52% branches, and 86.81% functions.
- env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:adapters: 46 passed, 0 failed.
- env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:automation: 47 passed, 0 failed.
- npm run self-heal:check: Overall: HEALTHY with 4/4 healthy checks.
- git diff --check completed cleanly.
Requirements verified:
- The Cursor marketplace root manifest resolves to a valid repo-relative plugin directory.
- The Cursor marketplace manifest, Cursor plugin manifest, Claude plugin manifest, and package version remain synchronized.
- The Cursor plugin launcher uses the published npm package and does not hardcode /Users/... checkout paths.
- The multi-plugin marketplace contract is internally consistent: the marketplace entry name matches the plugin manifest name.
- Version-sync automation now owns the pinned Cursor launcher docs instead of leaving release drift behind.
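The two checks behind these requirements (version agreement across manifests, and a launcher free of checkout-local paths) can be sketched roughly as follows. This is an illustration only: `findVersionDrift`, `isPortableLauncher`, the `docs/example.md` target, and the sample launcher strings are hypothetical and are not taken from scripts/sync-version.js or the repository's tests.

```javascript
// Hypothetical sketch; none of these helpers come from scripts/sync-version.js.

// Return the names of targets whose version differs from the package version.
function findVersionDrift(packageVersion, targets) {
  return targets
    .filter((target) => target.version !== packageVersion)
    .map((target) => target.name);
}

// Treat a launcher command as portable when no argument starts with an
// absolute home-directory style path (macOS, Linux, or a Windows drive).
function isPortableLauncher(command) {
  return !/(^|\s)(\/Users\/|\/home\/|[A-Za-z]:\\)/.test(command);
}

const demoTargets = [
  { name: '.cursor-plugin/marketplace.json', version: '0.7.1' },
  { name: 'plugins/cursor-marketplace/.cursor-plugin/plugin.json', version: '0.7.1' },
  { name: 'docs/example.md', version: '0.7.0' }, // deliberately stale, illustrative only
];

console.log(findVersionDrift('0.7.1', demoTargets)); // [ 'docs/example.md' ]
console.log(isPortableLauncher('node /Users/example/checkout/bin/cli.js serve')); // false
console.log(isPortableLauncher('npx -y some-package serve')); // true
```

A check like this fails fast on the first stale target, which is the behavior the `--check` run above reports as "in sync" when the drift list is empty.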
Scope:
- Added a dedicated revenue ledger to separate booked revenue from generic paid-stage funnel telemetry.
- Preserved honest provider coverage: Stripe records booked revenue; GitHub Marketplace records paid orders by default and records booked revenue when the webhook payload carries plan pricing or plan pricing is explicitly configured.
- Threaded attribution metadata (source, UTM fields, referrer, landing path, CTA id) through public checkout creation, funnel events, revenue events, API summaries, CLI CFO output, and the hosted landing page.
- Replaced hardcoded marketing proof-strip vanity numbers with stable evidence-backed claims on the public landing page.
Commands run:
npm ci
env THUMBGATE_API_KEY=test-api-key node --test tests/billing.test.js tests/api-server.test.js tests/github-billing.test.js tests/cli.test.js tests/stripe-webhook-route.test.js
env THUMBGATE_API_KEY=test-api-key node --test tests/openapi-parity.test.js tests/adapters.test.js tests/commerce-quality.test.js
env THUMBGATE_API_KEY=ci-secret npm test
env THUMBGATE_API_KEY=ci-secret npm run test:coverage
npm run prove:adapters
npm run prove:automation
npm run self-heal:check
Observed results:
- npm ci: completed successfully; audited 151 packages and found 0 vulnerabilities.
- Targeted changed-surface suite: 76 passed, 0 failed.
- OpenAPI / adapter / commerce suite: 27 passed, 0 failed.
- npm test: completed successfully across schema, loop, API, proof, E2E, billing, CLI, watcher, workflow, autoresearch, gates, and hardening phases.
- npm run test:coverage: 971 passed, 0 failed, 1 skipped; coverage 82.59% lines, 68.77% branches, 85.37% functions.
- npm run prove:adapters: 38 passed, 0 failed.
- npm run prove:automation: 37 passed, 0 failed.
- npm run self-heal:check: Overall: HEALTHY with 4/4 healthy checks.
Behavioral proof points:
- scripts/billing.js now emits bookedRevenueCents, paidOrders, amountKnownCoverageRate, unreconciledPaidEvents, and attribution breakdowns from a dedicated revenue ledger instead of inferring money from stage counts.
- tests/billing.test.js proves Stripe booked revenue is summarized truthfully and GitHub Marketplace becomes amount-known when webhook plan pricing is present or explicit plan pricing is configured.
- tests/api-server.test.js proves checkout attribution survives the API path and shows up in the admin billing summary.
- tests/cli.test.js proves node bin/cli.js cfo emits the richer revenue + attribution summary shape.
- tests/github-billing.test.js proves GitHub Marketplace purchase events create paid-order records and promote to booked revenue when webhook pricing or plan-pricing config is present.
- tests/openapi-parity.test.js and tests/adapters.test.js prove the machine-readable adapter surface stayed in sync after the summary shape expansion.
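To make the distinction between paid orders and booked revenue concrete, here is a minimal sketch of how such a summary could be derived from a ledger. The field names mirror the ones listed above, but their exact semantics in scripts/billing.js are not shown in this document; the interpretation below (an entry with an unknown amount counts as a paid order but not as booked revenue), the `summarizeRevenue` helper, the ledger entry shape, and the sample amounts are all assumptions made for illustration.

```javascript
// Hypothetical sketch; the real summary logic lives in scripts/billing.js and may differ.

function summarizeRevenue(ledger) {
  const paidOrders = ledger.length;
  // Assumption: an entry is "amount-known" when amountCents is an integer.
  const known = ledger.filter((entry) => Number.isInteger(entry.amountCents));
  const bookedRevenueCents = known.reduce((sum, entry) => sum + entry.amountCents, 0);

  // Attribution breakdown of booked revenue by source.
  const bookedBySource = {};
  for (const entry of known) {
    const source = entry.source || 'unknown';
    bookedBySource[source] = (bookedBySource[source] || 0) + entry.amountCents;
  }

  return {
    paidOrders,
    bookedRevenueCents,
    amountKnownCoverageRate: paidOrders === 0 ? 0 : known.length / paidOrders,
    unreconciledPaidEvents: paidOrders - known.length,
    bookedBySource,
  };
}

// Illustrative data only, not real revenue.
const demoRevenueLedger = [
  { provider: 'stripe', amountCents: 4900, source: 'reddit' },
  { provider: 'github_marketplace', amountCents: null, source: 'website' },
];

console.log(summarizeRevenue(demoRevenueLedger));
```

Under these assumptions the sample ledger yields two paid orders, 4900 booked cents, a coverage rate of 0.5, and one unreconciled paid event, so money is never inferred from a stage count alone.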
Artifacts updated:
- proof/compatibility/report.json
- proof/compatibility/report.md
- proof/automation/report.json
- proof/automation/report.md
Scope:
- Hardware-aware local embedding profile selection with machine-readable fit evidence.
- Safe fallback embedding profile selection when the primary local profile fails.
- Boosted local risk scorer trained from ThumbGate feedback sequences.
- CLI surface for model-fit, risk, and prove --target=local-intelligence.
Commands run:
npm ci
node --test tests/cli.test.js
npm test
npm run test:coverage
npm run prove:adapters
npm run prove:automation
npm run prove:local-intelligence
npm run self-heal:check
Observed results:
- node --test tests/cli.test.js: 20 passed, 0 failed.
- npm test: all suites pass, including:
  - tests/local-model-profile.test.js
  - tests/risk-scorer.test.js
  - tests/vector-store.test.js
  - tests/feedback-sequences.test.js
  - tests/feedback-loop.test.js
  - tests/prove-local-intelligence.test.js
- npm run test:coverage: pass with overall coverage 82.86% lines, 68.01% branches, 86.00% functions.
- npm run prove:adapters: { "passed": 21, "failed": 0 }
- npm run prove:automation: { "passed": 14, "failed": 0 }
- npm run prove:local-intelligence: Status: PASSED
- npm run self-heal:check: Overall: HEALTHY with 4/4 healthy checks.
Behavioral proof points:
- FIT-01: low-RAM override selects the compact embedding profile and writes model-fit-report.json.
- FIT-02: vector-store falls back to the safe embedding profile when the primary profile load fails.
- RISK-01: feedback capture flow trains and persists risk-model.json from sequence data.
- RISK-02: analytics expose boosted risk summary with exampleCount=6, mode=boosted, and top high-risk domain testing.
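The FIT-01 and FIT-02 behaviors follow a common pattern: pick a profile from available memory, then fall back to a safe profile if loading the primary one throws. The sketch below shows that pattern under stated assumptions. Only the `compact` profile name appears in this document; the `standard` profile, the 16 GB threshold, and the `selectProfile` and `loadWithFallback` helpers are hypothetical.

```javascript
// Hypothetical sketch; profile names other than "compact" and all thresholds are assumed.

const DEMO_PROFILES = [
  { name: 'standard', minRamGb: 16 }, // assumed primary profile
  { name: 'compact', minRamGb: 0 },   // low-RAM profile named in FIT-01
];

// Pick the first profile whose RAM requirement is met and record why.
function selectProfile(ramGb) {
  const chosen = DEMO_PROFILES.find((profile) => ramGb >= profile.minRamGb);
  return { profile: chosen.name, ramGb, minRamGb: chosen.minRamGb };
}

// Try the primary profile; on failure, load the fallback and keep the error as evidence.
function loadWithFallback(primary, fallback, loader) {
  try {
    return { profile: primary, model: loader(primary), fellBack: false };
  } catch (error) {
    return { profile: fallback, model: loader(fallback), fellBack: true, error: error.message };
  }
}

// A stand-in loader that fails for the primary profile.
function demoLoader(name) {
  if (name === 'standard') throw new Error('simulated load failure');
  return { loaded: name };
}

console.log(selectProfile(4));
console.log(loadWithFallback('standard', 'compact', demoLoader));
```

Returning the selection reason and the fallback error alongside the result is what makes the outcome machine-readable, in the spirit of the model-fit-report.json evidence described above.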
Artifacts updated:
- proof/local-intelligence-report.json
- proof/local-intelligence-report.md
Scope:
- Added first-party Reddit campaign attribution across the live landing page, hosted checkout bootstrap, fallback checkout URLs, billing funnel events, and telemetry analytics.
- Preserved semantic SEO/GEO structure while introducing Reddit-specific campaign messaging and subreddit-aware attribution logic on the public landing page.
- Added operator documentation for Reddit distribution in docs/REDDIT_GTM_PLAYBOOK.md.
- Expanded business analytics so Reddit community, post, comment, campaign-variant, and offer-code performance can be measured end-to-end instead of inferred from raw visit counts.
Commands run:
git diff --check
npm ci
node --test tests/telemetry-analytics.test.js
node --test tests/public-landing.test.js
node --test tests/billing.test.js
node --test --test-concurrency=1 tests/api-server.test.js
node --test tests/dashboard.test.js
npm test
npm run test:coverage
env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:adapters
env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:automation
npm run self-heal:check
Observed results:
- git diff --check: completed cleanly.
- npm ci: completed successfully; audited 151 packages and found 0 vulnerabilities.
- Targeted changed-surface tests:
  - tests/telemetry-analytics.test.js: passed.
  - tests/public-landing.test.js: passed.
  - tests/billing.test.js: passed.
  - tests/api-server.test.js: passed.
  - tests/dashboard.test.js: passed.
- npm test: 1070 tests, 1069 passed, 0 failed, 1 skipped.
- npm run test:coverage: 1070 tests, 1069 passed, 0 failed, 1 skipped; coverage 84.14% lines, 70.74% branches, 86.83% functions.
- env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:adapters: 46 passed, 0 failed.
- env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:automation: 47 passed, 0 failed.
- npm run self-heal:check: Overall: HEALTHY with 4/4 healthy checks.
Behavioral proof points:
- public/index.html now classifies Reddit-origin traffic, preserves community, postId, commentId, campaignVariant, and offerCode, shows a Reddit campaign banner, and pushes first-party landing_page_view telemetry before checkout.
- src/api/server.js now threads Reddit attribution through /checkout/pro, /v1/billing/checkout, checkout bootstrap telemetry, and hosted success/cancel return URLs without overwriting Stripe checkout session_id; visitor-session state is preserved separately via visitor_session_id.
- scripts/telemetry-analytics.js now reports byCommunity, byOfferCode, byCampaignVariant, topCommunity, topOfferCode, and topCampaignVariant for page views and CTA events.
- scripts/billing.js now reports acquisition, signup, paid, revenue, and conversion breakdowns by Reddit community, post, comment, campaign variant, and offer code, making first-dollar attribution measurable at the business layer.
- tests/public-landing.test.js, tests/api-server.test.js, tests/billing.test.js, and tests/telemetry-analytics.test.js prove the end-to-end Reddit attribution contract from landing click through checkout and analytics summaries.
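The attribution contract above has two halves: carrying a fixed set of fields from the landing URL, and rolling events up into breakdowns such as byCommunity and topCommunity. A rough sketch of both follows. It assumes the query parameter names match the field names, which this document does not confirm; `extractAttribution`, `countBy`, `topKey`, and the example URL are hypothetical.

```javascript
// Hypothetical sketch; the real logic lives in public/index.html and scripts/telemetry-analytics.js.

const ATTRIBUTION_FIELDS = ['community', 'postId', 'commentId', 'campaignVariant', 'offerCode'];

// Copy only the known attribution fields from a landing URL (parameter names are assumed).
function extractAttribution(url) {
  const params = new URL(url).searchParams;
  const attribution = {};
  for (const field of ATTRIBUTION_FIELDS) {
    const value = params.get(field);
    if (value) attribution[field] = value;
  }
  return attribution;
}

// Count events per value of one attribution field, e.g. a byCommunity breakdown.
function countBy(events, field) {
  const counts = {};
  for (const event of events) {
    const key = event[field] || 'unknown';
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

// Return the key with the highest count, or null for an empty breakdown.
function topKey(counts) {
  const entries = Object.entries(counts).sort((a, b) => b[1] - a[1]);
  return entries.length ? entries[0][0] : null;
}

const demoUrl = 'https://example.com/?community=ClaudeCode&postId=abc123&offerCode=EARLY&other=1';
const demoEvents = [{ community: 'ClaudeCode' }, { community: 'ClaudeAI' }, { community: 'ClaudeCode' }, {}];

console.log(extractAttribution(demoUrl));
const byCommunity = countBy(demoEvents, 'community');
console.log(byCommunity, topKey(byCommunity));
```

Restricting extraction to an allowlist keeps unrelated query parameters (such as a Stripe session_id) out of the attribution record, which matches the separation described for visitor_session_id.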
Artifacts updated:
docs/REDDIT_GTM_PLAYBOOK.md
Scope:
- Added scripts/agent-readiness.js to audit runtime isolation, bootstrap context, and MCP permission tiers.
- Added doctor CLI support in bin/cli.js.
- Surfaced readiness data in scripts/dashboard.js.
- Added context-pack visibility metadata in scripts/contextfs.js.
- Hardened memex indexing so constructMemexPack() preserves namespace-aware results.
- Fixed the coverage teardown race in tests/delegation-runtime.test.js.
Commands run:
npm ci
npm test
npm run test:coverage
env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:adapters
env THUMBGATE_PROOF_DIR="$(mktemp -d)" npm run prove:automation
npm run self-heal:check
Observed results:
- npm ci: passed, 0 vulnerabilities.
- npm test: passed.
- npm run test:coverage: passed with Node test runner coverage summary:
  - line coverage: 90.25%
  - branch coverage: 76.67%
  - function coverage: 93.68%
- npm run prove:adapters: passed with 46 passed, 0 failed.
- npm run prove:automation: passed with 55 passed, 0 failed.
- self-heal:check: Overall: HEALTHY with 4/4 healthy checks.
Behavioral proof points:
- doctor --json reports overallStatus, runtime mode, bootstrap readiness, MCP tier, and article-alignment flags.
- generateDashboard() exposes readiness truth instead of guessing bootstrap state; the dashboard reflects the repo's actual .mcp.json presence.
- constructContextPack() and constructMemexPack() expose visibility metadata including hidden candidate counts, char-budget hits, and visible titles.
- Memex pack construction no longer drops relevant entries because namespace metadata is preserved in indexed documents and recovered from stableRef when needed.
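Visibility metadata of this kind can be produced while a pack is assembled under a character budget. The sketch below is not constructContextPack() or constructMemexPack(); it is a hypothetical `buildPack` helper with an assumed candidate shape (`title`, `text`) and assumed metadata key names, shown only to illustrate how hidden-candidate counts and budget hits can be tracked.

```javascript
// Hypothetical sketch; not the implementation in scripts/contextfs.js.

function buildPack(candidates, maxChars) {
  const visible = [];
  let usedChars = 0;
  let charBudgetHits = 0;

  for (const candidate of candidates) {
    // Skip any candidate that would push the pack over the character budget.
    if (usedChars + candidate.text.length > maxChars) {
      charBudgetHits += 1;
      continue;
    }
    visible.push(candidate);
    usedChars += candidate.text.length;
  }

  return {
    items: visible,
    visibility: {
      visibleTitles: visible.map((candidate) => candidate.title),
      hiddenCandidateCount: candidates.length - visible.length,
      charBudgetHits,
      usedChars,
    },
  };
}

const demoCandidates = [
  { title: 'a', text: '12345' },
  { title: 'b', text: '1234567890' },
  { title: 'c', text: '123' },
];

console.log(buildPack(demoCandidates, 8).visibility);
// { visibleTitles: [ 'a', 'c' ], hiddenCandidateCount: 1, charBudgetHits: 1, usedChars: 8 }
```

Reporting what was left out, and why, lets a caller distinguish "nothing relevant was found" from "relevant entries were dropped by the budget".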
Artifacts updated:
- README.md
- bin/cli.js
- scripts/agent-readiness.js
- scripts/contextfs.js
- scripts/dashboard.js
- tests/agent-readiness.test.js
- tests/cli.test.js
- tests/contextfs.test.js
- tests/dashboard.test.js
- tests/delegation-runtime.test.js
Scope:
- Added a portable npm run test:coverage command using Node's built-in coverage for tests/**/*.test.js.
- Removed the unused stripe SDK dependency; billing continues to use direct HTTPS calls in scripts/billing.js.
- Synced published version metadata across MCP manifests and public docs to 0.7.1.
- Refreshed active proof artifacts and pruned stale milestone-era proof files that were no longer referenced.
Commands run:
npm uninstall stripe
npm test
npm run test:coverage
npm run prove:adapters
npm run prove:automation
node scripts/self-healing-check.js --json > proof/automation/self-healing-health.json
node scripts/self-heal.js --reason=manual > proof/automation/self-heal-run.json
Observed results:
- npm test: pass.
- npm run test:coverage: pass with Node test runner coverage summary:
  - line coverage: 81.61%
  - branch coverage: 67.06%
  - function coverage: 83.76%
- npm run prove:adapters: pass with 21 passed, 0 failed.
- npm run prove:automation: pass with 14 passed, 0 failed.
- self-healing-check: Overall: HEALTHY with 4/4 healthy checks.
- self-heal:run: healthy: true, no failing fix steps.
Coverage caveat:
- npm run test:coverage measures tests/**/*.test.js.
- The inline script phases in test:schema, test:loop, and test:dpo still run in CI via npm test, but they are not yet folded into the single coverage percentage.
Artifacts updated:
- proof/compatibility/report.json
- proof/compatibility/report.md
- proof/automation/report.json
- proof/automation/report.md
- proof/automation/self-healing-health.json
- proof/automation/self-heal-run.json
Cross-project Codex startup proof:
cd /Users/ganapolsky_i/workspace/git/igor/trading
codex exec "Print OK only" --skip-git-repo-check
Observed result:
- MCP startup reports ready: thumbgate, sentry, github, context7, playwright
- No thumbgate timeout and no MCP handshake error
- Command completed with output OK
Scope:
- Public top-of-funnel checkout endpoint (POST /v1/billing/checkout) with install correlation metadata.
- Append-only funnel telemetry ledger with acquisition/activation/paid stages.
- Admin boundary hardening: billing API keys cannot call admin provision endpoint.
- Funnel analytics endpoint (GET /v1/analytics/funnel) for conversion evidence.
- CLI install correlation (installId) persisted and linked to acquisition events.
Commands run:
npm run feedback:summary
npm run feedback:rules
npm run self-heal:check
npm test
npm run prove:adapters
npm run prove:automation
Observed results:
- self-heal:check: Overall: HEALTHY with 4/4 healthy checks.
- npm test: all suites pass; key monetization checks verified in:
  - tests/api-server.test.js
  - tests/billing.test.js
  - tests/cli.test.js
  - tests/openapi-parity.test.js
- npm run prove:adapters: { "passed": 21, "failed": 0 }
- npm run prove:automation: { "passed": 14, "failed": 0 }
Behavioral proof points:
- Public checkout succeeds without bearer auth and emits acquisition event.
- First authenticated billing-key usage emits exactly one activation event.
- Stripe and GitHub billing flows emit paid-stage funnel events.
- Static admin token is required for POST /v1/billing/provision; billing keys receive 403.
- OpenAPI canonical + ChatGPT adapter include billing and funnel analytics routes with parity checks.
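Two of the proof points above, exactly-once activation and the admin boundary, can be illustrated with a small in-memory model. This is a sketch under assumptions: the `FunnelLedger` class, its method names, the event shape, and `authorizeProvision` are hypothetical, and the real ledger is persisted rather than held in memory.

```javascript
// Hypothetical sketch; not the repository's funnel ledger or API server code.

class FunnelLedger {
  constructor() {
    this.events = [];              // append-only: entries are frozen and never edited
    this.activatedKeys = new Set();
  }

  append(stage, details) {
    const event = Object.freeze({ stage, ...details, seq: this.events.length + 1 });
    this.events.push(event);
    return event;
  }

  recordAcquisition(installId) {
    return this.append('acquisition', { installId });
  }

  // Emit an activation event only on the first authenticated use of a key.
  recordKeyUsage(keyId) {
    if (this.activatedKeys.has(keyId)) return null;
    this.activatedKeys.add(keyId);
    return this.append('activation', { keyId });
  }

  recordPaid(provider) {
    return this.append('paid', { provider });
  }

  countByStage() {
    const counts = { acquisition: 0, activation: 0, paid: 0 };
    for (const event of this.events) counts[event.stage] += 1;
    return counts;
  }
}

// Admin boundary: only the static admin token may provision; anything else gets 403.
// A production check would likely use a constant-time comparison.
function authorizeProvision(token, adminToken) {
  return token === adminToken ? 200 : 403;
}

const demoFunnel = new FunnelLedger();
demoFunnel.recordAcquisition('install-1');
demoFunnel.recordKeyUsage('key-1');
demoFunnel.recordKeyUsage('key-1'); // repeat usage, no second activation
demoFunnel.recordPaid('stripe');

console.log(demoFunnel.countByStage()); // { acquisition: 1, activation: 1, paid: 1 }
console.log(authorizeProvision('billing-key', 'admin-token')); // 403
```

Tracking activated keys separately from the event list keeps the ledger itself append-only while still guaranteeing at most one activation event per key.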
Artifacts updated:
- proof/compatibility/report.json
- proof/compatibility/report.md
- proof/automation/report.json
- proof/automation/report.md