Docker E2E tests

Integration tests that prove the published plugin tarballs actually load and work end-to-end inside a clean Linux environment with real OpenCode / Pi binaries plus a mock LLM (aimock).

What this covers

These tests sit above the in-process Bun e2e suite (packages/e2e-tests/) which hits the plugin pipeline directly. The docker layer covers the seam the in-process tests can't reach:

  • Real binaries: the actual opencode and pi binaries users install, not a Bun-spawned mock harness
  • Real install path: bunx --bun ...@latest doctor --force against an empty home directory, the same command users run after first install
  • Real OS: Debian bookworm, the most common deployment target after macOS
  • Real native modules: better-sqlite3 rebuilt for Linux x64, @huggingface/transformers resolution, etc.
  • Cross-harness shared SQLite: both harnesses point at the same ~/.local/share/cortexkit/magic-context/context.db and write distinct harness rows

What's intentionally not covered here (already exercised by packages/e2e-tests/): historian compartments, recomp, dreamer scheduling, memory consolidation, complex tool-call patterns, Anthropic-specific cache-token semantics, overflow recovery. Those tests need precise control over message shapes and provider responses, and that control is much faster to get in-process than through aimock.

Layout

tests/docker/
├── Dockerfile.opencode          # Debian + Node + Bun + OpenCode + aimock
├── Dockerfile.pi                # Debian + Node + Bun + Pi + aimock
├── test-opencode-e2e.sh         # 2-phase test: SETUP_SMOKE + SESSION_SMOKE
├── test-pi-e2e.sh               # 2-phase test: SETUP_SMOKE + SESSION_SMOKE
├── fixtures/
│   ├── aimock-opencode.cjs      # Mock LLM fixture for OpenCode session smoke
│   └── aimock-pi.cjs            # Mock LLM fixture for Pi session smoke
└── run-e2e.sh                   # Local runner: builds + runs both images

Running locally

# Both harnesses
tests/docker/run-e2e.sh

# Just one
tests/docker/run-e2e.sh opencode
tests/docker/run-e2e.sh pi

The runner pre-builds the local plugin dists (the Dockerfiles COPY from packages/*/dist/ rather than building inside the image — keeps iteration fast).

Requires Docker with Linux/amd64 emulation. On Apple Silicon, this means --platform linux/amd64 (the runner sets it automatically); first run will pull qemu-user-static if you haven't built linux/amd64 images before.
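As a rough illustration of the runner behaviour described above, the sketch below maps the optional argument to a list of harnesses and prints the build command it would issue for each. The function names and the image tag are hypothetical, and the build context (.) is an assumption; the actual run-e2e.sh may be organised differently.

```shell
# Hypothetical sketch; function names and image tags are not from run-e2e.sh.
select_harnesses() {
    # No argument (or an empty one) selects both harnesses.
    case "${1:-all}" in
        all)      echo "opencode pi" ;;
        opencode) echo "opencode" ;;
        pi)       echo "pi" ;;
        *)        echo "unknown harness: $1" >&2; return 1 ;;
    esac
}

build_command() {
    # Prints the docker build invocation for one harness instead of running it.
    echo "docker build --platform linux/amd64 -f tests/docker/Dockerfile.$1 -t magic-context-e2e-$1 ."
}

for harness in $(select_harnesses); do
    build_command "$harness"
done
```

Printing the command rather than executing it keeps the sketch runnable on a machine without Docker; a real runner would invoke docker directly.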

Running in CI

.github/workflows/e2e-docker.yml runs both jobs on:

  • pushes to master
  • pull requests touching packages/plugin/**, packages/pi-plugin/**, or tests/docker/**
  • v* tag pushes (release gate)
  • manual workflow_dispatch

Test phases

Each container runs two phases in sequence:

Phase 1 — SETUP_SMOKE

Starts from a clean home directory. Runs the non-interactive doctor --force flow, which is what we publish as the "I just installed, fix me up" command. Asserts:

  • doctor exits with a summary line matching Doctor (complete|repair complete)
  • the harness-specific config file gets created
  • the plugin entry gets registered
  • doctor reports FAIL 0 failures
  • (Pi) doctor confirms Pi version meets the >= 0.71.0 floor
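The two output-based assertions in this list could be expressed with grep along the following lines. This is only a sketch: the sample doctor output is a made-up stand-in (the real wording and layout may differ), and the helper names are hypothetical.

```shell
# Hypothetical stand-in for captured doctor output; the real text may differ.
doctor_output='Doctor repair complete
PASS 7  WARN 1  FAIL 0'

summary_ok() {
    # Matches either "Doctor complete" or "Doctor repair complete".
    printf '%s\n' "$1" | grep -Eq 'Doctor (complete|repair complete)'
}

no_failures() {
    # The trailing group stops "FAIL 0" from matching the start of a longer number.
    printf '%s\n' "$1" | grep -Eq 'FAIL 0([^0-9]|$)'
}

if summary_ok "$doctor_output" && no_failures "$doctor_output"; then
    setup_smoke=pass
else
    setup_smoke=fail
fi
echo "SETUP_SMOKE: $setup_smoke"
```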

Phase 2 — SESSION_SMOKE

Layers a minimal magic-context.jsonc and an aimock-pointed provider config on top, then runs a single agent turn (opencode run "..." or pi --print "..."). Asserts:

  • aimock responds to /v1/models
  • the agent binary completes within 60s
  • the Magic Context plugin log is non-empty
  • the shared SQLite DB exists
  • at least one tags row was written with the matching harness value
  • at least one session_meta row was written with the matching harness value
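The 60s completion check can be approximated with the coreutils timeout command, which exits with status 124 when the deadline is hit. The wrapper below is a minimal sketch under that assumption; run_with_deadline is a hypothetical name, and true stands in for the real opencode or pi invocation.

```shell
# Sketch only: assumes GNU coreutils `timeout` is available in the image.
run_with_deadline() {
    local seconds="$1"
    shift
    local rc=0
    timeout "$seconds" "$@" || rc=$?
    if [ "$rc" -eq 124 ]; then
        # 124 is the status `timeout` returns when it had to stop the command.
        echo "timed out after ${seconds}s: $*" >&2
    fi
    return "$rc"
}

# Stand-in command; the real test would pass the agent invocation here.
run_with_deadline 60 true && echo "agent turn finished in time"
```

Keeping the exit status intact lets the caller tell a timeout (124) apart from an ordinary non-zero exit of the agent binary.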

If both phases pass, the container exits 0; otherwise it exits 1 and the script prints which check failed.

Adding a new test

For a new always-on assertion, add a check line to the appropriate test-*-e2e.sh:

check "label that names what's being verified" \
    "test -f /path/that/should/exist"

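For orientation, a check helper of this shape could look roughly like the sketch below. It is a guess at the general pattern, not the helper shipped in test-*-e2e.sh: the FAILED counter, the PASS/FAIL output format, and the exit_code variable are all assumptions.

```shell
# Hypothetical sketch of a check helper; the real one may differ.
FAILED=0

check() {
    local label="$1" cmd="$2"
    # Run the command string in a subshell so it cannot alter this script's state.
    if (eval "$cmd") >/dev/null 2>&1; then
        echo "PASS  $label"
    else
        echo "FAIL  $label"
        FAILED=$((FAILED + 1))
    fi
}

check "tmp directory exists" "test -d /tmp"
check "missing file is reported" "test -f /nonexistent/path"

# Collapse the failure count into the 0/1 status the container would exit with.
if [ "$FAILED" -eq 0 ]; then exit_code=0; else exit_code=1; fi
echo "failures: $FAILED (exit code would be $exit_code)"
```

Because each failing label is printed as it happens, the log shows which check failed even though the final status is only 0 or 1.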
For a new mock-LLM behavior, add another mock.on({...}, {...}) block to the matching fixtures/aimock-*.cjs. See aimock docs for the response shape.

For deeper scenarios (multi-turn, historian publication, dreamer), prefer adding to packages/e2e-tests/ instead — the in-process harness is much faster to iterate on and has tighter control over message shapes than aimock does.