Frankenstein is the only compiler that combines formally verified reference counting (Perceus, K-proven), four real compiler frontends stolen from production compilers (GHC, rustc, mmc, Koka), algebraic effects as the unified cross-language abstraction, and MLIR as the backend. This roadmap charts the path from working prototype to research contribution.
Goal: A single binary where 4 functions in 4 languages compose through shared algebraic effects. This is the demo that makes people understand what Frankenstein is.
Write a program where:
- Haskell defines a pure recursive function (e.g., fibonacci)
- Rust defines an accumulator with affine ownership
- Mercury defines a search predicate with backtracking (choice effect)
- Koka defines the main with algebraic effect handlers
All four compile through their bridges to OrganIR, link via the polyglot linker, and produce a single native binary via MLIR.
Prerequisites: The linker name-rewriting (done), evidence pass for cross-module effects, MLIR emitter handling all four bridges' output patterns.
Deliverable: frankenstein fib.hs accumulate.rs search.m main.kk --compile && ./a.out
Create examples/polyglot-demo/ with the 4-language program above plus
a test script that:
- Compiles each file individually through its bridge (
--emit-core) - Links them together
- Compiles to native
- Verifies output against K oracle (see Phase 2)
Goal: Make K Framework the source of truth for the entire IR, not just Perceus. Every transformation verified against a formal model.
organ-ir.k (1229 lines, 240 rules) is a complete executable semantics:
- Evaluation rules ✓:
EApp,ELam(closure-based beta-reduction),ELet,ECase(pattern matching with PatLit/PatCon/PatVar/PatWild),EDelay/EForce(thunk capture/memoization),ETypeApp/ETypeLam(type erasure) - Effect semantics ✓:
EPerformsearches the handler stack via delimited continuation capture,EHandlepushes/pops handlers on<effectStack>, full abort and resume (tail-resumptive) patterns, nested handler support - Memory model ✓: Perceus operations (
ERetain,EDrop,ERelease,EReuse) as semantic no-ops in the reference interpreter, store-based variable binding - Builtins ✓: Arithmetic (
+,-,*,/,mod), comparisons (==,<,>,<=,>=), string concat (++), negate - 118 krun shell tests passing (typing, free vars, usage counting, Perceus transforms, evaluation, effects, Mercury semidet/choice patterns, bridge properties)
test/KOracle.hs (479 lines):
- 4 QuickCheck properties: random pure, arithmetic, let+case, effect expressions
- Generators:
genArithExpr,genLetCaseExpr,genEffectExpr - Compares
krun(eval(expr))against MLIR pipeline → native binary output - Integrated into
cabal test(13 krun tests in test suite, all passing)
test/BridgeBisim.hs: 8 bridge test suites verifying translation preserves
observable behavior:
- GHC bridge:
krun(translateGHC(Arith.hs)) == 44, factorial == 3628800, krun == ghc native comparison - Koka bridge:
krun(translateKoka(arith.kk)) == 44 - Rust bridge:
krun(translateRust(arith.rs)) == 44, krun == rustc native - Mercury bridge: structural tests (OrganIR defs produced)
- Python bridge:
krun(translatePython(arith.py)) == 44 - Go bridge:
krun(translateGo(arith.go)) == 44 - Futhark bridge:
krun(translateFuthark(arith.fut)) == 44 - Scheme bridge: structural tests (main Def, call/cc)
313 claims across 5 files, verifiable via k-specs/tests/run-kprove.sh:
- Perceus claims (43): free var analysis, drop insertion, retain for multi-use, lambda scope drops, identity properties
- Evidence claims (30): no EHandle/EPerform post-pass, single-op/multi-op projection with evv_select, unhandled-effect fallthrough
- Bridge claims (155): GHC (lazy/strict/forall), Koka (constructor rewriting), Rust (ownership/affine), Mercury (determinism mapping), Python, Go, Futhark, Scheme
- Linker claims (44): name rewriting preserves local scope, main unmangled, mangling deterministic, cross-module resolution
- EffectOpt claims (41): identity handler detection, tail-resumptive detection, handler inlining substitution correctness
Goal: Compile a non-trivial Haskell program through Frankenstein and run it without a garbage collector, purely on Perceus reference counting + thunks. If performance is even remotely competitive, this is a paper.
- End-to-end proven:
Factorial.hs→ GHC bridge → Perceus → MLIR → native → 2432902008176640000 - Profile (factorial 20):
- Binary: 14 KB vs GHC's 26 MB (1860x smaller)
- Speed: 2.4ms vs 5.5ms per run (2.3x faster, dominated by startup)
- RC ops: 42 retain calls, all no-ops (unboxed integers skip
kk_is_heap_ptr) - Heap: 0 allocations, 0 drops, 0 frees — pure stack computation
- Pain points identified:
- Circular references: Haskell's lazy
let rec(e.g.,xs = 1 : xs) creates cycles that RC alone cannot collect. Needs trial deletion or weak refs. - Thunk chains: Deep lazy evaluation (e.g.,
foldl (+) 0 [1..10^6]) builds O(n) thunk chains. Each thunk is heap-allocated with RC=1; forcing triggers a cascade of allocations. Not a leak, but high allocation pressure. - Typeclass dictionaries: GHC desugars
show,+,==etc. to dictionary-passing. Currently erased by the bridge. Real programs need dictionary structs with RC. - Sharing via laziness: Haskell relies on thunk memoization for sharing
(
let x = expensive in (x, x)computes once). Current thunk impl supports this, but multi-reference thunks need correct retain/drop around force. - Unboxed vs boxed: Current pipeline treats all values as i64. Real Haskell uses
Int#(unboxed) andInt(boxedI#wrapper). The I# simplification helps but algebraic data (lists, trees) needs heap boxing with proper RC.
- Circular references: Haskell's lazy
All four GHC Core patterns now compile through the pipeline:
- Typeclass dictionaries ✓: GHC at
-O1resolves dictionaries to concrete method selectors ($fNumInt_$c+). OurisDictArgfilter strips dictionary args, method selectors map to builtins.double(21) = 42via typeclass(+). - Unboxed types ✓: GHC's
$wworkers operate onInt#with primops (+#,-#,<#,==#). MLIR emitter now handles#-suffixed primops.sumTo(100) = 5050. - Worker/wrapper ✓: GHC splits into strict workers and lazy wrappers. Workers are
Recbindings (direct functions), wrappers are filtered.fib(10) = 55via$wfib. - Join points ✓: GHC at
-O1compiles guards/nested patterns to cascaded cases with primop comparisons.classify(-5) + classify(0) + classify(42) = 0. - Key fixes: Don't thunk lambdas (
isLambdacheck indecideLaziness), Bool→i64 consistently, GHC primop name recognition (+#,-#,<#,==#,negate), Num method selectors ($fNumInt_$c+/*/negate).
- Bacon-Rajan trial deletion ✓:
runtime/kk_cycle.cimplements the synchronous cycle collector — MarkRoots (trial-delete internal refs), ScanRoots (identify live vs garbage), CollectRoots (free white objects). Uses color encoding in high byte of refcount word (black/purple/gray/white). - Runtime integration ✓:
kk_drop()registers cycle candidates when rc > 0 after decrement.kk_alloc_con()registers nfields in side table for child scanning. Recursive child dropping on free. Existing programs unaffected. - Static cycle analysis ✓:
Core/CycleAnalysis.hsdetects potential cycle sources (constructor applications capturing self-references). Reports in--emit-coreoutput. All current test programs correctly identified as acyclic. - C test suite ✓: 10/10 tests passing — acyclic data, cyclic pairs (A↔B), cyclic triples (A→B→C→A), self-reference (A→A), mixed acyclic+cyclic.
- K tests ✓: 8 new krun tests for constructor allocation, retain/drop of heap objects, thunk semantics (force/delay), and RC operations on constructed data.
Three pure-integer benchmarks compiled through all four compilers: fibonacci(42),
tak(24,16,8), ack(3,8). All 12 binaries verified correct. Automated benchmark script
(bench/run.sh) measures wall time, peak RSS, and RC profile counts.
Binary sizes (Frankenstein 680x smaller than GHC):
| Compiler | fib | tak | ack |
|---|---|---|---|
| Frankenstein | 38.0 KB | 38.0 KB | 38.0 KB |
| GHC -O2 | 25.9 MB | 25.9 MB | 25.9 MB |
| Rust -O | 9.1 MB | 9.1 MB | 9.1 MB |
| Koka -O2 | 8.4 MB | 8.4 MB | 8.4 MB |
Wall time (median of 5 runs, after unboxed elision):
| Compiler | fib(42) | tak(24,16,8) | ack(3,8) |
|---|---|---|---|
| Frankenstein | 1.88s | 0.01s | 0.01s |
| GHC -O2 | 3.20s | 0.02s | 0.01s |
| Rust -O | 1.72s | 0.01s | 0.02s |
| Koka -O2 | 3.04s | 0.02s | 0.03s |
Peak RSS (Frankenstein uses least memory, zero heap):
| Compiler | fib | tak | ack |
|---|---|---|---|
| Frankenstein | 1876 KB | 2024 KB | 2020 KB |
| GHC -O2 | 3436 KB | 3620 KB | 3744 KB |
| Rust -O | 1964 KB | 2008 KB | 2068 KB |
| Koka -O2 | 2816 KB | 2784 KB | 2776 KB |
Frankenstein RC profile (zero RC ops after unboxed elision):
| Benchmark | retain | drop | alloc | reuse |
|---|---|---|---|---|
| fib(42) | 0 | 0 | 0 | 0 |
| tak(24,16,8) | 0 | 0 | 0 | 0 |
| ack(3,8) | 0 | 0 | 0 | 0 |
Key findings:
- Binary size: Frankenstein produces 38 KB binaries (kk_runtime + kk_arena + kk_cycle). GHC statically links its RTS (25.9 MB), 680x larger.
- Memory: Frankenstein uses the least memory (1.9 MB) — all computation is pure stack, zero heap allocations. No GC pauses, no allocation pressure.
- Speed: Frankenstein matches Rust on fib(42) (1.88s vs 1.72s) and beats both GHC (3.20s) and Koka (3.04s). On tak/ack all four compilers are within measurement noise.
- Unboxed elision (commit faa5319): The Perceus pass now skips retain/drop for
values of known-unboxed types (Int, Char, Bool, Word, etc.) via
isUnboxedType. This eliminated 1.73 billion no-op retain calls on fib(42), reducing wall time from 22.88s to 1.88s — a 12.2x speedup. The RC profile shows zero runtime overhead. - Codegen fixes during benchmarking: Multi-arg lambda collection (GHC bridge),
nameToSsafor unique SSA names (MLIR emitter) — both needed for multi-param GHC workers (tak, ack).
Goal: Instead of lowering effects to evidence-passing in Haskell, define
a first-class frankenstein MLIR dialect. MLIR's pass infrastructure can
then optimize effect dispatch at the IR level.
Three new MlirOp constructors in Dialects.hs:
FrankHandle effect handler_ssa body_ssa—"frankenstein.handle"with effect attributeFrankPerform effect op [arg_ssas]—"frankenstein.perform"with effect/op attributesFrankResume arg_ssa—"frankenstein.resume"for continuation
Rendered as MLIR generic syntax (works with --allow-unregistered-dialect):
"frankenstein.handle"(%handler) {effect = "exn"} // body result: %result
"frankenstein.perform"(%arg) {effect = "exn", op = "raise"} : (i64) -> i64
"frankenstein.resume"(%val) : (i64) -> i64New --emit-effect-mlir CLI flag emits MLIR without running the evidence pass,
so EHandle/EPerform nodes appear as frankenstein.* dialect ops.
emitProgramWithEffects function in Emitter.hs handles effect-dialect mode via
esEffectDialect flag in EmitState.
Three Core IR → Core IR transformations in EffectOpt.hs, run before evidence pass:
- Handler inlining (
inlineLocalHandlers): WhenEHandle eff (ELam ...) bodycontainsEPerform eff argsin the body, inline the handler at each perform site. Eliminates the dynamic handler dispatch overhead entirely. - Identity handler elimination (
eliminateIdentityHandlers): Detects handlers of the form\x k -> k(x)(both curried and uncurried) and removes theEHandlewrapper — the handler is a no-op. - Tail-resumptive detection (
annotateTailResumptive): Detects handlers where every control path ends with a call to the resume continuation. These handlers can be implemented as direct function calls without continuation capture.
Evidence specialization (Phase 4b plan item) is already handled by the existing evidence pass, which directly binds known handler functions.
Statistics: effectOptimizeWithStats returns counts of inlined, eliminated, and
tail-resumptive handlers detected.
The existing evidence pass IS the lowering from frankenstein.* to standard MLIR:
frankenstein.perform→ evidence vector lookup + indirectfunc.callfrankenstein.handle→ push evidence (let-bind) + evaluate body + popfrankenstein.resume→ call continuation (function pointer in evidence)
Pipeline: --emit-effect-mlir shows frankenstein.* ops; --emit-mlir shows
the lowered form; --compile runs the full pipeline through to native code.
- New files:
src/Frankenstein/Core/EffectOpt.hs(~280 lines, 3 optimization passes) - Modified:
Dialects.hs(3 new ops + rendering),Emitter.hs(effect-dialect mode),Main.hs(--emit-effect-mlirflag,effectOptimizeintegration),frankenstein.cabal - Tests: 7 new tests (identity handler elimination, stats, dialect emission)
- Total test suite: 46 cabal tests (39 existing + 7 new)
- Regression:
--demo --compile→ 3628800 still works
Goal: MLIR → Wasm, turning Frankenstein into a polyglot-to-web compiler.
Pipeline: MLIR → mlir-opt → mlir-translate --mlir-to-llvmir → llc -mtriple=wasm32 → wasm-ld → .wasm
New CLI flag: --target wasm32 (used with --compile or --emit-mlir)
Key implementation details:
CompileTargettype (TargetNative | TargetWasm32) inEmitConfigemitProgramWasm: MLIR emission without printf/main wrapper (Wasm host reads return value)compileToWasm: full pipeline from Core IR to.wasmbinary- Wasm runtime (
runtime/kk_runtime_wasm.c): freestanding Perceus RC with 1MB static bump allocator, no libc dependencies. Values are i64 (Wasm natively supports i64), pointers are i32 (wasm32 linear memory). - Binary size: 485 bytes for factorial demo (vs 18.6KB native)
Validation script (test/wasm/validate_wasm.sh) verifies:
- Demo factorial compiles to
.wasmand returns 3628800 in Node.js - Native output matches Wasm output (cross-target comparison)
- Wasm binary is under 10KB
KWasm (K framework Wasm semantics) integration is prepared but requires KWasm
installation. The validation currently uses Node.js as the Wasm execution engine.
Future: kwasm run to close the formal verification loop (source→K oracle ↔ binary→KWasm).
web/index.html: single-page demo that loads pre-compiled .wasm and runs
factorial(10) in the browser via WebAssembly.Instance. Shows result, execution
time, binary size, and exported function count.
Build with: bash web/build.sh → serves via python3 -m http.server 8080
- New files:
runtime/kk_runtime_wasm.c(freestanding Wasm runtime),test/wasm/validate_wasm.sh,web/index.html,web/build.sh - Modified:
Emitter.hs(CompileTarget,emitProgramWasm,compileToWasm),Main.hs(--target wasm32),KOracle.hs(ecTarget field) - Tests: 4 new Wasm emission tests + 3 validation tests (script)
- Total test suite: 50 cabal tests (46 + 4 Wasm)
- End-to-end:
--demo --compile --target wasm32→ 485-byte.wasm→ Node.js → 3628800
Goal: Feed Frankenstein's own Haskell source through the GHC bridge. Even partial self-hosting is a dramatic proof of capability.
Core/Types.hs is a pure data definition module with no IO — the simplest
target. Done: 35 record selector functions translate cleanly through
GHC bridge → Frankenstein Core → MLIR → mlir-opt validation. 867-line MLIR
output for the entire module.
Bridge fixes required:
- Filter
$krepruntime-type-rep bindings (joined existing$tc/$trfilter) - Recognize
I#(var)boxing as identity (Int and Int# share i64 representation) - GHC bridge
Driver.hs: import paths (src/,.), enableOverloadedStringsto matchfrankenstein.cabaldefault-extensions - Added
ghc-boot-thtobuild-dependsforGHC.LanguageExtensions.Type
Emitter fixes required:
emitPatFieldnow registers field bindings inesAliasesso subsequentEVarreferences resolve correctly (was previously a comment-only no-op)- New
SingleConCasebranch class for exhaustive single-constructor cases: emits field extraction + body inline without anscf.if(avoids referencing field SSA values from a sibling region) sanitizeNamenow strips parens, commas, brackets, quotes, whitespace
The Perceus pass itself, compiled through the Perceus pass. Beautifully
recursive. Status: Translates to Frankenstein Core → MLIR (~3550 lines)
in one shot. All 50 tests still pass, --demo --compile still produces
3628800. Remaining constraints are known and narrow (7 cross-region SSA
references to drops/retains emitted outside the scf.if region that defines
the value).
Emitter changes required:
- Closure ABI via
kk_alloc_con: every lifted lambda allocates a heap closure; field 0 is the function pointer as i64, fields 1..n are the captured variables. Closures flow through HOF contexts as plain i64, so no MLIR struct values leak into the generici64pipeline. - Closure-indirect call path:
EApp (EVar fn) argsnow checksesTopFnsto decide betweenfunc.call @fn(args)(direct call to a known top-level function) and a closure-indirect call that extracts field 0 viakk_field, inttoptr's it, andllvm.calls through the pointer.esTopFns :: Set Textis seeded from the program's def names. - Unresolved external fallback: names that aren't in scope and aren't
known top-level functions (imports from
Data.Map,Data.Set, data constructors like(,)or:) materialize as stub constants with an// unresolved externalcomment — the MLIR stays well-formed even though the call is semantically undefined. Applied at bothEVarsites and the closure-call path. - Top-level-fn-as-value: when a known top-level function is used as
a value (e.g. passed as an argument), emit
llvm.mlir.addressof @fnllvm.ptrtointso the caller gets a real i64 address rather than a dangling SSA name.
- Lambda parameter renaming: every captured var and regular param
gets a fresh SSA name via
freshName, with aliases save/restore around the body, preventing collisions between captured and bound names with the same OccString. - Uniform i64 params in lifted functions: the closure ABI is uniform
i64 for all arguments, so lifted lambda signatures use
i64rather thantypeToMlir(which was producing!llvm.ptrfor some Haskell types and failing when consumers expected i64). llvm.insertvalueoperand order: fixed to (value, container) in both existing callsites (was backwards — MLIR rejects the reversed form as a type mismatch on the struct container).- Capture filter: only names currently in
esAliasesare captured. External references (unresolved imports, top-level fn names) are handled at the reference site instead of being dragged into the closure, avoiding raw unsanitized names like%:or%foldrinkk_set_fieldcalls.
Feed all 18 Haskell modules in src/Frankenstein/ through the GHC bridge to MLIR.
Result: 3 modules (Types, KokaCore, KokaBridge.Driver) emit fully valid MLIR
that passes mlir-opt --allow-unregistered-dialect with zero errors. 14 modules
emit MLIR with 1–12 remaining errors out of thousands of lines each (e.g., Emitter
at 43,581 lines has just 1 remaining error; Linker at 14,344 lines has 5).
Only 1 module is a hard blocker: OrganIR.Consumer fails at the bridge layer
due to an external text-2.1.3 vs text-2.1.4 version conflict between our
session and the organ-ir package — outside our control. All 50 cabal tests
still pass and --demo --compile still produces 3628800.
GHC bridge Driver changes:
- Module matching:
runGhcCompilenow searches the module graph for the module whoseml_hs_filematches the input path, instead of taking the head (which picked an arbitrary dependency when the target imported other modules) - Language extensions: enable
LambdaCase,BangPatterns,TupleSections,ScopedTypeVariables,Derive{Functor,Foldable,Traversable},GeneralizedNewtypeDeriving,Flexible{Contexts,Instances},RecordWildCards,NamedFieldPuns,MultiParamTypeClassesto matchfrankenstein.cabal - Package visibility: expose
ghc,koka,organ-irpackages viapackageFlagsso our own modules can self-host - Package DBs: add cabal store + dist-newstyle package DBs via
packageDBFlagsso inplace builds of our dependencies are discoverable
Emitter changes required:
- Top-level param aliases:
emitDefnow installs function parameters as identity entries inesAliasessoEVarlookups find them (was relying on implicit SSA name matching before the alias-lookup rewrite) scf.ifbranch alias scoping:emitConChainnow saves/restoresesAliasesaround each case arm so pattern-bound field SSA values don't leak across sibling regions — the single biggest fix, dropping most modules from 100+ errors to 1–12- Thunk with captures:
EDelaybodies with captured free variables are inlined (degraded laziness) rather than lifted to a zero-arg thunk that can't reach the captures - MLIR string escaping:
escapeMLIRStringnow hex-escapes all non-printable chars viaprintf "\%02X"— raw control bytes would otherwise break MLIR's string literal parser func.constant+unrealized_conversion_cast: closure function pointers and thunk bodies usefunc.constant @fn : tythen cast to i64, becausellvm.mlir.addressofrejectsfunc.funcreferences.mlir-opt --reconcile-unrealized-castscleans these up after inlining
The remaining 1-error-per-module pattern is partial application of top-level functions (callee has N params, call site supplies M<N args) — would require eta-expanding partial applications into closures. Left as future work.
Result: 17 of 18 modules emit fully valid MLIR. The 18th, OrganIR.Consumer,
fails earlier in the GHC frontend due to a pre-existing text-2.1.3/2.1.4
package skew unrelated to MLIR emission.
Emitter changes:
- PAP closures:
emitPapClosureallocates a heap closure viakk_alloc_con(field 0 = wrapper fptr, fields 1..n = supplied args).ensurePapWrapperemits a per-(fn, supplied-arity) wrapper that loads captured args from the closure and tail-calls the original. Fires whennArgs < arityat a top-level call site. - Oversaturated path: when
nArgs > arity, call the top-level fn with the firstarityargs, then closure-indirect the remainder via field-0 fptr extraction (kk_field/inttoptr/llvm.call). - Uniform i64 ABI: all top-level fn params and return types are i64,
matching the closure ABI and avoiding
!llvm.ptrleaking into kk_* runtime calls and PAP wrappers. - String literals → i64:
ELit (LitString)immediatelyptrtoints the global address so it lives in the i64 universe. - ELet alias scoping: save/restore
esAliasesaround the let body so bindings don't leak into siblingscf.ifbranches as undeclared SSA refs. ETypeLamstripping inemitDef: matchesbuildTopFnArityso emitted arity equals the call-site arity table (fixedKokaBridge.CoreTranslate'sfoldlMmismatch).- No
llvm.unreachablein unhandled-case fallback: it was illegal as a non-terminator insidescf.ifregions.
Result: All 17 self-hosted modules lower cleanly through
mlir-opt → mlir-translate → clang -c and produce real ELF objects, totalling
~1.8 MB. The biggest is Emitter.o at 702 KB.
| Module | .o size |
|---|---|
| Core/Types | 13 KB |
| Core/CycleAnalysis | 24 KB |
| KokaBridge/Driver | 32 KB |
| GhcBridge/Driver | 37 KB |
| Core/KokaCore | 43 KB |
| Core/Evidence | 43 KB |
| Core/Perceus | 45 KB |
| MlirEmit/Dialects | 47 KB |
| Core/EffectOpt | 50 KB |
| GhcBridge/CoreTranslate | 53 KB |
| RustBridge/CoreTranslate | 58 KB |
| MercuryBridge/CoreTranslate | 58 KB |
| KokaBridge/CoreTranslate | 80 KB |
| MercuryBridge/HldsParse | 119 KB |
| Core/Linker | 188 KB |
| RustBridge/MirParse | 237 KB |
| MlirEmit/Emitter | 702 KB |
Pipeline:
frankenstein <file.hs> --emit-mlir
| mlir-opt --allow-unregistered-dialect --reconcile-unrealized-casts
--convert-scf-to-cf --convert-arith-to-llvm
--convert-cf-to-llvm --convert-func-to-llvm
--reconcile-unrealized-casts
| mlir-translate --mlir-to-llvmir
| clang -c -o file.o
Emitter fix:
func.constant→ i64 via!llvm.ptr: every closure-fptr cast now goesfunc.constant @fn : (...) -> ty→unrealized_conversion_castto!llvm.ptr→llvm.ptrtointto i64. The previous one-shotunrealized_conversion_castto i64 left a function-typed cast thatreconcile-unrealized-castscouldn't erase, somlir-translaterejected the IR with "LLVM Translation failed for operation: builtin.unrealized_conversion_cast". Going via!llvm.ptrlets--convert-func-to-llvmrewrite thefunc.constanttollvm.mlir.addressofand reconcile then folds the redundantptr → ptrcast.
End-to-end runnable validation: linked self-hosted Core/Types.o against the
C runtime (kk_runtime.c + kk_cycle.c) and a small driver, called the
frankenstein-compiled bindName/bindExpr record selectors on a heap-allocated
Bind value, and got back the correct field values. Frankenstein has now
bootstrapped a piece of itself end-to-end: source → GHC bridge → Core IR →
Perceus → MLIR → LLVM IR → ELF object → executed in process.
Result: Self-hosted compiler runs 21 end-to-end examples, including
standard library types ([Int], Maybe, Bool, tuples), Prelude HOFs
(map, filter, sum), strings, cross-module compilation, and algebraic
effects. All 23 modules compile to native objects and link into a working
compiler binary.
Goal: The self-hosted compiler produces correct MLIR for a nontrivial
program. Result: factorial(10) compiled through the self-hosted emitter
yields 3628800. Pipeline: Core IR (C) → self-hosted emitProgramText → MLIR
→ mlir-opt → mlir-translate → clang → native binary → 3628800.
67 self-tests pass across 14 modules exercising all 4 compiler passes (assignProgramTags, insertPerceus, evidencePass, emitProgramText).
Goal: 21 example programs compile correctly through the stage 1 self-hosted compiler. Result: 21/21 pass.
| Example | Output | Features |
|---|---|---|
| nested | 60 | Nested ADTs |
| maybesum | 42 | Custom Maybe |
| listsum | 15 | Custom list |
| tree | 6 | Binary tree |
| alloc_stress | 100100000 | Heavy allocation + RC |
| closure | 42 | Closures |
| mutual_rec | 5 | Mutual recursion |
| multi_adt | 317 | Multiple ADTs |
| higher_order | 12 | Higher-order functions |
| exhaust_tail | 36 | Exhaustive patterns |
| stdlib_list | 15 | Standard [Int] |
| stdlib_maybe | 141 | Standard Maybe Int |
| stdlib_bool | 7 | Guards, Bool |
| stdlib_tuple | 13 | Standard (Int, Int) |
| prelude_hof | 22 | map/filter/foldr |
| prelude_inline | 24 | Prelude inlined HOFs |
| prelude_comprehensive | 235 | take/zipWith/foldl |
| stdlib_string | 11 | String = [Char] |
| cross_module | 45 | Multi-module GHC bridge |
| effect_ask | 84 | Algebraic effect (ask) |
| effect_state | 100 | Algebraic effect (state) |
Goal: Three-stage bootstrap — host compiler → stage 1 → stage 2 → stage 3 — with the self-hosted compiler reaching a fixed point (stage 2 and stage 3 produce byte-identical MLIR for all 23 modules).
All 23 modules compile through the stage 1 self-hosted compiler to produce
stage 2 MLIR → LLVM IR → native objects. Large modules (>1MB OrganIR JSON)
are automatically split by size (400KB target per part), compiled separately,
and merged. JSON is minified (separators=(',',':')) to reduce parser load.
Pre-processing (Core IR, host compiler):
Frankenstein.Core.NormalizePatterns— convertsPatCon True/FalsetoPatLit 1/0and appendsPatWilddefaults to exhaustive multi-constructor cases. Replaces the formerfix-bool-patterns.pyscript (1,584 fixes/run).
Post-processing pipeline for stage 2/3 MLIR:
The three Python scripts that previously patched self-host runtime
divergences have been absorbed into a single Haskell module
Frankenstein.MlirEmit.PostProcess, invoked via the host compiler's
--postprocess-mlir CLI flag. Architectural immunity: this code runs in
the GHC-compiled host binary, never in the self-hosted compiler, so
the runtime bugs it works around cannot affect it.
Active post-processing:
Frankenstein.MlirEmit.PostProcess(Haskell) — replaces fix-intra-module-calls / fix-dollar0-refs / fix-mlir-arity. Seek-specs/postprocess-claims.kfor the formal immunity claim.merge-mlir-parts.py— deduplicatefunc.funcacross split parts (inherent to split compilation, not a divergence workaround)extract-mlir-funcs.py— per-part fallback extraction when a split part crashes (inherent to split-compile robustness)
Eliminated post-processing scripts (8 of 8 divergence-workaround scripts):
fix-bool-patterns.py→ HaskellNormalizePatternspassfix-captures.py→ dead code (already-fixed lambda-lift bug)fix-fld-refs.py→ dead code (superseded byA_sanitize_shim.c)fix-missing-else.py→ dead code (already-fixed emitter bug)fix-orphan-decls.py→ Haskell EFunRef declaration in emitterfix-intra-module-calls.py→ HaskellPostProcess.fixIntraModuleCallsfix-dollar0-refs.py→ HaskellPostProcess.fixDollar0Refsfix-mlir-arity.py→ HaskellPostProcess.fixMlirArity
Bootstrap fixed point: 24/24 modules match between stage 2 and stage 3, 21/21 E2E tests pass at every stage.
Stage 2 compiler binary: ~6.3 MB. Links against the same C runtime (kk_runtime.c, kk_cycle.c, shim_*.c, text shims, JSON parser).
Result: 21/21 end-to-end examples pass through the stage 2 compiler, producing identical outputs to the host compiler and stage 1.
Result: All 23 modules compiled through the stage 2 compiler produce byte-identical MLIR to stage 2 output. The self-hosted compiler has reached a fixed point — it reproduces itself.
Pipeline: same 23 OrganIR JSON files → stage 2 compiler → stage 3 MLIR, with the same 7 post-processing scripts applied. Stage 3 binary linked and passes all 21 E2E tests.
Key fix for convergence: The monolithic emitAppVar function (~490 lines,
936KB OrganIR definition) caused the self-hosted compiler to hang indefinitely
during compilation. Refactored into 6 arity-dispatched functions (emitAppVarWith0
through emitAppVarWith3 + emitAppVarGeneral), each producing a manageable
OrganIR definition that compiles in seconds.
BOOTSTRAP FIXED POINT: host → stage 1 → stage 2 → stage 3, all 23/23 modules match, all 21 E2E tests pass at every stage.
Strict s2≡s3 fixed point: ACHIEVED 2026-05-19 (was 9-11/26 historically).
Bootstrap reports *** FIXED POINT REACHED *** — 26/26 byte-equal, 21/21 E2E
at every stage. Resolution: clean rebuild fixed driver.o staleness (9-11/26 →
19/26), then four PostProcess.hs bug fixes resolved the remaining 7 stage-3
emit failures (commit 9cd995f). See docs/strict-fixed-point.md for
details.
-
Self-host runtime: pattern-match dispatch on ADT constructors is systematically wrong — still latent, still worked around by PostProcess. Minimum reproducer saved at
examples/db7_reproducers/wrong_default_body.hs(8 lines). Source-level refactor ofclassifyBranches.defaultBranchwas attempted 2026-05-19 (case-of-list →filter+null+head) — caused massive regression (26/26 → 3/26, 21/21 → 6/21), reverted. Confirms the original observation: every refactor attempt triggers a different manifestation. The bug is in compiled-output shape fragility, not in any one Haskell construct. Future fix needs MLIR-level instrumentation ofclassifyBranchesitself, not source-level rewrites. Thetools/diff-tester/--mode=host-runtime-vs-stage2-runtimedriver auto-finds new instances. Three observable consequences (all worked around by PostProcess, which runs in the host binary and is architecturally immune):emitConChain'smDefaultExprflips Nothing → Just, emitting dead-code references to out-of-scope pattern binders.- Split-compile
$Nexternals not resolving (each part only sees its ownesTopFns). - Lambda-lift capture-dropping at some
func.callsites.
-
sanitizeNamecorruption: Root cause identified —T.concatMap encodeCharin the self-hosted binary non-deterministically corrupts characters (closure dispatch or UTF-8 iteration). Mitigated byA_sanitize_shim.c. -
BRIDGE_haskell_strings:
main :: String,main = putStrLn "...", and chaineddo { … }blocks all print natively.main :: Stringroutes throughkk_println_haskell_chars(cons-list walker using the hash-based tagsstableConTag "[]" = 31636andstableConTag ":" = 46589).putStrLn(which GHC inlines tohPutStr2 stdout list TruefromGHC.Internal.IO.Handle.Text) is intercepted byghcIoOutputRuntimeinFrankenstein.GhcBridge.CoreTranslateand rewritten to a directprintln_haskell_chars/print_haskell_charscall. Multi-stepdo-blocks work because GHC keeps an explicit\s -> ...lambda binding the state token;applyMainIfFunctionAliasdetects when the user'smainis just anEVarreference to a lambda-bodied helper and rewrites the alias to apply it with a dummy state arg. Codepoints127 are now UTF-8 encoded on output (
café,한국어,🎉all round-trip correctly throughputStrLn): the runtime'skk_print_haskell_charsruns each Char codepoint through akk_putchar_utf8helper that emits the appropriate 1-4 byte sequence. Bridge-side,unpackFoldrCStringUtf8#(the foldr-style unpacker GHC emits after inlining string list folds) is now recognised alongsideunpackCStringUtf8#, so non-ASCII string literals route through the same UTF-8-decoding cons-list builder as ASCII literals. Seeexamples/hello_utf8.hs.show :: Int -> String,print :: Int -> IO (),show [Int],show (Maybe Int)(Just n/Nothing) all now work natively. Routing:- Int show:
isShowIntWorkerrecognises$w$cshowsPrec2and$fShowCallStack_itos', routed toint_to_haskell_chars. - [Int] show:
isShowIntListMethodrecognises$fShowInt_$cshowList, routed toint_list_to_haskell_chars(formats[n1,n2,n3]). - Maybe show:
knownShowCAFrecognises the prefix CAFs that GHC emits ($fShowMaybe1= "Just ",$fShowMaybe3= "Nothing"), inlining the literal[Char]cons-list at translation time. unpackAppendCString#(both Lit and Var forms) handled in trExpr so the prefix CAF lookup composes withunpackAppendCString# pfx (show inner).isDictArgno longer filters$fShow*CAFs with digit suffixes (they're CString CAFs, not dictionaries) — without this, the first arg ofunpackAppendCString#was being silently dropped.
deriving Showfor user ADTs now works for three common shapes:- Enum-only (
data Color = Red | Green | Blue deriving Show) - Single-constructor with args (
data Pair = Pair Int Int deriving Show) - Multi-constructor uniformly with-args
(
data Tree = Leaf | Node Tree Int Tree deriving Show)
The pieces wired in:
kk_haskell_chars_concatruntime helper ((++) :: [Char] -> [Char] -> [Char]) with bothkk_*and bare-name aliases for PAP wrappers.knownShowCharCAFinlinesshowSpace1=' ',$fShowCallStack2=')',$fShowCallStack3='(',$fShowCallStack4=','.GHC.Internal.Show.itos(unprimed) recognised byisShowIntWorker/pickShowArgsas a 2-arg int-to-cons-list helper.isDictDefkeeps$cshowsPrec/$cshow/$ccompare/$cfmapetc. (derived methods) while still filtering$cshowList(whose body referencesshowList__, currently unshimmed).
Mixed-ADT modules now work:
disambiguateLetBindingspost-pass inFrankenstein.GhcBridge.CoreTranslategives every let-binding a fresh Unique by adding a per-site counter offset, so no two binding sites can collide. Previously the lambda-lifter deduplicated by name and picked the capture set from whichever site it saw first; call sites in the other context then emitted the wrong argument count.Show for tuples works for shapes including 2-tuples, n-tuples (3, 4, 5, …), tuples with negative numbers, and nested tuples (e.g.
((Int, Int), Int)). The key bridge piece:$fShowCallStack_$sgo(showFn1, [showFn2, …, showFnN], tail)is unrolled at translation time viaexpandTupleShowChaintoshowFn1 (',' : showFn2 (',' : … (',' : showFnN tail) …)). The middle list is a static cons-list of closures that GHC fully inlines for fixed-arity tuples;collectStaticShowListwalks it at translation time.- Numbered CAFs
showList__N,$fShowCallStack8(minus sign), etc. added to knownShowCharCAF/knownShowCAF. _$cshowListmatching anywhere in the def name is filtered, including the$s$f...specialised variants.- External GHC
$cshowListreferences (via the never-forced thunk in the Show dict's showList slot) are routed through adummy_show_cafruntime stub so the dict links cleanly.
Still blocked: reading stdin/files, formatted output via
printf/Text.Printf, GADT-style data declarations. - Int show:
-
BRIDGE_rust_strings: Rust
println!(...)works for plain string literals, i64-arg format placeholders, and string (&str) args — including the leading-placeholder form (println!("{} = {}", a, b)). The bridge elidesArguments::<'_>::from_str/from_str_nonconst(thin Arguments wrappers) and pairsArguments::<'_>::new(template, args)with arust_args_packruntime cell;std::io::_printbecomesrust_print_dispatchwhich dispatches at runtime between kk_print_str (from_str path) and kk_rust_print_args (formatted path). The template's raw bytes are preserved through the IR by hex-encoding under a__RBYTES:marker (UTF-8 round-tripping would mangle bytes ≥ 0x80, like the 0xc0 placeholder marker).core::fmt::rt::Argument::<'_>::new_display::<T>is elided to unwrap the value._N.0field access goes throughrust_field_safewhich dispatches between heap-tuple kk_field reads (with kk_retain on the extracted value, since Perceus may drop the parent tuple before the field is consumed) and WithOverflow-flattened identity returns. Per-arg dispatch viakk_is_stringselects between kk_print_str and printf("%ld"). Debug{:?}format works for ints and strings: the bridge wrapsArgument::<'_>::new_debug::<T>with a runtime KK_RUST_DEBUG_TAG cell, and the dispatcher applies a Debug formatter (surrounding"…"quotes and\n/\t/\"/\\escapes for strings; identity to %ld for ints). Radix formats{:x}/{:X}/{:o}/{:b}also work via per-radix Argument wrappers (new_lower_hex/new_upper_hex/new_octal/new_binary). Seeexamples/rust_fmt.rs,examples/rust_dbg.rs,examples/rust_radix.rs. Field-spec syntax: width, alignment (left/right/center/default), zero-pad with sign-aware placement, custom fill chars, AND precision all work. For strings precision truncates byte count ({:.5}on "frankenstein" → "frank"); for ints precision is the minimum digit count, zero-padded between sign and digits ({:.5}on 42 → "00042"). Width and precision compose ({:10.5}on "frankenstein" → "frank "). Decoded from Rust's 4-byte spec + optional u16 width + optional u16 precision, with placeholder markersc0/c1/c3/c5/c7selecting which extras are present (bits encoded:c[bits 0..2]where bit 0=spec, bit 1=width, bit 2=precision-value). Seeexamples/rust_spec.rs.{:+}sign flag and{:#x}alternate form apply correctly. Sign flag prepends+to non-negative numerics (composes with width, precision, zero-pad). Alt form prepends0x(lower-hex and upper-hex),0o(octal),0b(binary) and composes with all earlier features —{:#010x}on 255 →0x000000ff(prefix at the front, zeros between prefix and digits, total width 10). Non-i64 integer types print correctly: the bridge inspects theArgument::<'_>::new_display::<T>type parameter and wraps with a per-type runtime tag (u32/i32/u64/u16/i16/u8/i8); the renderer masks to the correct width and uses %u/%d/%llu as appropriate (e.g. u32 of 4_000_000_000 prints as 4000000000 rather than as the sign-bit-set i32 it'd otherwise look like). Seeexamples/rust_numeric.rs.#[derive(Debug)]user struct printing produces faithfulPoint { x: 7, y: 13 }output. The bridge:- MirParse recognises
Name { field: val, … }syntax and emits an RvStruct rvalue with type name + named fields. - CoreTranslate dispatches to one of
rust_struct_1..rust_struct_8based on field count, passing the type name and comma-joined field names as metadata strings. - Runtime helpers allocate a KK_RUST_STRUCT_TAG cell with fields [name_str, field_names_str, val0, val1, …].
- The Debug formatter reads the metadata and prints
TypeName { field0: val0, field1: val1, … }. - Derived
<impl Debug for T>::fmtbodies still filtered (they reference unshimmed Formatter helpers). Structs with >8 fields fall back to positional(v0, v1, …). Enum variants print correctly across all three shapes: unit (Origin→ "Origin"), tuple (Circle(10)→ "Circle(10)"), and struct (Rect { w: 7, h: 13 }→ "Rect { w: 7, h: 13 }"). MirParse'sparseEnumTupleCtor/parseEnumUnitCtorrecognise theEnum::Variant(args)/Enum::Variantshapes and emit RvStruct rvalues whose name is the last path segment (so the enum prefix is stripped from Debug output, matching Rust). Seeexamples/rust_dbg_adt.rsandexamples/rust_dbg_enum.rs. f64 / f32 floats now route throughrust_arg_f64/rust_arg_f32wrappers — the bridge bit-casts MIR float literals (3.14159f64,0.5f32) into i64 viacastDoubleToWord64/castFloatToWord32inparseFloatBits, and the runtime cell carries the IEEE bit pattern. The printer reinterprets withmemcpyand renders Rust Display output via%g(3.14159, whole numbers trimmed: 3.0 → "3").{:.N}precision re-renders with%.Nf({:.2}of pi → "3.14"), honoring{:+}sign flag and{:W.Pf}width. Seeexamples/rust_float.rs. File I/O works via call-name remap:std::fs::read_to_stringandstd::fs::write::<&str, &str>route to the existing kk_read_file / kk_write_file intrinsics, withResult::<T,E>::unwrapelided at the bridge.io::stdin().read_line(&mut buf)works via a special-cased rebind of the mutable-reference target — the bridge scans the body's statements for_N = &mut _M, rewrites the call tokk_read_line(), and emits a let-shadow that gives_Mthe read line. Seeexamples/rust_file_read.rs,examples/rust_file_write.rs,examples/rust_stdin.rs. Non-ASCII string literals (café / 한국어 / 🎉) round-trip cleanly throughprintln!— the bridge stores the raw UTF-8 bytes viakk_string_from_literaland emits them verbatim (Rust source and stdout are both UTF-8, no decode/re-encode needed). Seeexamples/rust_utf8.rs. Still blocked: float literals that rustc promotes tomain::promoted[N](inlineprintln!("{:.0}", 3.5_f64)— bind to aletfirst).
- MirParse recognises
-
BRIDGE_mercury_strings: Mercury
:- pred main(io::di, io::uo) is det.withio.write_string/io.write_line/io.nlcalls now runs end-to-end. The bridge renames the user'smaintomain_io_impland synthesises a no-argmainalias;io.write_string(S, !IO)is routed to the runtime'sprint_str(andio.write_line/io.print_linetoprintln_str). String literals on the RHS of HLDS unifications bind toLitString. Trailing-period stripping inparseSingleGoalalready let fact-formis detpredicates compile. Seeexamples/hello.m. Still blocked: Mercury's broader io module (read, file handles, formatted output) and complex ADT deconstruction (e.g.examples/shape.m) still hits HLDSunifyfallback paths.
Two directions from ~/Dokumente/frankenstein.directions pursued as one batch to
stress-test opposite corners of the IR: Futhark arrays light up MLIR's linalg
dialect (new backend territory), Scheme call/cc stresses the closure/higher-
order machinery (validates "principled IR" claim).
The Phase 9 Futhark bridge only handled scalar i64. Phase 10a extends it to
emit real linalg.generic reductions for sum_iota n and dot_iota n:
- Allocates
memref<?xi64>viamemref.alloca - Fills it with
scf.foroverarith.index_cast - Reduces via
linalg.generic { iterator_types = ["reduction"], indexing_maps = [affine_map<(d0) -> (d0)>, affine_map<(d0) -> ()>] } - Lowered through
--convert-linalg-to-loops --expand-strided-metadata --finalize-memref-to-llvmadded to the native pipeline
Validated: examples/sum_iota.fut → 4950, dot_iota 100 → 328350.
New Frankenstein.SchemeBridge.{Reader,CoreTranslate} module. The reader is a
hand-rolled S-expression parser; the translator is a Danvy-style HOAS CPS
converter with administrative-redex beta reduction baked in at translation time.
The design decision that makes this work: Frankenstein's existing Evidence.hs
is single-shot inline-only, with no runtime continuation capture. Rather than
add runtime machinery, we CPS-convert the source so (call/cc f) becomes plain
higher-order code — [f]_cps(λfv. fv (λ(v _k). k v) k) — and the existing
PAP/closure machinery carries the reified continuation. First-class continuations
with correct escape semantics, without a single runtime change specific to them.
Uncovered two pre-existing Emitter/runtime bugs in the process:
EApp (non-var) argswas emittingllvm.extractvalueagainst an i64 closure pointer as if it were an LLVM struct. Fixed to use the samekk_fieldindirection as the variable path, with the closure threaded as the leading arg.kk_dropwalked every field of every boxed object — including field 0 of a closure, which holds a raw function pointer into.text, triggering a segfault on the refcount write. Fixed by giving closures a distinct tag (KK_CLOSURE_TAG = 0x434C4F53 "CLOS") and skipping field 0 inkk_drop,for_each_child, andcollect_white. Closures are also marked acyclic.
Validated: examples/arith.scm → 42, examples/escape.scm → 100 (call/cc
bypasses the surrounding (+ 10 ...)). --demo --compile regression still
produces 3628800. All 56 cabal tests pass plus 2 new Scheme structural tests.
Goal: Make EHandle/EPerform support multi-shot continuations natively,
not just zero-shot (abort) and single-shot (tail-resumptive). Distinct from
the specialized mercury_choose binary-choice runtime primitive — this is a
general handler API where the handler receives (args..., resume) and may
invoke resume any number of times with any value.
Frankenstein.Core.EffectOpt.HandlerKind = HKAbort | HKTail | HKMulti and
the classifyHandler function partition handlers by how the last parameter
(the resume continuation) is used:
- 0 references →
HKAbort(existing setjmp/longjmp lowering) - 1 reference in tail position →
HKTail(existing inlining lowering) - 1 in non-tail position or ≥2 references →
HKMulti(CPS lowering)
Frankenstein.Core.CpsConvert — pure Plotkin-style CPS transformation
threaded through a thin Cps monad for fresh-name generation. Handles all
expression forms. The key insight is let-fusion at EPerform sites:
cps[let x = M in N] k = cps[M] (\v -> Let x = v in cps[N] k)
so the handler's continuation captures the rest of the body, not just an identity binder. 12 cabal unit tests cover both the classifier and the CPS converter.
Frankenstein.Core.Evidence.evidenceExpr routes HKMulti handlers
through CPS conversion, then runs substEFunRef to replace the
sentinel EFunRef qn (left by the CPS converter at perform sites)
with EVar evName — a regular variable reference to the handler's
evidence binding. Frankenstein.Core.EffectOpt.inlineLocalHandler
has a matching guard to skip HKMulti so optimization doesn't
collapse multi-shot semantics back to single-shot.
examples/effect_nondet.json:
effect nondet { choose : int }
handler = \dummy resume -> let r1 = resume(1)
in let r2 = resume(0)
in r1 + r2
body = let b = perform choose 0 in case b of 1 -> 10; _ -> 20After multi-shot evidence pass:
let ev_nondet = handler in
ev_nondet(0, \k -> let b = k in case b of 10 | 20)
When resume(1) is invoked, the continuation yields 10. resume(0)
yields 20. Handler returns r1 + r2 = 30.
End-to-end native binary output: 30. First working multi-shot effect handler in Frankenstein, all the way through host → MLIR → LLVM IR → native binary.
k-specs/multishot-claims.k — 9 kprove claims, all proven #Top:
- MS1: Classifier on canonical shapes (abort / tail / multi).
- MS2:
countAppsOfcomposition correctness (literal, direct apply, nested apply). - MS3: Structural invariants (1-param ELam / non-lambda → never multi-shot).
New K helpers in EFFECTOPT-CHECKERS: countAppsOf (and list/BG/branch
variants), isMultiShotHandler.
Full 3-stage bootstrap holds: 24/24 modules match between stage 2 and stage 3, 21/21 E2E tests pass at every stage. Multi-shot infrastructure is dormant for bootstrap modules (none of which use multi-shot handlers) and active only when invoked.
Goal: Module A performs an effect, Module B handles it — effects work across language boundaries after linking.
The problem: The evidence pass ran per-module before the linker, so Module A's
EPerform(exn/fail) was resolved to a default handler call exn_fail() — never
reaching Module B's EHandle(exn, ...).
The fix: Build a global effect registry from all modules' progEffects and
run evidencePassGlobal with that registry on each module before the linker
mangles names. This gives every module visibility into every other module's effect
declarations while keeping names unmangles for correct resolution.
BEFORE: compile → evidencePass (per-module, local effects only) → linker → emit
AFTER: compile → evidencePassGlobal (global registry) → linker → emit
Pipeline change in Main.hs:
- Collect
allEffectDecls = concatMap progEffectsacross all compiled modules - Build global registry via
collectGlobalEffectson the combined declarations - Run
evidencePassGlobal globalEffectson each module before linking - The
--emit-effect-mlirpath still preserves rawEHandle/EPerform
Result: All 97 cabal tests pass (including new cross-module unit test),
5/5 polyglot E2E tests pass, --demo --compile → 3628800.
Goal: Link Frankenstein's self-compiled .o files into a standalone binary that exercises the compiler's own code — proving Frankenstein can bootstrap real work.
Result: All 20 Frankenstein modules compile through the full pipeline
(frankenstein <file.hs> --emit-mlir | mlir-opt | mlir-translate | clang -c) and
link into a 1.4 MB self-hosted binary with 67 passing tests. The binary runs
all four compiler passes (ConTags, Perceus, Evidence, MLIR emission) on Core IR
constructed in C, and the emitted MLIR compiles through the LLVM toolchain to
produce a native binary that computes factorial(10) = 3628800.
Build: bash self-host/build.sh — compiles all 20 modules, links, runs 67
self-tests, validates factorial MLIR through the full pipeline.
| Metric | Value |
|---|---|
| Modules compiled | 20/20 |
| Total .o size | ~2.5 MB |
| Binary size | 1.4 MB |
| Self-tests | 67/67 pass |
| Compiler passes exercised | assignProgramTags, insertPerceus, evidencePass, emitProgramText |
| Full pipeline validation | factorial(10) → MLIR → mlir-opt → clang → 3628800 |
Lazy selector fix: GHC compiles let (a, b) = expr as two lazy selector
thunks that share a cached pair. Perceus inserts drops in each selector for the
unused field, which originally caused use-after-free when both selectors force
the same cached pair. Fix: kk_thunk_force now retains the cached result on
every access (retain-on-force semantics), giving each consumer its own reference.
kk_drop is fully functional — all 12 stage 2 examples pass.
C shim surface: 423 external symbols across Data.Map, Data.Set, Data.Text,
GHC.Internal.*, State monad, and standard library functions. All resolved by
C shims in self-host/shim_*.c implementing minimal versions using the kk_*
runtime's allocation primitives.
What the self-hosted binary proves: Frankenstein's MLIR emitter (2500 lines of Haskell with State monad, Data.Map, Data.Set, Data.Text, pattern matching, closures, lazy evaluation) has been compiled through the compiler's own pipeline and produces correct, optimizable MLIR. The factorial result matches the host compiler's output, proving self-hosted compilation is functionally equivalent.
- 8 bridges: GHC (real API +
foreign import ccallFFI), Rust (MIR text+JSON +extern "C"FFI), Mercury (HLDS), Koka (library API), Python (ast S-expr), Go (go/ast S-expr), Futhark (in-tree Pratt parser), Scheme (S-expr + CPS) - Multi-module compilation: GHC bridge chases imports through the module graph
(
compileToCoreMulti), compiles all home-package modules in a single session, returns[Program]. Cross-module name resolution in the linker (resolveNameparsesModule/nameformat, disambiguates viapreferMod). - Cross-language multi-module: Koka extern declarations call Haskell functions compiled from multiple modules. Demo: 2 Haskell modules + 1 Koka module → single binary → 75.
- Core IR: Multiplicity, effect rows, Perceus ops, laziness ops
- Perceus pass: Drop + retain insertion, formally verified (20 kprove claims)
- Evidence pass: Single-op and multi-op effect dispatch with cross-module resolution, 13 kprove claims
- Linker: Multi-module merging with cross-module name rewriting, 20 kprove claims
- MLIR emitter: func/arith/scf/llvm dialects, lambda lifting, closures with
real function pointers, thunks, bool/char/int/float/string support, cycle candidate
marking (
kk_cycle_candidateafterkk_alloc_conin statically-detected cyclic defs) - Runtime: Perceus RC (
kk_retain/kk_drop), boxed values, thunks, retain-on-force semantics for shared lazy selectors - K specs: OrganIR typing + Perceus + full effect semantics (organ-ir.k, 1229 lines, 240 rules), 118 krun tests (incl. 42 algebraic effect tests with Mercury semidet/choice patterns), 47 bridge property tests, 313 kprove claims (43 Perceus + 155 bridge + 30 evidence + 44 linker + 41 effectopt)
- Effect semantics in K: Full
EPerform/EHandlewith delimited continuation capture, abort (exn) and resume (choice) patterns, nested handler support - K oracle (Phase 2b): QuickCheck differential testing — random OrganIR programs run through both krun and MLIR pipeline, outputs compared (70 property tests)
- Bridge bisimulation (Phase 2c): For each bridge,
krun(translate(source))compared against expected values and native compiler output. Verified:- GHC: arithmetic + factorial(10)=3628800, with native
ghccomparison - Koka: arithmetic, krun matches expected
- Rust: arithmetic, with native
rustccomparison - Mercury: structural (semantic pending HLDS variable resolution)
- Expression cleaning pipeline: strip laziness/Perceus ops, normalize builtins, simplify I# boxing, reorder branches, self-application for recursion
- GHC: arithmetic + factorial(10)=3628800, with native
- Extended kprove claims (Phase 2d): 100 new claims beyond original 20 Perceus:
- Bridge claims (67): all 47 property functions promoted to formal verification with concrete positive/negative test cases per property
- Evidence claims (13): no-EHandle/EPerform post-pass, single-op/multi-op binding structure, nested effect scope preservation
- Linker claims (20): local names preserved, main never mangled, module-prefix mangling correct, call graph preservation, shouldRewrite consistency
- Phase 3a: Haskell RC feasibility ✓: Factorial.hs end-to-end through full pipeline, profiled with instrumented runtime — 14KB binary, 2.3x faster than GHC, zero heap allocs, 42 no-op RC calls. Pain points documented: cycles, thunk chains, dictionaries, sharing, boxing
- Phase 3b: GHC Core patterns ✓: All 4 patterns handled — typeclass dictionaries (resolved at -O1), unboxed primops (+#/-#/<#/==#), worker/wrapper ($w workers), join points (nested cases). Test programs: TypeclassTest(42), UnboxedTest(5050), WorkerWrapperTest(55), JoinPointTest(0)
- Phase 3c: Cycle detection ✓: Bacon-Rajan trial deletion cycle collector in
runtime/kk_cycle.c, static cycle analysis inCore/CycleAnalysis.hs, 10/10 C tests (pairs, triples, self-ref, mixed), 8 new K tests for RC on heap objects - Runtime: Perceus RC with cycle collection, recursive child dropping, nfields side table, color encoding in refcount word
- Phase 3e: Standard library types ✓: Disabled GHC RULES pragmas (
-fno-enable-rewrite-rules) so standard constructors (:,[],Just,Nothing,True,False,(,)) survive -O1 intact instead of being fused intobuild/foldr. AddedcollectReferencedTyConsto walk Core expressions and extract stdlib TyCons viadataConTyCon, merging withmg_tcsso DataDecls appear in progData. Five new end-to-end examples:stdlib_list.hs:sumList [1..5] = 15— standard[Int]with pattern matchingstdlib_maybe.hs:fromMaybe + Just/Nothing = 141— standardMaybe Intstdlib_bool.hs: guards,otherwise,negate = 7— standardBoolstdlib_tuple.hs:swap + addPair = 13— standard(Int, Int)prelude_hof.hs:myMap/myFilter/myFoldr on [Int] = 22— HOFs on stdlib lists with Perceus RCprelude_inline.hs: real Preludemap/filter/sum= 24 — GHC inlines at-O1with-fexpose-all-unfoldings -fspecialise-aggressively -fcross-module-specialiseprelude_comprehensive.hs: real Preludemap/filter/sum/take/zipWith/foldl= 235 All 21 examples compile and run correctly. Prelude HOFs are fully inlined by GHC's aggressive specialization flags — no C shims needed.
- Phase 3d: Benchmark suite ✓: 3 benchmarks (fib/tak/ack) × 4 compilers (Frankenstein/GHC/Rust/Koka),
automated
bench/run.shscript. Frankenstein: 18.6 KB binary (1400x smaller than GHC), lowest memory (1.5 MB), 6x slower than GHC on fib(42) due to no-op retain overhead on unboxed values. Multi-arg lambda collection and nameToSsa fixes for multi-param GHC workers. - Phase 4: MLIR Effect Dialect ✓:
frankenstein.handle/perform/resumeops inDialects.hs, effect-dialect emission mode inEmitter.hs,--emit-effect-mlirCLI flag. Three Core IR optimization passes inEffectOpt.hs: handler inlining, identity handler elimination, tail-resumptive detection. Integrated into pipeline before evidence pass. - Phase 5: Wasm Backend ✓:
--compile --target wasm32produces.wasmbinaries. 485-byte factorial demo runs in Node.js and browser. Freestanding Wasm runtime with bump allocator. Browser demo atweb/index.html. Pipeline: MLIR → llc(wasm32) → wasm-ld. - Self-hosted binary ✓: 23/23 modules compile through own pipeline. 3-stage bootstrap reaches fixed point — stage 2 and stage 3 produce byte-identical MLIR for all 23 modules. All 21 E2E tests pass at every stage. All 4 compiler passes exercised (ConTags, Perceus, Evidence, MLIR). Full pipeline validation: factorial(10) → MLIR → mlir-opt → clang → 3628800.
- FFI cross-language imports: Native FFI mechanisms in major bridges resolve through
the polyglot linker's symbol table, enabling symmetric multi-language composition:
- Haskell
foreign import ccall: GHC bridge detectsFCallIdin Core, extracts C function name fromCCallSpec/StaticTarget, stripsrealWorld#state tokens and unboxed(# State#, result #)tuple destructuring. Haskell functions can call Python, Go, Rust, etc. via standardforeign import ccall "symbol_name"syntax. - Rust
extern "C": MIR bridge now correctly parses call terminators with external targets (_0 = square(copy _1) -> [return: bb1, unwind unreachable]). Fix: MIR text parser no longer wraps call terminators inAssign((...))which prevented the call terminator parser from recognizing them. Rust functions can call Python, Haskell, etc. via standardextern "C" { fn symbol_name(...); }syntax.
- Haskell
- Cross-language coverage: 12 polyglot E2E tests in
test-polyglot.sh:- 3-lang (Haskell+Rust+Koka) → 69
- 4-lang semidet success/failure (Haskell+Rust+Mercury+Koka) → 69/1
- Cross-lang multi-module (Haskell×2+Koka) → 75
- Haskell stdlib cross-lang (map/filter/sum from Koka) → 220
- 7-lang all bridges (Haskell+Rust+Mercury+Python+Go+Futhark+Koka) → 147
- 7-lang multi-module (Haskell×2+Rust+Mercury+Python+Go+Futhark+Koka) → 175
- 12-lang all bridges (Haskell+Rust+Mercury+Python+Go+Futhark+Swift+OCaml+Erlang+F#+Idris+Koka) → 440
- Haskell FFI cross-lang (Haskell
foreign import ccall→ Python+Go+Koka) → 157 - Rust FFI cross-lang (Rust
extern "C"→ Python+Haskell+Koka) → 69
- Organ-bank integration: OCaml shim produces OrganIR JSON consumed end-to-end through
frankenstein's
OrganIR.Consumer→ Core → MLIR → native (factorial(10)=3628800, cube(5)=125). SML/Lua/Erlang/Prolog/Forth frontends produce structured OrganIR; Lua shim consumable but runtime type mismatch (any vs int). C/C++ shims at wrong abstraction level (LLVM IR as strings). - Test suite: 97 cabal tests (incl. cross-module effect test), 12 polyglot E2E, 3 Wasm validation tests, K test oracle, 118 krun tests, 10 cycle collector C tests, 21 self-host E2E examples (pass all 3 stages)
- End-to-end:
--demo --compile→ 3628800,--demo --compile --target wasm32→ 3628800 in Node.js
- Bootstrap fixed point (010016d): 3-stage self-hosted compiler converges — 23/23
modules produce byte-identical MLIR in stages 2 and 3. Key fix: refactored monolithic
emitAppVar(~490 lines, 936KB OrganIR) into 6 arity-dispatched functions to eliminate a compiler hang during self-compilation. Size-based JSON splitting (400KB target) with minification.compile_stage()andrun_e2e_tests()extracted as reusable bash functions. All 21 E2E tests pass at all 3 stages. - Stage 2 bootstrap: 23/23 — per-part fallback with global injection, cabal exec race fix, KokaCore TypeDefs, guard desugaring. Two modules (MlirEmit/Emitter, KokaBridge/Driver) still crash 2-3 split-parts each but fallback extraction +
fix-missing-else.pytruncated-scf repair +llvm.mlir.globalinjection from stage1 produces valid .o files. Regex fix:extract-mlir-funcs.pyglobal name capture was\S+(greedily matched into string content), now[A-Za-z0-9_.$]+. Stage 2 compiler passes all 21 E2E tests. - FFI cross-language imports — Haskell
foreign import ccalland Rustextern "C"now resolve through the polyglot linker. GHC bridge detectsFCallIdvars viaidDetails, extracts C function names fromCCallSpec/StaticTarget, stripsrealWorld#state tokens and unboxed tuple(# State# RealWorld, result #)destructuring. MIR bridge fix:convertTextLineno longer wraps call terminators inAssign((...)), enabling proper parsing of_0 = func(args) -> [return: bbN, ...]patterns. Two new demo tests: Haskell FFI cross-lang (Haskell→Python+Go+Koka → 157), Rust FFI cross-lang (Rust→Python+Haskell+Koka → 69). Polyglot test suite now at 12 tests (11 passing, 1 pre-existing Mercury choice issue). - 12-language demo — all 12 direct-style in-tree bridges (Haskell, Rust, Mercury, Python, Go, Futhark, Swift, OCaml, Erlang, F#, Idris, Koka) compose into a single binary → 440. Each function compiled through its real compiler's API/IR. Organ-bank OCaml shim verified end-to-end through OrganIR JSON → Consumer → Core → MLIR → native.
- Expanded cross-language multi-module coverage — 7-language demo (Haskell+Rust+Mercury+Python+Go+Futhark+Koka → 147), 7-language multi-module demo (Haskell×2 + 5 languages → 175), Haskell stdlib cross-language (map/filter/sum called from Koka → 220). Polyglot test suite expanded to 10 tests. Confirmed Prelude HOFs (map/filter/foldr/sum/take/zipWith/foldl) are fully inlined by GHC at -O1 with aggressive specialization flags.
- Multi-module GHC bridge + cross-language demo + cycle collector wiring —
compileToCoreMultichases imports through GHC module graph,resolveNamehandlesModule/namecross-module references,CycleAnalysisresults wired into MLIR emitter viaesCyclicDefs/emitCycleCandidate, cross-language demo (2 Haskell + 1 Koka → 75), cross_module added to Phase 8 (19 examples pass stage 1, 13 pass stage 2). - Phase 3f: String support + builtins as first-class values + stage 2 segfault fix — Three fixes: (1)
builtinWrapperSpecin Emitter.hs generates wrapper closures for+,-,*,/,mod,==,<, etc. when used as first-class values (HOF arguments); (2) Address primopsindexCharOffAddr#/plusAddr#for post-simplifierunpackCString#byte-walking loops, withLitStringdual semantics (cons-list in Core IR, rawAddr#pointer in emitter); (3)fix-intra-module-calls.pygenerates 86 MLIR wrapper functions for split-compiledMlirEmit_Emitter— the split compilation brokeesTopFnspopulation causing cross-part function calls to resolve to null. All 18 host-compiled examples pass. Stage 2 compiler no longer segfaults — all 12 examples pass through stage 2 (alloc_stress fixed via retain-on-force in kk_thunk_force). - Phase 3e: stdlib types — disable RULES, collect referenced TyCons, 5 new stdlib examples (list/maybe/bool/tuple/hof), all 16 examples pass
- Fix MlirEmit_Emitter stage2 compilation — split emitExpr, flatten deep nesting, 10-part split-compile, 23/23 modules compile, 12/12 e2e tests pass
- Self-hosted binary — 19/20 modules compile through own pipeline, link into 800 KB binary, 17/17 self-tests pass. Lambda/thunk module-prefix fix in emitter.
self-host/build.sh+self-host/main.c. - Cross-module effect dispatch — global effect registry enables Module A to perform effects handled by Module B. Pipeline reorder:
evidencePassGlobalwith combined registry runs before linker name-mangling. 97 cabal tests, 5/5 polyglot E2E. ae4f4ee— Phase 2: K as living specification, 116 krun tests, noPatterns function, Mercury semidet/choice krun tests- Phase 9: Go + Futhark frontends (6th and 7th languages) — Two new bridges added in one go. Go (
Frankenstein.GoBridge.{AstParse,CoreTranslate}) shells out to a small Go helper atgo-bridge/ast_to_sexp.gothat uses the standard librarygo/parser+go/astto dump a tightly-restricted S-expression. The Haskell side runs the helper (auto-builds it viago buildon first invocation), parses S-exprs (mirrors the Python S-expr parser), and translates the same statement-block early-return shape used by the Python bridge. Supported subset:func,return,if/else,Assign,BasicLit(int),Ident,BinaryExpr,UnaryExpr,CallExpr,ParenExpr,GenDecl→Skip. Op tokens align directly with canonical primitives (+,<=,%→mod,&→andI#, etc.). Goroutines/channels/methods/interfaces/structs/slices are explicitly out of scope. Futhark (Frankenstein.FutharkBridge.{Parser,CoreTranslate}) is fully in-tree — no externalfutharkbinary dependency. A ~270-line hand-rolled Pratt/precedence-climbing parser inParser.hsaccepts top-levellet name (p: t) ... : ret = exprdefinitions, integer literals, identifiers, function application by juxtaposition, parens, binary ops (+ - * / %arith,== != < <= > >=comparisons,& | ^bitwise) with proper precedence levels (2/3/4), unary minus,if/then/else, andlet x = e in body. Type annotations are accepted and discarded (everything isi64). Arrays, SOACs, modules, lambdas, records, tuples are deliberately rejected. Both bridges wired intocompileFilevia.goand.futextensions. End-to-end:examples/factorial.go --compileandexamples/factorial.fut --compileeach produce native binaries that print3628800. Test suite: 56 cabal tests (52 prior + 4: arith.go K-bisim, arith.fut K-bisim, factorial.go structural, factorial.fut structural). The factorial K-bisim is structural-only for the same reason as Python (early-return →case (n<=1) of 0 -> ... ; _ -> ...doesn't match the K oracle's constructor-pattern expectation), but the native pipeline handles them correctly. - Phase 8: Python frontend (5th language) —
Frankenstein.PythonBridge.{AstParse,CoreTranslate}shells out to a smallpython-bridge/ast_to_sexp.pyhelper that walksast.parse()and emits a tightly-restricted S-expression. The Haskell side parses S-exprs (35 lines, no aeson dep) and translates to OrganIR. Supported subset:def,return,if/else(early-return pattern),Assign, integer literals,Name,Call, binary ops+ - * // %, comparisons== != < <= > >=, unary negate. All values typed asint, multiplicityMany. Operator names map directly to the canonical primitives the MLIR emitter recognises (+,<=, etc.) so no special-casing was needed inEmitter.hs. Wired intocompileFilevia.pyextension. End-to-end:examples/factorial.py --compileproduces a 19 KB ELF that prints3628800. Test suite: 52 cabal tests (50 prior + arith.py K-bisim + factorial.py structural). The factorial K-bisim is structural-only because the early-return pattern desugars tocase (n<=1) of 0 -> ... ; _ -> ...and the K oracle expects constructor patterns, notPatLiton comparison results — the native pipeline handles it correctly. - Phase 7: K-verify EffectOpt — 18 kprove claims for the three EffectOpt passes (
inlineLocalHandlers,eliminateIdentityHandlers,annotateTailResumptive) ineffectopt-claims.k. Helper predicates added toEFFECTOPT-CHECKERSmodule inall-claims-def.k:isIdentityHandler(uncurried + curried\(x,k) -> k(x)),isTailCall/allBranchesTailCall(recursive structural check throughELet/ECase),isTailResumptiveHandler,countPerforms/countPerformsList(per-effect-module count),noPerformsOf. Claim groups: EO1 identity-handler recogniser (5 claims, positive + negative cases), EO2 tail-call detection (4 claims), EO3 tail-resumptive detection (4 claims), EO4 perform counting (5 claims). All 18 claims#Topunder kprove (Haskell backend), no rewrites needed — pure functional checks. Existing perceus/evidence/linker/bridge claim files still pass against the regenerated definition. - Phase 6e: Self-hosted MLIR → native objects — all 17 self-hosted modules lower cleanly through
mlir-opt --convert-{scf,arith,cf,func}-to-llvm→mlir-translate --mlir-to-llvmir→clang -cto real ELF objects (totalling ~1.8 MB; Emitter.o is 702 KB). Required fix: everyfunc.constant @fn→ i64 cast now goes via!llvm.ptr(unrealized_conversion_castto!llvm.ptrthenllvm.ptrtoint) soreconcile-unrealized-castscan erase the intermediate after--convert-func-to-llvm. Direct func-type → i64 casts were leaking past mlir-translate as LLVM-incompatible types. End-to-end runnable: linked the self-hostedCore/Types.oagainst the C runtime and a small driver, called the frankenstein-compiledbindName/bindExprrecord selectors on a heap-allocated Bind, got back the correct field values. All 50 cabal tests pass;--demo --compilestill produces 3628800. - Phase 6d: Self-hosting cleanup — 17/18 modules emit fully valid MLIR (the 18th, OrganIR/Consumer.hs, fails earlier in the GHC frontend due to a pre-existing text-2.1.3/2.1.4 package skew unrelated to MLIR). PAP closures via
kk_alloc_confor undersaturated top-level calls, oversaturated path that calls then closure-indirects the remainder, uniform i64 ABI at top-level fn boundaries, string literals immediatelyptrtoint-ed to i64, ELet alias scoping (save/restore around let body to prevent leakage into sibling scf.if branches),ETypeLamstripping inemitDefso emitted arity matchesbuildTopFnArity, and droppingllvm.unreachablein unhandled-case fallback (was illegal inside scf.if regions). All 50 cabal tests pass;--demo --compilestill produces 3628800. - Phase 6c: Full self-hosting — all 18 modules through GHC bridge, 3 emit fully valid MLIR, 14 have 1–12 residual errors out of thousands of lines, closure ABI + scf.if alias scoping + func.constant fptrs
- Phase 6b: Self-hosting Perceus.hs — closure ABI via kk_alloc_con, capture filter, lambda param renaming
- Phase 6a: Self-hosting bootstrap — Core/Types.hs through GHC bridge → MLIR validates clean
- Phase 5: Wasm backend — --target wasm32, 485-byte .wasm, browser demo, Node.js validation
- Phase 4: MLIR effect dialect — frankenstein.* ops, effect optimizations, --emit-effect-mlir
- Phase 3d: Benchmark suite — fib/tak/ack × 4 compilers, multi-arg lambda fix, nameToSsa
- Phase 3c: Cycle detection — Bacon-Rajan collector, static analysis, C tests, K tests
- Phase 3b: GHC Core patterns — primops, lambda-not-thunk, Bool→i64, negate, test programs
- Phase 3a: Haskell RC feasibility — Factorial.hs E2E, I# simplification, print builtin, profiled runtime
89367e2— Phase 1: Fix polyglot demo — semidet test result, HLDS comment parsing, external runtime dispatch9ef75e4— Redirect all diagnostic output to stderr for clean MLIR piping54e11e1— Phase 4: Fix effect optimization traversals and --emit-effect-mlir pipeline- Phase 2d: extended kprove claims (313 total: perceus, bridge, evidence, linker, effectopt)
- Phase 2c: bridge bisimulation proofs (GHC, Koka, Rust, Mercury, Python, Go, Futhark, Scheme)
ac1a533— Phase 1b: polyglot test suite, Mercury choice effect (multi-shot)093f0ce— Closures, thunks, MIR parsing, linker, evidence, strings95f43c8— Perceus retains, MLIR improvements, data decls, tests, kprove3a447a8— Bridge property K specs, kprove claims, MLIR boolean codegen fix
- Multiplicity annotations: Linear (use once), Affine (use at most once), Many (unrestricted) — inferred per-bridge, refined by Perceus
- Effect rows: Open rows with extension/variables — Koka-style
- Perceus RC: Drop for unused, retain for multi-use, skip for linear
- Laziness: EDelay/EForce with memoizing thunks — Haskell semantics
- GHC: strict → no delay, lazy → delay, absent → dead, forall → KStar+Many
- Rust: all args affine, move → no retain, copy → retain, scope exit → drop
- Mercury: det → pure, semidet → exn, multi → choice, di/uo → linear
- Koka: all mul Many (Perceus re-derives), effects preserved, synonyms expanded
OrganIR → Evidence Pass → Perceus → MLIR Text → mlir-opt → mlir-translate → clang + kk_runtime.c → a.out
organ-ir.k → kompile (LLVM backend) → krun tests (104 pass)
organ-ir.k → kompile (Haskell backend) → kprove perceus-claims.k (20 verified)
all-claims-def.k → kompile (Haskell backend) → kprove bridge-claims.k (67 verified)
all-claims-def.k → kompile (Haskell backend) → kprove evidence-claims.k (13 verified)
all-claims-def.k → kompile (Haskell backend) → kprove linker-claims.k (20 verified)
all-claims-def.k → kompile (Haskell backend) → kprove effectopt-claims.k (18 verified)
bridge-properties.k → kompile (LLVM backend) → krun tests (47 pass)
bridge bisimulation → krun(translate(source)) == native compiler (7 tests)
K oracle → krun(random_expr) == MLIR_pipeline(random_expr) (70 property tests)