perf: replace talc with an in-tree bump allocator#45

Merged: malt3 merged 1 commit into main from bump-allocator on Jun 16, 2026
@malt3 malt3 commented Jun 16, 2026

The stub is a short-lived process that parses the manifest, resolves paths, then calls execve / ExitProcess. It never frees, so a general-purpose allocator is overkill. Replace talc with a tiny bump allocator (a single Cell offset into the existing 8 MiB .bss arena; dealloc is a no-op; the stub is single-threaded, so no synchronization is needed) and drop the talc dependency entirely.
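For readers who have not seen one, here is a minimal sketch of a bump allocator of the kind described above. It is not the code merged in this PR: the type name `Bump`, the const-generic arena size, and the `HEAP` static in the trailing comment are assumptions made for illustration. Only the overall shape (a `Cell` offset into a zeroed static arena, a no-op `dealloc`, no locking) comes from the description.

```rust
use core::alloc::{GlobalAlloc, Layout};
use core::cell::{Cell, UnsafeCell};
use core::ptr;

/// Arena size from the PR description: 8 MiB, all zeros so it can live in .bss.
pub const ARENA_SIZE: usize = 8 * 1024 * 1024;

/// Hypothetical bump allocator: hands out memory front to back, never reuses it.
pub struct Bump<const N: usize> {
    arena: UnsafeCell<[u8; N]>,
    /// Number of arena bytes already handed out (including alignment padding).
    offset: Cell<usize>,
}

// SAFETY: only sound under the assumption that the process is single-threaded.
unsafe impl<const N: usize> Sync for Bump<N> {}

impl<const N: usize> Bump<N> {
    pub const fn new() -> Self {
        Bump { arena: UnsafeCell::new([0; N]), offset: Cell::new(0) }
    }
}

unsafe impl<const N: usize> GlobalAlloc for Bump<N> {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let start = self.arena.get() as *mut u8;
        let base = start as usize;
        let mask = layout.align() - 1; // align is a non-zero power of two

        // Round the next free absolute address up to the requested alignment.
        let cur = base + self.offset.get();
        let aligned = match cur.checked_add(mask) {
            Some(v) => v & !mask,
            None => return ptr::null_mut(),
        };
        // Bounds check: the allocation must end inside the arena.
        let end = match aligned.checked_add(layout.size()) {
            Some(v) if v <= base + N => v,
            _ => return ptr::null_mut(), // out of memory
        };

        self.offset.set(end - base);
        start.add(aligned - base)
    }

    unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {
        // Intentionally empty: memory is reclaimed when the process exits.
    }
}

// In a single-threaded binary it could be registered like this:
// #[global_allocator]
// static HEAP: Bump<ARENA_SIZE> = Bump::new();
```

Returning a null pointer on exhaustion follows the `GlobalAlloc` contract, so the usual allocation-error handling applies. Because `dealloc` does nothing, peak memory use equals total bytes ever allocated, which is acceptable only because the process exits shortly after startup.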

Smaller on every release binary:

  x86_64-unknown-linux-musl   17168 -> 15048  (-2120)
  aarch64-unknown-linux-musl  16560 -> 14480  (-2080)
  s390x-unknown-linux-musl    18616 -> 16496  (-2120)
  x86_64-apple-darwin         29444 -> 25316  (-4128)
  aarch64-apple-darwin        67008 -> 50448  (-16560)
  x86_64-pc-windows-gnullvm   28672 -> 26624  (-2048)
  aarch64-pc-windows-gnullvm  27648 -> 25600  (-2048)
  TOTAL                      205116 -> 174012  (-31104, ~15%)

The arm64-darwin (aarch64-apple-darwin) drop is a full 16 KB page: the smaller code pulls __TEXT back under the 16 KB Mach-O page boundary it previously spilled over. The allocator's alignment, bounds, and OOM logic is verified by a host round-trip test; the integration test still passes.
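The page-boundary effect can be illustrated with a little rounding arithmetic. The sketch below assumes a segment's on-disk size is its content length rounded up to a 16 KiB page; the helper names and the byte counts in the usage note are hypothetical and are not measurements from this PR. For reference, the reported 16,560-byte drop equals one 16,384-byte page plus 176 bytes.

```rust
/// Page size assumed for arm64 Mach-O segments (16 KiB), per the description.
pub const PAGE: u64 = 16 * 1024;

/// Round a content length up to the next multiple of the page size.
pub fn round_up_to_page(len: u64) -> u64 {
    (len + PAGE - 1) / PAGE * PAGE
}

/// On-disk bytes saved when page-padded content shrinks from `before` to `after`.
pub fn saved_on_disk(before: u64, after: u64) -> u64 {
    round_up_to_page(before) - round_up_to_page(after)
}
```

With these helpers, content of 16,400 bytes rounds up to 32,768 bytes while 16,300 bytes rounds up to 16,384, so trimming only 100 bytes of content saves a whole 16,384-byte page on disk.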

@github-actions

Binary Size Report

Comparing 14d80e41 (base) → 343b56f5 (PR)

Binary                             Base      PR        Change
finalize-stub-aarch64-linux        2.97 MiB  2.97 MiB
finalize-stub-aarch64-macos        2.64 MiB  2.64 MiB
finalize-stub-aarch64-windows.exe  2.72 MiB  2.72 MiB
finalize-stub-s390x-linux          3.03 MiB  3.03 MiB
finalize-stub-x86_64-linux         3.52 MiB  3.52 MiB
finalize-stub-x86_64-macos         3.07 MiB  3.07 MiB
finalize-stub-x86_64-windows.exe   3.10 MiB  3.10 MiB
runfiles-stub-aarch64-linux        16.2 KiB  14.1 KiB  -2,080 B (-12.56%)
runfiles-stub-aarch64-macos        65.4 KiB  49.3 KiB  -16,560 B (-24.71%)
runfiles-stub-aarch64-windows.exe  27.0 KiB  25.0 KiB  -2,048 B (-7.41%)
runfiles-stub-s390x-linux          18.2 KiB  16.1 KiB  -2,120 B (-11.39%)
runfiles-stub-x86_64-linux         16.8 KiB  14.7 KiB  -2,120 B (-12.35%)
runfiles-stub-x86_64-macos         28.8 KiB  24.7 KiB  -4,128 B (-14.02%)
runfiles-stub-x86_64-windows.exe   28.0 KiB  26.0 KiB  -2,048 B (-7.14%)

Total: 21.24 MiB → 21.21 MiB (-31,104 B (-0.14%))

@malt3 malt3 merged commit fcad535 into main Jun 16, 2026
6 checks passed
@malt3 malt3 deleted the bump-allocator branch June 16, 2026 09:36