A real-time music generation plugin for Apple Silicon Macs, built on the Magenta RealTime model stack (SpectroStream + MusicCoCa + Depthformer) and an optimized C++ MLX runtime.
Minimum macOS: 26.0 (matches the bundled libmlx.dylib)
StyleStreamer generates continuous music audio in configurable chunks conditioned on up to four text style prompts, blended by weight. Chunk length is adjustable from 40 ms (one codec frame) up to 2 seconds; the default is 400 ms, balancing generation overhead against playback latency. You drag style cards between a live mixing row and a scrollable card bank, edit prompts inline, and copy/paste the full mix state as base64 JSON for external storage. A Downtempo toggle plays generated audio at a reduced effective rate (32 kHz source at 44.1 kHz playback), producing a slower, lower-pitched texture with a smooth sigmoid ramp on toggle to avoid crackles.
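The README does not say how the weighted prompts are combined internally. As a rough sketch, assuming the blend is a normalized weighted average of the per-prompt style embeddings, it could look like the following. `WeightedStyle` and `blendStyles` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical type: one style card's embedding plus its mix weight.
struct WeightedStyle {
    std::vector<float> embedding;  // e.g. the 768 values from the style encoder
    float weight;                  // non-negative mix weight
};

// Assumption: blending is a weighted average with weights normalized to sum to 1.
// If every weight is zero, the styles are averaged equally.
std::vector<float> blendStyles(const std::vector<WeightedStyle>& styles) {
    assert(!styles.empty() && styles.size() <= 4);  // up to four prompts
    const std::size_t dim = styles.front().embedding.size();

    float total = 0.0f;
    for (const auto& s : styles) {
        assert(s.embedding.size() == dim && s.weight >= 0.0f);
        total += s.weight;
    }

    std::vector<float> out(dim, 0.0f);
    for (const auto& s : styles) {
        const float w = total > 0.0f
                            ? s.weight / total
                            : 1.0f / static_cast<float>(styles.size());
        for (std::size_t i = 0; i < dim; ++i) {
            out[i] += w * s.embedding[i];
        }
    }
    return out;
}
```

With weights 3 and 1, the first prompt contributes 75% of the blended embedding and the second 25%.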
| Component | Purpose | In / Out |
|---|---|---|
| SpectroStream | Audio codec | Stereo audio ↔ 64-RVQ discrete tokens at 25 Hz |
| MusicCoCa | Style embedding | Text prompts → 768-dim embedding (12 RVQ) |
| Depthformer | Token generation | Context + style tokens → next-chunk tokens |
The generation loop produces audio chunks (40 ms – 2 s, default 400 ms) conditioned on 10 seconds of context. The C++ MLX runtime runs all three components natively on Apple Silicon via Metal.
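The chunk and context sizes above map to codec frames by simple arithmetic, since one frame covers 40 ms at 25 Hz. The sketch below uses hypothetical helper names and assumes a requested length is clamped to the supported range and rounded down to whole frames; the plugin may round differently.

```cpp
#include <algorithm>
#include <cassert>

constexpr int kFrameMs = 40;        // one codec frame at 25 Hz
constexpr int kMinChunkMs = 40;     // shortest chunk: one frame
constexpr int kMaxChunkMs = 2000;   // longest chunk: 2 s
constexpr int kContextMs = 10000;   // 10 s of conditioning context

// Assumption: clamp to the supported range, then round down to whole frames.
int chunkFrames(int requestedMs) {
    assert(requestedMs > 0);
    const int clamped = std::clamp(requestedMs, kMinChunkMs, kMaxChunkMs);
    return clamped / kFrameMs;
}

// Number of codec frames in the context window.
int contextFrames() {
    return kContextMs / kFrameMs;
}
```

With the default 400 ms chunk this gives 10 frames per chunk against 250 frames of context.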
    magenta-rt-rewrite/
      vendor/
        magenta-realtime-mlx-cpp/   # Optimized C++ MLX runtime (submodule)
      packages/
        magenta-rt-juce/            # StyleStreamer JUCE plugin
          source/                   # PluginProcessor, PluginEditor, engine, UI
          tests/                    # Catch2 unit + snapshot tests
          assets/                   # SVG card backgrounds, images, data
          cmake/                    # Pamplejuce CMake helpers
          JUCE/                     # JUCE framework (submodule)
          modules/                  # JUCE modules (melatonin-inspector, CLAP, etc.)
      scripts/
        bundle_juce_standalone_dylibs_macos.sh
        generate_juce_packaging_icon.sh
      Makefile
      CHANGELOG.md
- macOS 15.0+ (Sequoia), Apple Silicon
- Xcode command-line tools (xcode-select --install)
- CMake ≥ 3.25 and Ninja (brew install cmake ninja)
- Homebrew sentencepiece and portaudio (runtime dylibs)

Install the Homebrew dependencies:

    brew install cmake ninja sentencepiece portaudio

Clone the repository with its submodules:

    git clone --recurse-submodules <repo-url>
    cd magenta-rt-rewrite

Build with make:

    # Build standalone app (Release)
    make juce-build
    # Build and launch
    make juce-run

Or manually via CMake:

    cmake -S packages/magenta-rt-juce -B packages/magenta-rt-juce/build-mlx \
      -G Ninja -DCMAKE_BUILD_TYPE=Release
    cmake --build packages/magenta-rt-juce/build-mlx --target StyleStreamerJuce_Standalone

Use CMAKE_BUILD_TYPE=Debug make juce-build for a debug build.

StyleStreamer downloads model weights from Hugging Face on first use. Click
Download weights in the Advanced… window — the app installs
huggingface_hub into an isolated temporary Python venv (nothing touches system
Python), downloads the snapshot, and fills the weights field with the resolved
path.
Weight search order: MRT_JUCE_WEIGHTS_DIR env var → bundled/manual
model-weights folders → standard ~/.cache/huggingface/hub snapshot →
repo .weights-cache/ fallback.
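The search order above amounts to "first existing directory wins". The following is a minimal sketch of that idea, not the plugin's actual code: `firstExistingDir` and `weightCandidates` are hypothetical names, the bundled/manual folder locations are passed in because the README does not list them, and the Hugging Face entry stops at the hub directory rather than resolving a specific snapshot.

```cpp
#include <cassert>
#include <cstdlib>
#include <filesystem>
#include <optional>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Returns the first candidate that exists as a directory, or nullopt if none does.
std::optional<fs::path> firstExistingDir(const std::vector<fs::path>& candidates) {
    for (const auto& c : candidates) {
        std::error_code ec;  // non-throwing overload: unreadable paths count as missing
        if (!c.empty() && fs::is_directory(c, ec)) {
            return c;
        }
    }
    return std::nullopt;
}

// Builds the candidate list in the documented priority order.
// bundledDirs and repoRoot are assumptions supplied by the caller.
std::vector<fs::path> weightCandidates(const std::vector<fs::path>& bundledDirs,
                                       const fs::path& repoRoot) {
    assert(!repoRoot.empty());  // the final fallback needs a repo root
    std::vector<fs::path> out;
    if (const char* env = std::getenv("MRT_JUCE_WEIGHTS_DIR")) {
        out.emplace_back(env);                                   // 1. env override
    }
    out.insert(out.end(), bundledDirs.begin(), bundledDirs.end());  // 2. bundled/manual
    if (const char* home = std::getenv("HOME")) {
        out.push_back(fs::path(home) / ".cache" / "huggingface" / "hub");  // 3. HF cache
    }
    out.push_back(repoRoot / ".weights-cache");                  // 4. repo fallback
    return out;
}
```

Calling `firstExistingDir(weightCandidates(bundled, repoRoot))` then yields the directory to load from, or nothing if the weights still need to be downloaded.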
To pre-populate the cache from the command line:

    make ensure-weights-cache

To build a distributable archive:

    make juce-dist

This produces dist/stylestreamer-<VERSION>-macos-arm64.zip. The archive unpacks to
StyleStreamer.app/ at the archive root. The bundle includes:

- Contents/Frameworks/libmlx.dylib and mlx.metallib (from the repo .venv or MLX_ROOT)
- Homebrew sentencepiece and portaudio dylibs
- install_name_tool rewrites and an ad-hoc codesign, so dyld does not halt with "Code Signature Invalid" on the recipient's machine
- A packaged Hugging Face weight helper under Contents/Resources/weight-helper/

Run the tests from the repository root:

    # Build and run all unit tests
    cmake --build packages/magenta-rt-juce/build-mlx --target Tests
    ./packages/magenta-rt-juce/build-mlx/Tests/Tests
    # Run a specific tag
    ./packages/magenta-rt-juce/build-mlx/Tests/Tests "[processor-audio]"
    # Capture a PNG snapshot of the plugin editor (requires MRT_JUCE_UI_SNAPSHOT=1)
    MRT_JUCE_UI_SNAPSHOT=1 ./packages/magenta-rt-juce/build-mlx/Tests/Tests "[ui]"

Snapshot images are written to output/<date>-juce-editor-snapshot/plugin-editor.png
(default 2× scale; set MRT_JUCE_UI_SNAPSHOT_SCALE=1 for 900×900).

The C++ MLX runtime (vendor/magenta-realtime-mlx-cpp) employs several
optimizations to approach real-time throughput on Apple Silicon:

- Fused Metal attention: mx::fast::scaled_dot_product_attention fuses the attention matmuls (QKᵀ and the weighted sum over V) and the softmax into a single Metal kernel across the transformer stacks (SpectroStream encoder, Depthformer encoder, temporal and depth decoders).
- mx::compile on fixed-shape paths: the Depthformer encoder and selected decode steps are compiled via mx::compile, reducing Metal kernel dispatch overhead significantly for the 800-step autoregressive loop.
- MLXFN precompiled function bundles: when enabled (default), the runtime loads a precompiled mlxfn/ bundle (encode + depth + temporal) that bypasses Python-level graph tracing on every chunk, giving the largest single latency win. The Advanced… window reports an error on load if the bundle is missing for the chosen tag/dtype.
- Speculative depth decoding: for chunks after the first, draft tokens are built from the previous chunk's tokens at the same frame position (exploiting temporal continuity in music), verified in a single causal forward pass, and accepted up to the first mismatch. The expected acceptance rate is 50–75% per depth token, which reduces the number of sequential depth steps per frame.
- Vocab masks pre-evaluated: per-RVQ-level codec token masks are computed once before the generation loop and held as MLX arrays, avoiding lazy-graph bloat during autoregressive decode.
- Minimized mx::eval calls: eval boundaries are placed only where a concrete value is required for the next input (e.g., sampled token IDs), not after every operation.

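The "accepted up to the first mismatch" rule in speculative depth decoding can be illustrated without MLX. The sketch below is only an illustration of that acceptance rule on plain integer token IDs: the function names are hypothetical, and it assumes the verification pass yields one token per draft position.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Number of leading draft tokens that the verification pass agrees with.
// Everything from the first mismatch onward is rejected.
std::size_t acceptedPrefixLength(const std::vector<int>& draft,
                                 const std::vector<int>& verified) {
    assert(draft.size() == verified.size());
    const auto firstDiff = std::mismatch(draft.begin(), draft.end(),
                                         verified.begin(), verified.end());
    return static_cast<std::size_t>(firstDiff.first - draft.begin());
}

// Depth steps that still have to be decoded one at a time after acceptance.
std::size_t remainingSequentialSteps(const std::vector<int>& draft,
                                     const std::vector<int>& verified) {
    return draft.size() - acceptedPrefixLength(draft, verified);
}
```

For a draft of four tokens where the third disagrees with the verifier, two tokens are accepted and two depth steps remain to be decoded sequentially.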
Brian Cruz (@rhymeswithlion)
StyleStreamer source code is licensed under the Apache License, Version 2.0.
This project incorporates or links against several third-party components with their own licenses. See NOTICE for the full list, including:
- Magenta RealTime model architecture and weights: Apache 2.0 / CC-BY 4.0 (Google)
- JUCE framework: AGPL v3 (open source) or JUCE 8 Commercial Licence
- Apple MLX (libmlx.dylib): MIT License (Apple Inc.)
- SentencePiece: Apache 2.0 (Google)
- PortAudio: MIT License