This repository (SwiftLM) is currently tightly coupled to local checkouts of its backend dependencies so that it can natively support the `baa-ai/GLM-5.1-RAM-270GB-MLX` model.
As of April 2026, the local workspace consists of three active work-in-progress (WIP) repositories. All three must be kept on the `glm5.1` branch for the SwiftLM engine to boot correctly.
- Path: `/Users/simba/SwiftLM`
- Branch: `glm5.1`
- Purpose: Inference API, terminal logging, and profiling benchmark orchestrator.
- Status:
  - Forced `Package.swift` to point to the local `../mlx-swift` and `./mlx-swift-lm` paths instead of resolving GitHub remotes, to allow cross-repo C++ debugging.
  - Updated `profile_runner.py` to pipe standard output for runtime overcommit monitoring.
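
The `profile_runner.py` change pipes the engine's standard output so it can be watched while a run is in progress. The following is a minimal sketch of that pattern, not the actual script: `stream_stdout`, `find_flagged`, and the `overcommit` marker string are hypothetical names chosen for illustration.

```python
import subprocess
import sys
from typing import Callable, List


def stream_stdout(cmd: List[str], on_line: Callable[[str], None]) -> int:
    """Run cmd, pass each line of its output to on_line, return the exit code."""
    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into the same stream
        text=True,
        bufsize=1,  # line-buffered on the reading side of the pipe
    ) as proc:
        assert proc.stdout is not None
        for line in proc.stdout:
            on_line(line.rstrip("\n"))
    # Leaving the context manager waits for the child, so returncode is set.
    return proc.returncode


def find_flagged(lines: List[str], marker: str) -> List[str]:
    """Return the lines that mention marker (case-insensitive).

    The marker is an assumption; use whatever string the engine really logs.
    """
    needle = marker.lower()
    return [line for line in lines if needle in line.lower()]
```

A caller could collect output with `lines = []`, run `stream_stdout([sys.executable, "-c", "print('hello')"], lines.append)`, and then pass `lines` to `find_flagged`. Passing a callback keeps the reader loop independent of how the monitoring decides what counts as an overcommit event.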
- Path: `/Users/simba/SwiftLM/mlx-swift-lm`
- Branch: `glm5.1`
- Purpose: Defines MLX graph execution (architecture mapping and Hugging Face Safetensors index bridging).
- Status:
  - Defines the `GLMMoeDSA` architecture and dense MLP `SwitchGLU` layers.
  - Uses aggressive parameter pruning during `sanitize()` to gracefully drop `self_attn.indexer` keys and execute the model using standard Multi Latent Attention (MLA).
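
The pruning step in `sanitize()` amounts to filtering the checkpoint's weight map by key name. The real implementation lives in the Swift code of `mlx-swift-lm`; the Python sketch below only illustrates the idea. The function name `drop_indexer_keys` and the example key names are hypothetical, and substring matching on `self_attn.indexer` is an assumption about how the keys are identified.

```python
from typing import Any, Dict

# Substring named in the status notes; matching by substring is an assumption.
INDEXER_FRAGMENT = "self_attn.indexer"


def drop_indexer_keys(weights: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the weight map without any indexer parameters."""
    return {
        name: tensor
        for name, tensor in weights.items()
        if INDEXER_FRAGMENT not in name
    }


# Hypothetical key names, used only to show which entries survive.
example = {
    "model.layers.0.self_attn.q_proj.weight": "tensor-a",
    "model.layers.0.self_attn.indexer.wk.weight": "tensor-b",
    "model.layers.0.mlp.gate_proj.weight": "tensor-c",
}
kept = drop_indexer_keys(example)
```

Here `kept` retains the `q_proj` and `gate_proj` entries and omits the indexer entry, so the remaining parameters can be loaded into a model that has no indexer module.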
- Path: `/Users/simba/SwiftLM/mlx-swift`
- Branch: `glm5.1`
- Purpose: MLX C bindings, kernel dispatch, and zero-copy NVMe SSD streaming.
- Status:
  - Unlocked unbounded memory chunk streaming in `LoadSSDExpert`. Removed the hardcoded `[SSDStreamer] Load length exceeds Pinned Buffer capacity` condition inside `ssd_streamer.mm` to allow streaming aggregated massive tensor blocks directly into Metal-configured Unified Memory.
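
Because all three checkouts have to sit on `glm5.1` together, it may help to check them in one pass before booting the engine. The sketch below is one possible way to do that, assuming `git` is on the `PATH`; the helper names `current_branch` and `off_branch` are hypothetical, and the paths are the ones listed in this document.

```python
import subprocess
from typing import Callable, Dict, Iterable

# Checkout paths and branch as listed in this document.
REPOS = [
    "/Users/simba/SwiftLM",
    "/Users/simba/SwiftLM/mlx-swift-lm",
    "/Users/simba/SwiftLM/mlx-swift",
]
EXPECTED_BRANCH = "glm5.1"


def current_branch(repo: str) -> str:
    """Ask git for the branch currently checked out in repo."""
    result = subprocess.run(
        ["git", "-C", repo, "rev-parse", "--abbrev-ref", "HEAD"],
        capture_output=True,
        text=True,
        check=True,  # raise if repo is not a git checkout
    )
    return result.stdout.strip()


def off_branch(
    repos: Iterable[str],
    expected: str,
    get_branch: Callable[[str], str] = current_branch,
) -> Dict[str, str]:
    """Map each repo that is NOT on the expected branch to its actual branch."""
    mismatches = {}
    for repo in repos:
        branch = get_branch(repo)
        if branch != expected:
            mismatches[repo] = branch
    return mismatches
```

Calling `off_branch(REPOS, EXPECTED_BRANCH)` returns an empty dictionary when every checkout is on `glm5.1`. The `get_branch` parameter exists so the check can be exercised without real repositories.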
Before deploying or returning SwiftLM to production (`main`):
- Wait for `SharpAI/mlx-swift` to successfully merge the `glm5.1` branch updates.
- Wait for `SharpAI/mlx-swift-lm` to merge the GLM-5.1 specific parameter logic.
- Update `Package.swift` here in the root to repoint to the latest GitHub release tags instead of local `../mlx-swift` paths.