This project addresses the Compiler & Runtime challenge by implementing a highly optimized, massively parallel N-Body physics engine.
The goal was to demonstrate how to take a computationally expensive simulation (
- Massively Parallel Execution: Leveraged `rayon` to parallelize the force calculation step, saturating available CPU cores.
- Determinism Guarantee: Architected the simulation loop to separate Read (Force Calc) and Write (Integration) phases. This ensures that the parallel execution yields mathematically identical results to the serial execution (verified via unit tests).
- Observability: Integrated `tracing` for structured logging, allowing granular performance profiling of individual ticks.
- Cache Efficiency: Utilized Structure-of-Arrays (SoA) patterns and contiguous memory layouts to minimize cache misses during the hot loop.
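The Read/Write split and the SoA layout described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it uses `std::thread::scope` instead of `rayon` so that it has no dependencies, and the `Bodies` struct, the 2D simplification, the constants `G` and `SOFTENING`, and the function names are all assumptions made for the example.

```rust
use std::thread;

/// Structure-of-Arrays layout: one contiguous Vec per field.
/// (Hypothetical type; 2D only, to keep the sketch short.)
#[derive(Clone, Debug, PartialEq)]
pub struct Bodies {
    pub x: Vec<f64>,
    pub y: Vec<f64>,
    pub vx: Vec<f64>,
    pub vy: Vec<f64>,
    pub mass: Vec<f64>,
}

const G: f64 = 1.0; // gravitational constant in arbitrary units (assumption)
const SOFTENING: f64 = 1e-3; // avoids division by zero for close pairs (assumption)

/// Acceleration on body `i` from all other bodies. Reads state only.
fn accel_on(b: &Bodies, i: usize) -> (f64, f64) {
    let (mut ax, mut ay) = (0.0, 0.0);
    for j in 0..b.x.len() {
        if i == j {
            continue;
        }
        let dx = b.x[j] - b.x[i];
        let dy = b.y[j] - b.y[i];
        let d2 = dx * dx + dy * dy + SOFTENING * SOFTENING;
        let inv_d3 = 1.0 / (d2 * d2.sqrt());
        ax += G * b.mass[j] * dx * inv_d3;
        ay += G * b.mass[j] * dy * inv_d3;
    }
    (ax, ay)
}

/// Read phase, serial baseline.
pub fn accels_serial(b: &Bodies) -> Vec<(f64, f64)> {
    (0..b.x.len()).map(|i| accel_on(b, i)).collect()
}

/// Read phase, parallel: each thread fills a disjoint chunk of the output,
/// and every body sums over `j` in the same order as the serial version.
pub fn accels_parallel(b: &Bodies, threads: usize) -> Vec<(f64, f64)> {
    let n = b.x.len();
    let t = threads.max(1);
    let chunk = ((n + t - 1) / t).max(1);
    let mut out = vec![(0.0, 0.0); n];
    thread::scope(|s| {
        for (c, slice) in out.chunks_mut(chunk).enumerate() {
            s.spawn(move || {
                for (k, a) in slice.iter_mut().enumerate() {
                    *a = accel_on(b, c * chunk + k);
                }
            });
        }
    });
    out
}

/// Write phase: the only place state is mutated (semi-implicit Euler, an assumption).
pub fn integrate(b: &mut Bodies, acc: &[(f64, f64)], dt: f64) {
    for i in 0..acc.len() {
        b.vx[i] += acc[i].0 * dt;
        b.vy[i] += acc[i].1 * dt;
        b.x[i] += b.vx[i] * dt;
        b.y[i] += b.vy[i] * dt;
    }
}
```

Because no thread writes to shared state during the read phase and the per-body summation order does not depend on how work is chunked, `accels_serial` and `accels_parallel` should return equal vectors for the same input, which is the property a determinism unit test would check.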
Running on a MacBook Pro (M3 Pro), simulating 15,000 bodies:
| Mode | Execution Time (Avg/Tick) | Speedup |
|---|---|---|
| Serial (Baseline) | 325.41 ms | 1x |
| Parallel (Optimized) | 55.21 ms | ~5.9x |
Note: Parallel overhead prevents scaling at low body counts (<1,000). The system is tuned for high-load scenarios.
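For reference, the speedup column is simply the ratio of the two average tick times. A small sketch of that arithmetic (the `speedup` helper is a hypothetical name used only for illustration):

```rust
/// Speedup of the parallel run relative to the serial baseline,
/// given average time per tick in milliseconds for each mode.
pub fn speedup(serial_ms: f64, parallel_ms: f64) -> f64 {
    serial_ms / parallel_ms
}
```

With the figures from the table, `speedup(325.41, 55.21)` evaluates to roughly 5.89, which rounds to the ~5.9x reported.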
Parallel Mode (Fast):

    cargo run --release -- --mode parallel --count 15000

Serial Mode:

    cargo run --release -- --mode serial --count 15000