Open
Description
I'm experimenting with the dual-core Rocket design on Arcilator, running it on both an RTX 4090 and an RTX 4080. So far, the simulator tops out at around 50 to 57 k cycles/s. My profiling suggests that the large number of logic levels (deep combinational paths) is the primary bottleneck. I'm considering collapsing the AIG netlist into 6-input LUT cells, similar to how GEM packs SRAM LUTs, to shorten the logic depth. If this works, I estimate performance could reach roughly 100 to 200 k cycles/s, which would make the GPU flow much more competitive with FPGA-based emulators.
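To make the depth argument concrete, here is a minimal sketch of how collapsing an AIG into K-input LUTs shortens the level count. It uses plain cut enumeration on a toy netlist and is not the mapper used by Arcilator or GEM. The names `aig_depth` and `lut_depth` and the dictionary-based netlist format are hypothetical, inverters are ignored because they do not add levels, and the exhaustive cut set would need pruning (for example priority cuts) on a design the size of Rocket.

```python
from itertools import product

def aig_depth(ands, inputs):
    """Levels of 2-input AND nodes; `ands` must be in topological order."""
    depth = {i: 0 for i in inputs}
    for node, (a, b) in ands.items():
        depth[node] = 1 + max(depth[a], depth[b])
    return depth

def lut_depth(ands, inputs, k=6):
    """Levels after covering the AIG with k-input LUTs (exhaustive cuts)."""
    cuts = {i: [frozenset([i])] for i in inputs}
    depth = {i: 0 for i in inputs}
    for node, (a, b) in ands.items():
        # Merge every fanin cut pair and keep those that fit in one LUT.
        merged = {ca | cb
                  for ca, cb in product(cuts[a], cuts[b])
                  if len(ca | cb) <= k}
        # A LUT rooted here sits one level above its deepest leaf.
        depth[node] = min(1 + max(depth[leaf] for leaf in cut)
                          for cut in merged)
        # The trivial cut lets fanouts treat this node as a LUT boundary.
        cuts[node] = [frozenset([node])] + list(merged)
    return depth

# Toy netlist: an 8-input AND built as a linear chain of 2-input ANDs.
inputs = [f"a{i}" for i in range(8)]
ands = {}
prev = "a0"
for i in range(1, 8):
    ands[f"n{i}"] = (prev, f"a{i}")
    prev = f"n{i}"

print("AIG levels:", aig_depth(ands, inputs)["n7"])
print("6-LUT levels:", lut_depth(ands, inputs)["n7"])
```

On this chain the level count drops from 7 to 2. A linear chain is a best case, so real netlists will shrink less, and cycles/s will not necessarily scale linearly with the reduction in depth.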