Profile lazy backward compilation with cProfile #186214
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/186214
Note: Links to docs will display an error until the docs builds have been completed.
⏳ 54 Pending, 1 Unrelated Failure
As of commit 16b19e6 with merge base 16de8e3:
BROKEN TRUNK - The following job failed but was present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures
UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@claude review these changes
Claude finished @jansel's task in 1m 6s
Review of PR #186214: Profile lazy backward compilation with cProfile
Overall this is a clean refactor: it moves the cProfile logic to a shared location and wires it into lazy backward compilation. The code is well-structured and the test covers the intended behavior. A few observations:
Stack from ghstack (oldest at bottom):
TORCH_COMPILE_CPROFILE only wrapped Dynamo's forward frame compilation path via _compile_inner. Lazy AOTAutograd backward lowering restored the saved CompileContext when backward() first needed to lower the backward graph, but then called the backward compiler directly. As a result, a valid compile trace id was present, yet no cProfile profile was produced for the backward compiler.
Move the shared cProfile wrapper into torch._dynamo.profiler so non-convert_frame compile paths can reuse it, keep convert_frame aliases for existing internal imports, and wrap the lazy backward compiler call while the saved CompileContext is active. The wrapper is only applied when Dynamo cProfile is enabled and a trace id exists.
The alternative was to add a local cProfile implementation in AOTAutograd, but sharing the existing helper keeps forward and backward profile naming, logging, SVG generation, and fb-only upload behavior consistent.
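As a rough illustration of the gating described above (profile only when cProfile is enabled and a trace id exists, otherwise call the compiler directly), here is a minimal standalone sketch. It is not the code in this PR: `maybe_cprofile`, `fake_bw_compiler`, the `enabled` and `trace_id` parameters, and the output file name are all hypothetical, and it leaves out the logging, SVG generation, and upload behavior of the real helper.

```python
import cProfile
import os
import tempfile
from typing import Any, Callable, Optional


def maybe_cprofile(
    fn: Callable[..., Any],
    *args: Any,
    enabled: bool,
    trace_id: Optional[str],
    out_dir: str = ".",
    **kwargs: Any,
) -> Any:
    """Run fn under cProfile only if profiling is enabled and a trace id exists.

    Hypothetical helper for illustration; not the torch._dynamo.profiler API.
    """
    if not enabled or trace_id is None:
        # Fast path: no profiling overhead, call the compiler directly.
        return fn(*args, **kwargs)

    prof = cProfile.Profile()
    try:
        # runcall enables the profiler, calls fn, and disables it afterwards.
        return prof.runcall(fn, *args, **kwargs)
    finally:
        # Hypothetical naming scheme: one stats file per trace id.
        prof.dump_stats(os.path.join(out_dir, f"compile_{trace_id}.profile"))


def fake_bw_compiler(graph: list) -> list:
    """Stand-in for a backward compiler; it just upper-cases node names."""
    return [node.upper() for node in graph]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        compiled = maybe_cprofile(
            fake_bw_compiler, ["mul", "sum"],
            enabled=True, trace_id="0_0", out_dir=tmp,
        )
        print(compiled, os.listdir(tmp))
```

The stats file written by `dump_stats` can be inspected afterwards with the standard `pstats` module. Writing the file in a `finally` block means a profile is still saved if the wrapped compiler raises, which is a design choice of this sketch rather than a claim about the PR.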
Fixes #137996
Generated by my agent
Test Plan:
Note: `python test/inductor/test_torchinductor.py -k check_stack_no_cycles -v` was attempted but was blocked during import by a local torchvision::nms registration mismatch.
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @kadeng @chauhang @amjames @jataylo @azahed98