Profile lazy backward compilation with cProfile#186214

Draft
jansel wants to merge 1 commit into gh/jansel/1332/base from gh/jansel/1332/head

@jansel jansel commented Jun 4, 2026

Stack from ghstack (oldest at bottom):

TORCH_COMPILE_CPROFILE only wrapped Dynamo's forward frame compilation path via _compile_inner. Lazy AOTAutograd backward lowering restores the saved CompileContext when backward() first needs to lower the backward graph, but then called the backward compiler directly. That meant a valid compile trace id was present, yet no cProfile profile was produced for the backward compiler.

Move the shared cProfile wrapper into torch._dynamo.profiler so non-convert_frame compile paths can reuse it, keep convert_frame aliases for existing internal imports, and wrap the lazy backward compiler call while the saved CompileContext is active. The wrapper is only applied when Dynamo cProfile is enabled and a trace id exists.

The alternative was to add a local cProfile implementation in AOTAutograd, but sharing the existing helper keeps forward and backward profile naming, logging, SVG generation, and fb-only upload behavior consistent.

Fixes #137996
Generated by my agent

Test Plan:

  • TMPDIR=$(mktemp -d) TORCH_COMPILE_CPROFILE=1 python - <<'PY' ... torch.compile(fn, backend='inductor'); y.backward(); list profile files ... PY
  • python test/dynamo/test_profiler.py -k test_cprofile_profiles_lazy_backward_compile
  • python test/dynamo/test_profiler.py
  • PYTHONDONTWRITEBYTECODE=1 python test/dynamo/test_profiler.py -k test_cprofile_profiles_lazy_backward_compile -v
  • git diff --check
  • git diff --cached --check
  • lintrunner -a

Note: python test/inductor/test_torchinductor.py -k check_stack_no_cycles -v was attempted but blocked during import by a local torchvision::nms registration mismatch.

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @kadeng @chauhang @amjames @jataylo @azahed98

[ghstack-poisoned]
pytorch-bot Bot added the ciflow/inductor, ciflow/torchtitan, and module: dynamo labels Jun 4, 2026
pytorch-bot Bot commented Jun 4, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/186214

Note: Links to docs will display an error until the docs builds have been completed.

⏳ 54 Pending, 1 Unrelated Failure

As of commit 16b19e6 with merge base 16de8e3 (image):

BROKEN TRUNK - The following job failed but was also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot Bot commented Jun 4, 2026

This PR needs a release notes: label

If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.


jansel commented Jun 4, 2026

@claude review these changes

claude Bot commented Jun 4, 2026

Claude finished @jansel's task in 1m 6s


Review of PR #186214: Profile lazy backward compilation with cProfile

  • Gather context and read changed files
  • Review torch/_dynamo/profiler.py changes
  • Review torch/_dynamo/convert_frame.py changes
  • Review torch/_functorch/_aot_autograd/runtime_wrappers.py changes
  • Review test/dynamo/test_profiler.py changes
  • Post review feedback

Overall this is a clean refactor — moving the cProfile logic to a shared location and wiring it into lazy backward compilation. The code is well-structured and the test covers the intended behavior. A few observations:

torch/_dynamo/profiler.py

  1. _callable_name handles functools.partial but not other wrappers: The new helper unwraps partial, but the backward use-site passes a local compile_backward closure whose __name__ is "compile_backward". The profile is therefore named compile_backward_<trace_id>.profile, which the test asserts. Note that the filename carries the local function name, not the name of the actual backend compiler. That seems fine for discoverability, since "backward" still appears in the name.

  2. Missing space in concatenated warning string (pre-existing, lines 120-121):

    "Failed to generate SVG from profile -- dumping stats instead."
    "Try installing gprof2dot and dot for a better visualization"

    These two string literals are implicitly concatenated with no space/newline between "instead." and "Try". This is a pre-existing issue that was just moved, so not blocking.
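
The concatenation behavior noted in item 2 can be reproduced in isolation. The snippet below is a minimal illustration, not code from the PR: the variable names are made up, and a trailing space is only one possible fix.

```python
# Adjacent string literals are joined at compile time with nothing in between.
message_without_space = (
    "Failed to generate SVG from profile -- dumping stats instead."
    "Try installing gprof2dot and dot for a better visualization"
)

# One possible fix: end the first literal with a space.
message_with_space = (
    "Failed to generate SVG from profile -- dumping stats instead. "
    "Try installing gprof2dot and dot for a better visualization"
)

print(message_without_space)
print(message_with_space)
```

The first message runs the two sentences together as "instead.Try", while the second keeps them separated.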

torch/_dynamo/convert_frame.py

  1. Module-level re-exports (lines 203-204):
    cprofile_wrapper = dynamo_profiler.cprofile_wrapper
    maybe_cprofile = dynamo_profiler.maybe_cprofile
    Good — this preserves backward compatibility for any internal imports from convert_frame without aliasing the whole module.

torch/_functorch/_aot_autograd/runtime_wrappers.py

  1. Wrapping logic (lines 2856-2862):

    def compile_backward() -> Callable[..., Any]:
        return bw_compiler(copy.deepcopy(bw_module), placeholder_list)
    
    if dynamo_config.cprofile and CompileContext.current_trace_id() is not None:
        compile_backward = cprofile_wrapper(compile_backward)
    
    self.compiled_bw = compile_backward()

    This is well-guarded — the CompileContext.current_trace_id() is not None check prevents the AssertionError in cprofile_wrapper from firing when there's no trace context. The wrapper is applied only when both conditions are met. Clean.

  2. Minor thought on the closure: The compile_backward closure captures bw_compiler, bw_module, and placeholder_list from the enclosing scope. Since cprofile_wrapper calls func(*args, **kwargs) (with no args in this case), the profiled function is the zero-arg closure that calls the actual compiler. This means the profile captures everything inside the backward compiler — which is exactly what's wanted.
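
The guard-then-wrap pattern discussed above can be sketched with the standard library alone. This is a simplified illustration under stated assumptions, not the implementation in torch._dynamo.profiler: callable_name, profile_call, fake_bw_compiler, lazy_compile, and the "trace0" id are all hypothetical names, and the real wrapper also handles logging and SVG generation.

```python
import cProfile
import functools
import os
import pstats
import tempfile
from typing import Any, Callable, Optional


def callable_name(func: Callable[..., Any]) -> str:
    # Hypothetical helper: peel off functools.partial layers, then read __name__.
    while isinstance(func, functools.partial):
        func = func.func
    return getattr(func, "__name__", type(func).__name__)


def profile_call(func: Callable[..., Any], trace_id: str, out_dir: str) -> Callable[..., Any]:
    # Hypothetical wrapper: profile every call to func and dump the stats to
    # <out_dir>/<callable name>_<trace_id>.profile.
    path = os.path.join(out_dir, f"{callable_name(func)}_{trace_id}.profile")

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        prof = cProfile.Profile()
        try:
            return prof.runcall(func, *args, **kwargs)
        finally:
            prof.dump_stats(path)

    return wrapper


def fake_bw_compiler(graph: str, inputs: list) -> Callable[[], int]:
    # Stand-in for a real backward compiler.
    return lambda: sum(inputs)


def lazy_compile(enabled: bool, trace_id: Optional[str], out_dir: str) -> Callable[[], int]:
    # Zero-arg closure, so the profile covers everything the compiler does.
    def compile_backward() -> Callable[[], int]:
        return fake_bw_compiler("bw_graph", [1, 2, 3])

    # Wrap only when profiling is enabled and a trace id is available.
    if enabled and trace_id is not None:
        compile_backward = profile_call(compile_backward, trace_id, out_dir)

    return compile_backward()


with tempfile.TemporaryDirectory() as out_dir:
    compiled = lazy_compile(True, "trace0", out_dir)
    profile_path = os.path.join(out_dir, "compile_backward_trace0.profile")
    profiled_functions = {name for (_, _, name) in pstats.Stats(profile_path).stats}
    print(compiled(), "fake_bw_compiler" in profiled_functions)
```

Because the closure is what gets profiled, the stats file is named after compile_backward while the recorded calls include the stand-in compiler it invokes, which mirrors the naming observation made for torch/_dynamo/profiler.py.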

test/dynamo/test_profiler.py

  1. Test structure: The test correctly patches subprocess.Popen and subprocess.check_call to avoid requiring gprof2dot/dot in CI, uses tempfile.TemporaryDirectory with patch.object(tempfile, "tempdir", tmpdir) to control profile output location, and verifies both the count and naming prefix of generated profiles. Solid.

  2. One edge case: The test asserts len(new_profiles) == 1, which means it expects only a backward profile (no forward profile in new_profiles). This works because profiles_before captures the forward profile(s) generated during fn(...). If the forward compile didn't generate a profile (e.g., if the maybe_cprofile decorator wasn't applied to _compile_inner), this would still pass since we're only checking what's new after backward(). The logic is correct.
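
The before/after snapshot technique described in these two points can be sketched as follows. This is an assumption-laden illustration rather than the test in the PR: write_profile is a hypothetical stand-in for a compile step that writes a profile, and the file names are invented.

```python
import os
import tempfile
from typing import Set
from unittest.mock import patch


def write_profile(name: str) -> str:
    # Hypothetical stand-in for a compile step that writes a profile file
    # into whatever directory tempfile currently reports.
    path = os.path.join(tempfile.gettempdir(), f"{name}.profile")
    with open(path, "w") as f:
        f.write("stats")
    return path


def new_profiles_after_backward() -> Set[str]:
    # Redirect tempfile.gettempdir() to a throwaway directory for the test.
    with tempfile.TemporaryDirectory() as tmpdir, patch.object(tempfile, "tempdir", tmpdir):
        write_profile("forward_trace0")            # stands in for fn(...)
        profiles_before = set(os.listdir(tmpdir))  # snapshot after forward
        write_profile("compile_backward_trace0")   # stands in for backward()
        profiles_after = set(os.listdir(tmpdir))
        return profiles_after - profiles_before


print(new_profiles_after_backward())
```

Taking the snapshot after the forward step means the set difference contains only files created by the backward step, so the assertion on the count holds whether or not the forward step wrote anything.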

Summary

The PR is well-motivated, the implementation is clean, and the test covers the key behavior. No blocking issues found. The refactoring makes it easy for future compile paths to opt into cProfile profiling. LGTM.
