Add detailed triton kernel logging to tlparse #152197
base: gh/jamesjwu/140/base
Conversation
🔗 Helpful links: 🧪 see artifacts and rendered test results at hud.pytorch.org/pr/152197.
Note: links to docs will display an error until the docs builds have been completed.
❌ 2 new failures as of commit 795a706 with merge base 6efc572. The following jobs have failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@jamesjwu has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
autotune_cache_info["num_configs"] = len(configs)
if inductor_meta.get("coordinate_descent_tuning"):
    autotune_cache_info["coordesc_tuning"] = True
if len(configs) == 1:
basic question: given that we are logging the results of autotuning, what does it actually mean for there to be more than one config here? (shouldn't autotuning always end in a single config we can log?)
We're logging the compile-time "results": all the candidate configs we will need to autotune over when the function is actually called. We haven't run autotuning yet at this point, so there can be more than one config.
Autotuning runs later, after dynamo returns, in CachingAutotuner.benchmark_all_configs. There, it should be possible to log just the best config.
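As a rough illustration of that two-phase flow, here is a minimal sketch: at compile time only the set of candidate configs is known (so logging more than one is normal), and a single best config exists only after benchmarking at first call. All names here are illustrative stand-ins, not the real torch._inductor API.

```python
import time

class ToyAutotuner:
    """Toy analogue of the flow described above: candidate configs are
    known at compile time; the best config only after benchmarking."""

    def __init__(self, kernel_name, configs):
        self.kernel_name = kernel_name
        self.configs = configs       # all candidates, known at compile time
        self.best_config = None      # known only after benchmarking

    def compile_time_info(self):
        # Analogue of autotune_cache_info: we can only log the number of
        # candidates here, not a winner.
        return {"kernel": self.kernel_name, "num_configs": len(self.configs)}

    def benchmark_all_configs(self, run):
        # Analogue of CachingAutotuner.benchmark_all_configs: time each
        # candidate once and keep the fastest.
        timings = {}
        for cfg in self.configs:
            start = time.perf_counter()
            run(cfg)
            timings[cfg] = time.perf_counter() - start
        self.best_config = min(timings, key=timings.get)
        return self.best_config
```

With a tuner like this, a compile-time log entry would report `num_configs > 1`, while a post-benchmark log entry could report just `best_config`.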
@pytorchbot merge -i (Initiating merge automatically since Phabricator Diff has merged, merging with -i because oss signals were bypassed internally)
Merge started. Your change will be merged while ignoring the following 0 checks. Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
@pytorchbot merge -f -c "Buggy lints not running, checked that lints passed locally"
❌ 🤖 pytorchbot command failed:
@pytorchbot merge -f "Buggy lints not running, checked that lints passed locally"
The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
@pytorchmergebot revert -m 'failing python test/dynamo/test_structured_trace.py StructuredTraceTest.test_cudagraphs on trunk' -c nosignal
Failing test: dynamo/test_structured_trace.py::StructuredTraceTest::test_cudagraphs (GH job link, HUD commit link). cc @jamesjwu
@pytorchbot successfully started a revert job. Check the current status here. |
@jamesjwu your PR has been successfully reverted. |
This reverts commit 8303860. Reverted #152197 on behalf of https://github.com/wdvr due to failing python test/dynamo/test_structured_trace.py StructuredTraceTest.test_cudagraphs on trunk (comment).
Stack from ghstack (oldest at bottom):
This PR adds detailed logging to tlparse for each triton kernel we compile, including its autotune result. We collect these results in a global variable that is cleared after each triton kernel compile.
We can't keep these objects around after compile time, so unfortunately we can't record the autotune cache save or coordinate descent tuning, but we can log at least:
Example triton kernel info: https://gist.github.com/jamesjwu/493bdd0f36b0b7e3ca327f87bd6c2c75
See internal diff for an example log for internal model.
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov
Differential Revision: D73674443