Tags: collabora/pytorch
Pass triton kernel info to record function (pytorch#123871)

Summary: This diff passes triton kernel information, such as the kernel's Python file, kernel type, grid, and stream, to record_function. With this information, Execution Trace can capture the triton kernel and replay it in PARAM.

Test Plan: unit test
buck2 test mode/opt caffe2/test:profiler -- test_record_function_fast

Reviewed By: sraikund16

Differential Revision: D56021651
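For context, a minimal sketch of the user-visible mechanism this builds on: kernel metadata can be surfaced into a profiler trace through a `record_function` annotation. The kernel name, file, grid, and stream values below are illustrative, and `launch_kernel` is a hypothetical placeholder for an Inductor-generated triton launcher; the diff itself threads this metadata through the profiler's fast path rather than a string label.

```python
# Sketch only: show kernel info (file, grid, stream) in a profiler trace
# via a record_function annotation. launch_kernel is a hypothetical stub.
from torch.profiler import profile, record_function

def launch_kernel(grid, stream):
    pass  # placeholder for the actual triton kernel launch

with profile() as prof:
    # Encode the kernel info in the annotation so it appears in the trace
    # and can be recovered by Execution Trace / replay tooling.
    with record_function("triton_poi_fused_0: file=out.py, grid=(64,), stream=0"):
        launch_kernel(grid=(64,), stream=0)

print(prof.key_averages().table(sort_by="cpu_time_total"))
```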
Update on "Enable dynamo test_state_dict_deterministic" cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler amjames desertfire chauhang [ghstack-poisoned]
Update on "[compiled autograd][dynamo] Codegen aliases to keep grad m…
…utated tensors alive"
The current codegen is problematic if `__compiled_fn_0` clears the inputs list, since we still need it for the grad assignments afterwards:
```python
def forward(inputs):
    __compiled_fn_0 = ...  # The actual function needs to be provided
    graph_out_0 = __compiled_fn_0(inputs)  # clears inputs
    temp_list = []
    temp_list.append(graph_out_0[0])
    inputs[4].grad = graph_out_0[1]  # inputs is empty, raises IndexError
    inputs[7].grad = graph_out_0[2]
    inputs[8].grad = graph_out_0[3]
    inputs[9].grad = graph_out_0[3]
    del graph_out_0
    return temp_list
```
With this fix, we take aliases before the call to keep the tensors alive:
```python
def forward(inputs):
    __compiled_fn_0 = ...  # The actual function needs to be provided
    inputs_ref_1 = inputs[9]
    inputs_ref_2 = inputs[4]
    inputs_ref_3 = inputs[8]
    inputs_ref_4 = inputs[7]
    graph_out_0 = __compiled_fn_0(inputs)
    temp_list = []
    temp_list.append(graph_out_0[0])
    inputs_ref_2.grad = graph_out_0[1]
    inputs_ref_4.grad = graph_out_0[2]
    inputs_ref_3.grad = graph_out_0[3]
    inputs_ref_1.grad = graph_out_0[3]
    del graph_out_0
    return temp_list
```
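The fix relies on plain Python reference semantics: a name bound before the call keeps the object alive even after the compiled function clears `inputs`. A tiny standalone sketch, unrelated to the actual codegen and using a hypothetical `compiled_fn`, demonstrates this:

```python
# Toy stand-in for __compiled_fn_0: it consumes (clears) its input list.
def compiled_fn(inputs):
    outs = [x * 2 for x in inputs]
    inputs.clear()
    return outs

inputs = [1.0, 2.0, 3.0]
alias = inputs[1]  # alias taken before the call survives the clear
outs = compiled_fn(inputs)

# inputs[1] would now raise IndexError, but the alias is still valid.
assert len(inputs) == 0
assert alias == 2.0 and outs[1] == 4.0
```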
cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler amjames desertfire chauhang
[ghstack-poisoned]
Update on "[compiled autograd][dynamo] Codegen aliases to keep grad m…
…utated tensors alive"
The current codegen is problematic if __compiled_fn_0 clears the inputs list, since we need it for assignment afterwards
```python
def forward(inputs):
__compiled_fn_0 = ... # The actual function needs to be provided
graph_out_0 = __compiled_fn_0(inputs) # clears inputs
temp_list = []
temp_list.append(graph_out_0[0])
inputs[4].grad = graph_out_0[1] # inputs is empty, index error
inputs[7].grad = graph_out_0[2]
inputs[8].grad = graph_out_0[3]
inputs[9].grad = graph_out_0[3]
del graph_out_0
return temp_list
```
With this fix, we use aliases to keep the tensors alive
```python
def forward(inputs):
__compiled_fn_0 = ... # The actual function needs to be provided
inputs_ref_1 = inputs[9]
inputs_ref_2 = inputs[4]
inputs_ref_3 = inputs[8]
inputs_ref_4 = inputs[7]
graph_out_0 = __compiled_fn_0(inputs)
temp_list = []
temp_list.append(graph_out_0[0])
inputs_ref_2.grad = graph_out_0[1]
inputs_ref_4.grad = graph_out_0[2]
inputs_ref_3.grad = graph_out_0[3]
inputs_ref_1.grad = graph_out_0[3]
del graph_out_0
return temp_list
```
cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler amjames desertfire chauhang
[ghstack-poisoned]
Replaces all invocations of _linux-build.yml with _linux-build-label.yml
Update on "[WIP][Inductor Intel GPU backend Upstream] Reuse inductor … …test for Intel GPU (PART 1)" backend. cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 voznesenskym penguinwu EikanWang Guobing-Chen zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler amjames desertfire chauhang [ghstack-poisoned]
Enable UFMT on test/test_functionalization.py
Normalize remote/local cache names (pytorch#123914)

Summary: Pull Request resolved: pytorch#123914

Test Plan: ci

Differential Revision: D56027380