[Don't merge]Upgrade submodule oneDNN to v3.7 (#147498)(Z7) #148163
base: main
This PR upgrades the oneDNN submodule to v3.7.

## Improvements

- Improved performance of convolution and matmul primitives on Intel Xeon processors with Intel AMX instruction set support (formerly Sapphire Rapids and Granite Rapids).
- Improved performance of int8 and fp32 forward convolution primitive on processors with Intel AVX2 instruction set support.
- Improved performance of fp8 matmul primitives with bf16 and fp16 bias data type on Intel Xeon processors with Intel AMX instruction set support (formerly Sapphire Rapids and Granite Rapids).
- Introduced initial optimizations for Intel GPUs based on Xe3 architecture.
- Added bfloat16 support for SDPA, implemented fp16 and bf16 gemm kernel in SDPA.
- Fixed issues with f16 matmul accuracy, SDPA not being dispatched to ukernel, bf16/fp16/fp32 conv performance, an INT8 kernel triggering a page fault, deconvolution precision on complex128 and fp64, and gemm correctness in float16.
- Improved bf16 matmul performance with fp32 destination with Arm Compute Library (ACL).
- Improved bf16 to fp32 reorder performance.
- Improved bf16 reorder performance.
- Improved bf16 convolution with ACL.

Fixes pytorch#136348.

## Validation results on CPU

1. NLP models accuracy/inference/training
2. Torchbench cpu userbenchmark inference & training
3. Inductor quantization
4. Dynamo benchmarks

## Validation results on XPU

Accuracy is the same as baseline. Performance is shown below.

## Validation results on ARM

Pull Request resolved: pytorch#147498
Approved by: https://github.com/fadara01, https://github.com/mingfeima, https://github.com/atalman
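
To confirm locally that a build actually picked up the upgraded submodule, one option is to read the oneDNN version out of the build configuration string that `torch.__config__.show()` returns. The sketch below is only an illustration and is not part of this PR: the helper name `parse_onednn_version` is hypothetical, and the line format it matches (for example `Intel(R) MKL-DNN v3.7.1 (Git Hash ...)`) is an assumption about how the build reports the library.

```python
import re
from typing import Optional, Tuple

# Assumption: the build config reports oneDNN on a line such as
# "  - Intel(R) MKL-DNN v3.7.1 (Git Hash ...)". Both the "MKL-DNN" and
# "oneDNN" spellings are accepted here to be lenient.
_ONEDNN_VERSION = re.compile(r"(?:MKL-DNN|oneDNN)\s+v(\d+)\.(\d+)(?:\.(\d+))?")


def parse_onednn_version(config_text: str) -> Optional[Tuple[int, int, int]]:
    """Return (major, minor, patch) found in config_text, or None if absent.

    Hypothetical helper for illustration; a missing patch number becomes 0.
    """
    match = _ONEDNN_VERSION.search(config_text)
    if match is None:
        return None
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed; nothing to check")
    else:
        version = parse_onednn_version(torch.__config__.show())
        print("bundled oneDNN version:", version)
        print("at least v3.7:", version is not None and version >= (3, 7, 0))
```

Comparing the parsed tuple against `(3, 7, 0)` keeps the check independent of the exact patch release that ends up bundled.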
Helpful Links
See artifacts and rendered test results at hud.pytorch.org/pr/148163
Note: Links to docs will display an error until the docs builds have been completed.
1 New Failure, 1 Unrelated Failure
As of commit 081d7ff with merge base b533bb4:
NEW FAILURE - The following job has failed:
FLAKY - The following job failed but was likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as
cc @gujinghui @PenghuiCheng @XiaobingSuper @jianyuh @jgong5 @mingfeima @sanchitintel @ashokei @jingxu10 @min-jean-cho @yanbing-j @Guobing-Chen @Xia-Weiwen @snadampal @voznesenskym @penguinwu @EikanWang @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov