[Quant][X86] add an op to compute uint8 batch norm 2d #152811

Xia-Weiwen · 2025-05-05T08:12:10Z

Summary
This PR adds a new op, onednn.qbatch_norm2d, which accepts uint8 inputs on the CPU device (instead of QuantizedCPU).
The new op is implemented with AVX512 instructions and provides performance similar to that of its QuantizedCPU counterpart, quantized.batch_norm2d.
Besides uint8, the new op supports fp32, fp16 and bf16 as output dtypes.
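
For readers unfamiliar with quantized batch norm, the per-element arithmetic that such an op has to perform can be sketched in scalar C++ as follows. This is only an illustrative reference under stated assumptions, not the AVX512 kernel from this PR: the struct `ChannelParams`, the function names, and the parameter order are hypothetical, and the op's real signature is not shown in this conversation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Per-channel batch norm statistics and affine parameters.
// The struct and its field names are hypothetical, chosen for this sketch only.
struct ChannelParams {
  float weight;  // gamma
  float bias;    // beta
  float mean;    // running mean
  float var;     // running variance
};

// Dequantize one uint8 value and apply inference-mode batch norm, yielding fp32.
inline float batch_norm_u8_to_f32(std::uint8_t x, float in_scale,
                                  std::int32_t in_zero_point,
                                  const ChannelParams& p, float eps) {
  const float x_fp =
      static_cast<float>(static_cast<std::int32_t>(x) - in_zero_point) * in_scale;
  const float inv_std = 1.0f / std::sqrt(p.var + eps);
  return (x_fp - p.mean) * inv_std * p.weight + p.bias;
}

// Requantize an fp32 value to uint8: scale, round to nearest (ties to even
// under the default rounding mode), add the zero point, saturate to [0, 255].
inline std::uint8_t requantize_to_u8(float y, float out_scale,
                                     std::int32_t out_zero_point) {
  const float q = std::nearbyint(y / out_scale) + static_cast<float>(out_zero_point);
  return static_cast<std::uint8_t>(std::min(255.0f, std::max(0.0f, q)));
}
```

With a uint8 output, both steps run per element; with a floating-point output, only the first step is needed. How the real kernel converts to fp16 or bf16 is not described in this PR conversation.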

Test plan

pytest test/quantization/core/test_quantized_op.py -k test_int8_batch_norm_onednn

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @jerryzh168

pytorch-bot · 2025-05-05T08:12:14Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/152811

✅ No Failures

As of commit ab2a1c8 with merge base f5e0806:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

jgong5 · 2025-05-08T15:36:19Z

aten/src/ATen/native/quantized/cpu/Normalization.cpp

@@ -407,4 +489,8 @@ TORCH_LIBRARY_IMPL(quantized, QuantizedCPU, m) {
  m.impl(TORCH_SELECTIVE_NAME("quantized::batch_norm3d_relu"), TORCH_FN(q_batch_norm3d_impl<true>));
 }

+TORCH_LIBRARY_IMPL(onednn, CPU, m) {
+  m.impl(TORCH_SELECTIVE_NAME("onednn::qbatch_norm2d"), TORCH_FN(int8_batch_norm2d_cpu_impl));


Hmm, even though the op is put under the "onednn" namespace, it seems the implementation does not rely on oneDNN at all?

Xia-Weiwen · May 9, 2025

Yes, that's true. All the ops used for PT2E quantization are put under the onednn namespace, and they should only be used with the X86Inductor backend. We decided to keep that practice. Do you have any concerns? Thanks.

jgong5 · May 9, 2025

Perhaps we can separate the op namespace from the quantization backend name, such as "onednn". For an op, "onednn" would mean that the implementation relies on the oneDNN library, so we don't have to put other quantized ops under the "onednn" namespace. On the other hand, quantization support with the "onednn" quant backend can leverage ops implemented with either the oneDNN library or native ATen. What do you think?

Xia-Weiwen · May 9, 2025

Thanks for the suggestion. It makes sense to me. How would you suggest we name the namespace? x86, or still quantized? cc @leslie-fang-intel @jerryzh168

Xia-Weiwen · May 12, 2025

Hi @jerryzh168 @leslie-fang-intel, do you think it's OK to use the onednn namespace even if we don't use the oneDNN library for the implementation? If not, do you have any suggestions for the name of the namespace? Thanks.

jgong5 · May 13, 2025

By the way, it could be long-term refactoring work and does not necessarily block this PR from landing.

jerryzh168 · May 13, 2025

I think it can be x86, since it's used in the x86 inductor backend. We are deprecating the quantized namespace as we deprecate eager and FX graph mode quantization in PyTorch.

Xia-Weiwen · May 13, 2025

@jgong5 @jerryzh168 Thanks for your suggestions. I will keep this PR as is and refactor this part in a future PR.
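
The point of the thread above, that an op's namespace is a naming choice independent of which library implements the kernel, can be illustrated with a toy lookup table. This is a deliberately simplified model, not PyTorch's dispatcher: `ToyRegistry`, `make_example_registry`, and the string-valued kernels are hypothetical names invented for this sketch. Only the op names and dispatch keys come from the PR.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Toy stand-in for a dispatcher: a kernel is found by
// (qualified op name, dispatch key). The kernel here just reports
// what it is built on, so the lookup result is easy to inspect.
using Kernel = std::function<std::string()>;

class ToyRegistry {
 public:
  void impl(const std::string& qualified_name, const std::string& dispatch_key,
            Kernel kernel) {
    table_[std::make_pair(qualified_name, dispatch_key)] = std::move(kernel);
  }

  bool has(const std::string& qualified_name,
           const std::string& dispatch_key) const {
    return table_.count(std::make_pair(qualified_name, dispatch_key)) > 0;
  }

  std::string call(const std::string& qualified_name,
                   const std::string& dispatch_key) const {
    return table_.at(std::make_pair(qualified_name, dispatch_key))();
  }

 private:
  std::map<std::pair<std::string, std::string>, Kernel> table_;
};

// Mirrors the two registrations discussed in the PR: the existing op on
// QuantizedCPU and the new op on CPU. Neither kernel depends on oneDNN,
// even though the new op's name carries the "onednn" prefix.
inline ToyRegistry make_example_registry() {
  ToyRegistry registry;
  registry.impl("quantized::batch_norm2d", "QuantizedCPU",
                [] { return std::string("native ATen kernel"); });
  registry.impl("onednn::qbatch_norm2d", "CPU",
                [] { return std::string("native ATen kernel"); });
  return registry;
}
```

In this model, renaming the prefix (for example to the "x86" namespace suggested above) would change only the lookup key, not the kernel behind it, which is why the refactoring can be deferred to a later PR.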

Xia-Weiwen · 2025-05-16T03:38:27Z

@pytorchbot merge

pytorchmergebot · 2025-05-16T03:40:40Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Update

6b529a9

Xia-Weiwen requested review from jerryzh168, salilsdesai, kimishpatel, digantdesai and jianyuh as code owners May 5, 2025 08:12

pytorch-bot bot added the "module: cpu" and "release notes: quantization" labels May 5, 2025

Xia-Weiwen mentioned this pull request Apr 29, 2025

[Quant][X86] add ops to compute uint8 pointwise add/add_relu #152411

Closed

Xia-Weiwen added a commit that referenced this pull request May 5, 2025

[Quant][X86] add an op to compute uint8 batch norm 2d

3fdf4fb

ghstack-source-id: 949cb72 Pull Request resolved: #152811

Xia-Weiwen marked this pull request as draft May 5, 2025 08:13

Xia-Weiwen removed request for digantdesai, jianyuh, jerryzh168, kimishpatel and salilsdesai May 5, 2025 08:13

pytorchbot added the open source label May 5, 2025

Update

640a265

Xia-Weiwen added a commit that referenced this pull request May 5, 2025

[Quant][X86] add an op to compute uint8 batch norm 2d

df242dd

ghstack-source-id: dfcf0db Pull Request resolved: #152811

Update

1fc633c

Xia-Weiwen added a commit that referenced this pull request May 6, 2025

[Quant][X86] add an op to compute uint8 batch norm 2d

a1aae5d

ghstack-source-id: 3b393ae Pull Request resolved: #152811

Update

42735f1

Xia-Weiwen added a commit that referenced this pull request May 6, 2025

[Quant][X86] add an op to compute uint8 batch norm 2d

d75206d

ghstack-source-id: 0c2d840 Pull Request resolved: #152811

Xia-Weiwen added a commit that referenced this pull request May 6, 2025

[Quant][X86] add an op to compute uint8 batch norm 2d

35e726c

ghstack-source-id: 0c2d840 Pull Request resolved: #152811

Xia-Weiwen added the intel label May 6, 2025

Update

e98c4ed

Xia-Weiwen added a commit that referenced this pull request May 6, 2025

[Quant][X86] add an op to compute uint8 batch norm 2d

7fa64b7

ghstack-source-id: d3e06ef Pull Request resolved: #152811

Xia-Weiwen requested a review from leslie-fang-intel May 7, 2025 01:17

leslie-fang-intel approved these changes May 7, 2025

Resolved (outdated) review thread on aten/src/ATen/native/quantized/cpu/kernels/QuantizedOpKernels.cpp

Update

7298531

Xia-Weiwen added a commit that referenced this pull request May 7, 2025

[Quant][X86] add an op to compute uint8 batch norm 2d

aa68fdd

ghstack-source-id: f1f01f2 Pull Request resolved: #152811

Update

6770c6e

Xia-Weiwen added a commit that referenced this pull request May 7, 2025

[Quant][X86] add an op to compute uint8 batch norm 2d

eddb3f6

ghstack-source-id: 1d1f4ec Pull Request resolved: #152811

jgong5 reviewed May 8, 2025

Xia-Weiwen marked this pull request as ready for review May 9, 2025 01:14

Xia-Weiwen requested review from jgong5 and jerryzh168 May 9, 2025 01:14

jerryzh168 approved these changes May 13, 2025

jgong5 approved these changes May 14, 2025

jerryzh168 approved these changes May 15, 2025

Update

ab2a1c8

Xia-Weiwen added a commit that referenced this pull request May 15, 2025

[Quant][X86] add an op to compute uint8 batch norm 2d

5d2497d

ghstack-source-id: d1bd152 Pull Request resolved: #152811

pytorch-bot bot added the ciflow/trunk label May 16, 2025

pytorchmergebot added the merging label May 16, 2025

pytorchmergebot added the Merged label May 16, 2025

pytorchmergebot closed this in 1a722f6 May 16, 2025

pytorchmergebot removed the merging label May 16, 2025

github-actions bot deleted the gh/Xia-Weiwen/39/head branch June 19, 2025 02:19
