
Conversation

@liqunfu
Collaborator

@liqunfu liqunfu commented Nov 8, 2023

Description

Fix #5711: in ceil mode, the output size must not grow to include a sliding window that starts entirely within the padded part of the input region.
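
For reference, a minimal per-axis sketch of the corrected output-size rule (the names, signature, and omission of dilation are illustrative assumptions, not the actual shape-inference code):

```python
import math

def pool_output_size(input_size: int, kernel: int, stride: int,
                     pad_begin: int, pad_end: int, ceil_mode: bool) -> int:
    # Per-axis MaxPool output size; dilation omitted for brevity.
    if ceil_mode:
        out = math.ceil((input_size + pad_begin + pad_end - kernel) / stride) + 1
        # The fixed corner case: the last window must start within the input
        # or the left padding, never entirely within the right padding.
        if (out - 1) * stride >= input_size + pad_begin:
            out -= 1
    else:
        out = (input_size + pad_begin + pad_end - kernel) // stride + 1
    return out
```

For example, a 2x2 input with a 1x1 kernel, stride 2, no padding, and `ceil_mode=1` should yield a 1x1 output (a window at index 2 would start past the input), whereas the uncorrected formula gives 2x2.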

Motivation and Context

bug fix

@liqunfu liqunfu requested a review from a team as a code owner November 8, 2023 05:38
@justinchuby justinchuby changed the title from "fix corner case where output size need to reduce by one" to "Fix corner case where output size need to reduce by one" Nov 8, 2023
@justinchuby justinchuby added the module: shape inference Issues related to shape inference label Nov 8, 2023
titaiwangms added a commit to pytorch/pytorch that referenced this pull request Nov 9, 2023
In #106270, the solution managed to solve the [`ceil_mode` corner issue](onnx/onnx#5711) with `get_pool_ceil_padding`. However, padding for ceil mode on the converter side only works when the input shapes are already known, so a regression appeared for users with dynamic inputs.

This PR (1) refactors the code with the torchlib implementation, (2) adds a dynamic-shapes test, and (3) disables the corner-case tests with comments to re-enable them once the [real fix from ONNX](onnx/onnx#5741) is merged.

[ghstack-poisoned]
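
For context, a sketch of why converter-side ceil padding (the `get_pool_ceil_padding` approach mentioned above) needs static shapes: the extra right padding that emulates `ceil_mode` is itself a function of the input size, so it cannot be baked into an exported graph whose input size is dynamic. The helper below is illustrative, not the converter's actual code:

```python
import math

def extra_ceil_pad(input_size: int, kernel: int, stride: int, pad_total: int) -> int:
    # Extra right padding needed so a floor-mode pool matches ceil_mode.
    out = math.ceil((input_size + pad_total - kernel) / stride) + 1
    return (out - 1) * stride + kernel - (input_size + pad_total)
```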
@codecov

codecov bot commented Nov 9, 2023

Codecov Report

Attention: 6 lines in your changes are missing coverage. Please review.

Comparison is base (8a6b70f) 56.45% compared to head (d5ee359) 56.44%.

Files                                     Patch %   Lines
onnx/backend/test/case/node/maxpool.py    0.00%     6 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5741      +/-   ##
==========================================
- Coverage   56.45%   56.44%   -0.01%     
==========================================
  Files         504      504              
  Lines       29872    29881       +9     
  Branches     4489     4491       +2     
==========================================
+ Hits        16863    16866       +3     
- Misses      12191    12197       +6     
  Partials      818      818              


pytorchmergebot pushed a commit to pytorch/pytorch that referenced this pull request Nov 10, 2023
Pull Request resolved: #113318
Approved by: https://github.com/thiagocrepaldi
Skylion007 pushed a commit to Skylion007/pytorch that referenced this pull request Nov 14, 2023
Pull Request resolved: pytorch#113318
Approved by: https://github.com/thiagocrepaldi
@justinchuby
Member

@fatcat-z do you know if this is the expected behavior for tf2onnx?

@justinchuby justinchuby added this to the 1.16 milestone Jan 4, 2024
@justinchuby justinchuby added the review needed: operators approvers Require reviews from members of operators-approvers label Jan 4, 2024
@fatcat-z

fatcat-z commented Jan 5, 2024

@fatcat-z do you know if this is the expected behavior for tf2onnx?

In the TF MaxPool op, there is no parameter corresponding to ceil_mode. At this moment, I don't see any impact on tf2onnx.

@justinchuby
Member

Probably need to skip the test_maxpool_2d_ceil_output_size_reduce_by_one test

@justinchuby justinchuby changed the title from "Fix corner case where output size need to reduce by one" to "Fix corner case where output size need to reduce by one in MaxPool" Jan 17, 2024
@justinchuby justinchuby requested a review from xadupre January 17, 2024 04:18
@liqunfu liqunfu requested a review from a team as a code owner January 18, 2024 19:54
@justinchuby justinchuby enabled auto-merge January 19, 2024 00:25
@justinchuby justinchuby added this pull request to the merge queue Jan 29, 2024
Merged via the queue into main with commit c5d91f6 Jan 29, 2024
@justinchuby justinchuby deleted the liqun/max_pool_outsize branch January 29, 2024 23:30
adrianlizarraga added a commit to microsoft/onnxruntime that referenced this pull request Apr 12, 2024
### Description
update with ONNX 1.16.0 branch according to
https://github.com/microsoft/onnxruntime/blob/main/docs/How_To_Update_ONNX_Dev_Notes.md

ONNX 1.16.0 release notes:
https://github.com/onnx/onnx/releases/tag/v1.16.0

#### Updated ops for CPU EP:
- DequantizeLinear(21)
  - Added int16 and uint16 support + various optimizer tests
  - Missing int4 and uint4 support
  - Missing block dequantization support
- QuantizeLinear(21)
  - Added int16 and uint16 support + various optimizer tests
  - Missing int4 and uint4 support
  - Missing block quantization support
- Cast(21)
  - Missing int4 and uint4 support
- CastLike(21)
  - Missing int4 and uint4 support
- ConstantOfShape(21)
  - Missing int4 and uint4 support
- Identity(21)
  - Missing int4 and uint4 support
- If(21)
  - Missing int4 and uint4 support
- Loop(21)
  - Missing int4 and uint4 support
- Reshape(21)
  - Missing int4 and uint4 support
- Scan(21)
  - Missing int4 and uint4 support
- Shape(21)
  - Missing int4 and uint4 support
- Size(21)
  - Missing int4 and uint4 support
- Flatten(21)
  - Missing float8e4m3fnuz, float8e5m2, float8e5m2fnuz, int4, and uint4 support
- Pad(21)
  - Missing float8e4m3fnuz, float8e5m2, float8e5m2fnuz, int4, and uint4 support
- Squeeze(21)
  - Missing float8e4m3fnuz, float8e5m2, float8e5m2fnuz, int4, and uint4 support
- Transpose(21)
  - Missing float8e4m3fnuz, float8e5m2, float8e5m2fnuz, int4, and uint4 support
- Unsqueeze(21)
  - Missing float8e4m3fnuz, float8e5m2, float8e5m2fnuz, int4, and uint4 support

#### Unimplemented opset 21 features/ops
- int4 and uint4 data type
- QLinearMatMul(21)
- GroupNormalization(21)
- ai.onnx.ml.TreeEnsemble(5)


### Disabled tests
#### ORT Training

orttraining/orttraining/test/python/orttraining_test_ort_apis_py_bindings.py
- test_ort_custom_ops: Potential shape inference bug for custom ops

#### Python quantization unit tests
test/onnx/python/quantization (shape inference bug)
- test_op_conv_transpose.py: test_quantize_conv_transpose_u8u8_fp16
- test_op_conv_transpose.py: test_quantize_conv_transpose_s8s8_fp16
- test_op_gemm.py: test_quantize_qop_gemm_s8s8
- test_op_gemm.py: test_quantize_qop_gemm_e4m3fn_same
- test_op_gemm.py: test_quantize_qop_gemm_e4m3fn_p3
- test_op_matmul.py: test_quantize_matmul_u8u8_f16
- test_op_matmul.py: test_quantize_matmul_s8s8_f16
- test_op_matmul.py: test_quantize_matmul_s8s8_f16_entropy
- test_op_matmul.py: test_quantize_matmul_s8s8_f16_percentile
- test_op_matmul.py: test_quantize_matmul_s8s8_f16_distribution
- test_op_relu.py: test_quantize_qop_relu_s8s8

#### ONNX tests
- test_maxpool_2d_ceil_output_size_reduce_by_one: ONNX 1.16.0 fixed a
maxpool output size bug and added this test. Enable this test when [ORT
PR](#18377) is merged.
Refer to original [ONNX PR](onnx/onnx#5741).
- test_ai_onnx_ml_tree_ensemble_set_membership_cpu: new unimplemented op
ai.onnx.ml.TreeEnsemble
- test_ai_onnx_ml_tree_ensemble_single_tree_cpu: same
- test_ai_onnx_ml_tree_ensemble_set_membership_cuda: same
- test_ai_onnx_ml_tree_ensemble_single_tree_cuda: same
- test_cast_INT4_to_FLOAT_cpu: ORT Cast(21) impl doesn't support int4
yet
- test_cast_INT4_to_INT8_cpu: same
- test_cast_UINT4_to_FLOAT_cpu: same
- test_cast_UINT4_to_UINT8_cpu: same
- test_cast_INT4_to_FLOAT_cuda
- test_cast_INT4_to_INT8_cuda
- test_cast_UINT4_to_FLOAT_cuda
- test_cast_UINT4_to_UINT8_cuda
- test_constantofshape_float_ones_cuda: ConstantOfShape(21) not
implemented for cuda
- test_constantofshape_int_shape_zero_cuda: same
- test_constantofshape_int_zeros_cuda: same
- test_flatten_axis0_cuda: Flatten(21) not implemented for cuda
- test_flatten_axis1_cuda: same
- test_flatten_axis2_cuda: same
- test_flatten_axis3_cuda: same
- test_flatten_default_axis_cuda: same
- test_flatten_negative_axis1_cuda: same
- test_flatten_negative_axis2_cuda: same
- test_flatten_negative_axis3_cuda: same
- test_flatten_negative_axis4_cuda: same
- test_qlinearmatmul_2D_int8_float16_cpu: QLinearMatMul(21) for onnx not
implemented in ORT yet
- test_qlinearmatmul_2D_int8_float32_cpu: same
- test_qlinearmatmul_2D_uint8_float16_cpu: same
- test_qlinearmatmul_2D_uint8_float32_cpu: same
- test_qlinearmatmul_3D_int8_float16_cpu: same
- test_qlinearmatmul_3D_int8_float32_cpu: same
- test_qlinearmatmul_3D_uint8_float16_cpu: same
- test_qlinearmatmul_3D_uint8_float32_cpu: same
- test_qlinearmatmul_2D_int8_float16_cuda: same
- test_qlinearmatmul_2D_int8_float32_cuda: same
- test_qlinearmatmul_2D_uint8_float16_cuda: same
- test_qlinearmatmul_2D_uint8_float32_cuda: same
- test_qlinearmatmul_3D_int8_float16_cuda: same
- test_qlinearmatmul_3D_int8_float32_cuda: same
- test_qlinearmatmul_3D_uint8_float16_cuda: same
- test_qlinearmatmul_3D_uint8_float32_cuda: same
- test_size_cuda: Size(21) not implemented for cuda
- test_size_example_cuda: same
- test_dequantizelinear_blocked: Missing implementation for block
dequant for DequantizeLinear(21)
- test_quantizelinear_blocked_asymmetric: Missing implementation for
block quant for QuantizeLinear(21)
- test_quantizelinear_blocked_symmetric: Missing implementation for
block quant for QuantizeLinear(21)

---------

Signed-off-by: liqunfu <[email protected]>
Signed-off-by: Ganesan Ramalingam <[email protected]>
Co-authored-by: Ganesan Ramalingam <[email protected]>
Co-authored-by: George Wu <[email protected]>
Co-authored-by: adrianlizarraga <[email protected]>
TedThemistokleous pushed a commit to TedThemistokleous/onnxruntime that referenced this pull request May 7, 2024
linshokaku pushed a commit to linshokaku/onnx that referenced this pull request Oct 2, 2024
…nnx#5741)

### Description
Fix onnx#5711: in ceil mode, the output size must not grow to include a sliding window that starts entirely within the padded part of the input region.

### Motivation and Context
bug fix

---------

Signed-off-by: Liqun Fu <[email protected]>
Co-authored-by: Justin Chu <[email protected]>
Co-authored-by: G. Ramalingam <[email protected]>
Signed-off-by: Linsho Kaku <[email protected]>
github-merge-queue bot pushed a commit that referenced this pull request Jan 24, 2025
### Description
Pooling in reference currently has two bugs: 

(1) it has ["pads
required"](https://github.com/onnx/onnx/blob/e292b4ae6d016c3231a801bfeb26f802ba95d82a/onnx/reference/ops/op_pool_common.py#L53)
to make sure sliding window does not go out-of-bound, but it does not
exclude pads required from pooling caculations. This causes
microsoft/onnxruntime#16203

(2) In `ceil_mode`, the reference pooling pads the input image but does
not check whether the sliding window starts in the padding. Once the end
of the image is reached, the window should stop. This causes
pytorch/pytorch#131272 (not directly; see the
related MaxPool fix: #5741).
Detail:
pytorch/pytorch#116420 (comment)
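
To make bug (2) concrete, here is an illustrative sketch (not the reference-implementation code) of which sliding windows are valid in ceil mode, in input coordinates where the input occupies [0, input_size):

```python
import math

def valid_window_starts(input_size: int, kernel: int, stride: int,
                        pad_begin: int, pad_end: int) -> list:
    # Ceil-mode window count from the size formula, before correction.
    out = math.ceil((input_size + pad_begin + pad_end - kernel) / stride) + 1
    starts = [-pad_begin + i * stride for i in range(out)]
    # A window that starts at or past the input's right edge lies entirely
    # in the right padding and must be ignored by the pooling computation.
    return [s for s in starts if s < input_size]
```

With a length-2 input, kernel 1, stride 2, and no padding, this keeps only the window at index 0, matching the reduced-by-one output size.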

### Motivation and Context
This PR fixes the two bugs and updates their tests accordingly. I also
drafted a PR to onnxruntime to pick up this update in the CPU provider:
microsoft/onnxruntime#16752

---------

Signed-off-by: titaiwangms <[email protected]>
Signed-off-by: Justin Chu <[email protected]>
Signed-off-by: Andreas Fehlner <[email protected]>
Co-authored-by: Justin Chu <[email protected]>
Co-authored-by: Andreas Fehlner <[email protected]>
andife added a commit that referenced this pull request Jan 25, 2025
seungwoo-ji-03 pushed a commit to seungwoo-ji-03/onnx that referenced this pull request Feb 17, 2025
rohan11235813 pushed a commit to quadric-io/onnxruntime that referenced this pull request Aug 19, 2025
rohan11235813 pushed a commit to quadric-io/onnxruntime that referenced this pull request Sep 15, 2025
pytorchmergebot pushed a commit to pytorch/pytorch that referenced this pull request Sep 30, 2025
…ode=True (#163186)

# Summary

- Add a note to each `nn.LPPool*d` docstring explaining how `ceil_mode=True` interacts with right padding.
- Mirror the same clarification in the `torch.nn.functional.lp_pool*` docstrings so the rendered functional docs stay in sync.

# Motivation

The current PyTorch spec for **LPPool** does not fully match runtime behavior, which has led to downstream confusion in other specs (e.g., ONNX) and runtimes (e.g., [onnxruntime issue #25848](microsoft/onnxruntime#25848)). A corresponding clarification was also made in the ONNX spec: [onnx/onnx#5741](onnx/onnx#5741).

PyTorch’s **LPPool** implementation calls into **AvgPool**, which enforces the rule that windows starting entirely in the right padded region are ignored when `ceil_mode=True`. As a result, **LPPool** inherits the same behavior.

This is an edge case where the output size formula shown in the LPPool docs/spec is not sufficient on its own. Without the added caveat, the documentation is technically incorrect. This PR brings the LPPool docs in line with actual behavior.

Note that this is a trivial fix to the spec as all major implementers of the spec adhere to this caveat.

For comparison, both **MaxPool** and **AvgPool** already include this clarification in their spec. Their docstrings explicitly state:

> *When `ceil_mode=True`, sliding windows are allowed to go off-bounds if they start within the left padding or the input. Sliding windows that would start in the right padded region are ignored.*

Adding the same note to LPPool ensures consistency across all pooling operators.
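
A quick demonstration of the quoted rule, showing the shape the documented caveat implies (a sketch, assuming default padding):

```python
import torch
import torch.nn.functional as F

x = torch.tensor([[[1.0, 2.0]]])  # length-2 input
# Naive ceil-mode size: ceil((2 - 1) / 2) + 1 = 2, but the second window
# would start at index 2, entirely past the input, so it is ignored.
y = F.max_pool1d(x, kernel_size=1, stride=2, ceil_mode=True)
print(y.shape)  # torch.Size([1, 1, 1]), not [1, 1, 2]
```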

Pull Request resolved: #163186
Approved by: https://github.com/mikaylagawarecki

Labels

- module: shape inference (Issues related to shape inference)
- review needed: operators approvers (Require reviews from members of operators-approvers)
- run release CIs (Use this label to trigger release tests in CI)

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

MaxPool should be corrected and documented properly

4 participants