
Conversation

@MengAiDev

Description

When converting to float8 types, NaN values were not handled and remained NaN in the quantized tensors, causing model validation to fail. The fix explicitly replaces NaN values with 0 before clipping in the saturate_cast function for float8 types.
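For reference, the change described above can be sketched as follows. This is a hypothetical illustration, not the actual onnx code: the function name `saturate_cast_float8` and the use of the float8e4m3fn representable range [-448, 448] are assumptions for the sketch.

```python
import numpy as np

def saturate_cast_float8(x, lo=-448.0, hi=448.0):
    """Hypothetical sketch of the proposed fix: replace NaN with 0
    before clipping to the float8e4m3fn range. Not the onnx source."""
    x = np.asarray(x, dtype=np.float64)
    x = np.where(np.isnan(x), 0.0, x)  # the proposed change: NaN -> 0
    return np.clip(x, lo, hi)

print(saturate_cast_float8([np.nan, 1e9, -1.5]))
```

With the NaN-to-0 replacement, the output contains no NaN and out-of-range values saturate to the clip bounds.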

Motivation and Context

#7222

@MengAiDev MengAiDev requested a review from a team as a code owner August 12, 2025 08:00
@github-project-automation github-project-automation bot moved this to In progress in PR Tracker Aug 12, 2025
@justinchuby
Member

Thanks - the current behavior seems intended: https://onnx.ai/onnx/technical/float8.html#cast

Is there an example from a different framework that will turn NaNs into 0s for reference?

@justinchuby
Member

Why would the original value be NaN in the first place? Can that be fixed?

Member

@justinchuby justinchuby left a comment


Blocking for now

@github-project-automation github-project-automation bot moved this from In progress to Review in progress in PR Tracker Aug 12, 2025
@MengAiDev MengAiDev requested a review from a team as a code owner August 13, 2025 04:08
@MengAiDev
Author

I have fixed it by adding an argument.

@justinchuby
Member

justinchuby commented Aug 13, 2025

I still don’t think this is the right fix. Does this behavior (NaN to 0) exist in other frameworks? Should the quantization tool itself be fixed?

I am also still wondering where the NaN values came from.

@justinchuby
Member

justinchuby commented Aug 13, 2025

If it’s due to division by zero, then the actual computation (in the call site, not in onnx) needs to be fixed
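To illustrate the kind of call-site fix being suggested: a zero quantization scale produces NaN via 0/0, and guarding the scale computation removes the NaN at its source. This is a hypothetical example; `compute_scale`, the `qmax` value of 448, and the fallback scale of 1.0 are assumptions for the sketch, not onnx code.

```python
import numpy as np

def compute_scale(tensor, qmax=448.0):
    """Hypothetical quantization-scale helper: guard the amax == 0
    case so that dividing by the scale never yields 0/0 -> NaN."""
    amax = np.max(np.abs(tensor))
    return amax / qmax if amax > 0 else 1.0  # guard against 0/0

data = np.zeros(4, dtype=np.float32)
scale = compute_scale(data)
print(data / scale)  # no NaN: the guard fixed it at the call site
```

Without the guard, `np.max(np.abs(data)) / qmax` would be 0, and `data / 0` would produce NaN for the zero entries, which is the failure mode the cast fix was papering over.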

@MengAiDev
Author

Sorry, I don't understand what you mean. I think it's difficult to see why the NaN happens, and other packages also have a feature that turns NaN into 0.

@andife
Member

andife commented Aug 15, 2025

> Sorry, I don't understand what you mean. I think it's difficult to see why the NaN happens, and other packages also have a feature that turns NaN into 0.

Out of interest, which package does this?

@cyyever
Contributor

cyyever commented Oct 9, 2025

NaN should be kept as NaN when down-casting. NaN indicates an earlier computation error, so hiding it behind 0 makes no sense. The current behaviour matches numpy:

import numpy as np


# Casting NaN to another float type
arr_float64 = np.array([1.0, np.nan], dtype=np.float64)
arr_float32 = arr_float64.astype(np.float32)
print(f"Original float64 array with NaN: {arr_float64}")
print(f"Casting to float32: {arr_float32}")  # NaN is preserved
