device-libs: Guard trig reduction quadrant index against NaN UB #2752

Open
adelejjeh wants to merge 1 commit into amd-staging from amd/dev/aejjeh/device-libs/fix-trig-nan-ub

adelejjeh commented Jun 1, 2026

Guard the (int)fn & 0x3 quadrant index computation in trig reduction
functions against NaN input. fptosi NaN is UB in C and produces
poison in LLVM IR, which the compiler exploits during constant-folding
to return garbage from cos(inf), sin(inf), etc.

Fix: ret.i = BUILTIN_ISNAN(fn) ? 0 : ((int)fn & 0x3);

Applied to 7 locations across 6 files (trigredsmall F/D, trigred H,
trigpired F/D/H). Upstream PR llvm#201435 pattern-matches the isnan
guard and replaces it with the saturating intrinsic, which removes the UB.
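
For illustration only, a minimal standalone sketch of the guard in plain C. The function name `quadrant_index` is hypothetical, and the standard `isnan` macro stands in for `BUILTIN_ISNAN`. The real reduction code stores the result in `ret.i`.

```c
#include <math.h>

/* Hypothetical helper: derive the quadrant index from `fn`, the
   float-valued multiple of pi/2 computed during argument reduction.
   Converting NaN to int is undefined behavior in C, so NaN is tested
   first and mapped to quadrant 0. */
int quadrant_index(float fn)
{
    return isnan(fn) ? 0 : ((int)fn & 0x3);
}
```

With this shape, `quadrant_index(NAN)` is well defined and yields 0, while finite in-range values behave exactly like the unguarded `(int)fn & 0x3`.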

Verified: identical instruction count, all reproducer variants pass.

Fixes: LCOMPILER-2150

Guard the `(int)fn & 0x3` quadrant index computation in trig reduction
functions against NaN input. `fptosi NaN` is UB in C and produces
`poison` in LLVM IR, which the compiler exploits during constant-folding
to return garbage from `cos(inf)`, `sin(inf)`, etc.

Fix by adding an isnan check: `isnan(fn) ? 0 : ((int)fn & 0x3)`. The
AMDGPU backend folds away the guard at codegen since v_cvt_i32_f32
already returns 0 for NaN (see llvm#200960).

Fixes: LCOMPILER-2150

Co-Authored-By: Claude Opus 4.6 <[email protected]>
adelejjeh force-pushed the amd/dev/aejjeh/device-libs/fix-trig-nan-ub branch from 69edd27 to cf5e0cb on Jun 1, 2026 22:12
adelejjeh changed the title from "device-libs: Use saturating float-to-int casts to avoid NaN UB" to "device-libs: Guard trig reduction quadrant index against NaN UB" on Jun 1, 2026
adelejjeh marked this pull request as ready for review on Jun 1, 2026 22:22
adelejjeh requested review from b-sumner and lamb-j as code owners on Jun 1, 2026 22:22
arsenm added the device-libs (Related to Device Libraries) label on Jun 2, 2026

      struct redret ret;
      ret.hi = MATH_MAD(t, -0.5, x);
    - ret.i = (int)t & 0x3;
    + ret.i = BUILTIN_ISNAN_F64(t) ? 0 : ((int)t & 0x3);

Can you rewrite this as is-inf-or-nan(x)? It's harder to prove that t isn't a NaN based on the input, but only inf or NaN inputs should produce NaN results.

Collaborator

I don't want this statement to result in any instructions besides the cvt_i32_f64 and similarly for the other types.

Author

adelejjeh commented Jun 2, 2026

@b-sumner The upstream PR pattern-matches the generated LLVM IR and replaces it with a single llvm.fptosi.sat.

Author

@arsenm If we check x instead of t, it becomes harder to pattern-match the guard and replace it with the saturating intrinsic.

Author

Ultimately, the pattern matching in InstCombine will fold the resulting checks and we end up with the same code, just with an fptosi.sat instead of an fptosi.
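
For readers unfamiliar with the saturating conversion discussed in this thread: `llvm.fptosi.sat` returns 0 for NaN and clamps out-of-range values instead of producing poison. A rough plain-C emulation for a float to 32-bit int conversion might look like the sketch below. The name `fptosi_sat_i32` is hypothetical, and the code assumes `int` is 32 bits wide. It only illustrates the semantics and is not what the compiler emits.

```c
#include <limits.h>
#include <math.h>

/* Hypothetical emulation of a saturating float -> i32 conversion,
   assuming a 32-bit int. NaN maps to 0 and out-of-range inputs clamp
   to the nearest representable integer, so every input is defined. */
int fptosi_sat_i32(float x)
{
    if (isnan(x))
        return 0;
    if (x >= 2147483648.0f)   /* 2^31, exactly representable as float */
        return INT_MAX;
    if (x <= -2147483648.0f)  /* -2^31 and below, including -inf */
        return INT_MIN;
    return (int)x;            /* in range: ordinary truncation */
}
```

Under these semantics the quadrant index for a NaN input is `fptosi_sat_i32(NAN) & 0x3`, which is 0, matching the explicit isnan guard.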

3 participants