HanchengWu · 2026-04-23T18:56:58Z

Background

tosa.resize in bilinear integer (quantized) mode lowers to a linalg.generic
body that, for each output pixel, computes a corresponding input coordinate and
blends the four neighboring input pixels. The mapping is:

val   = out_coord * scale_d + offset
index = val / scale_n          // integer part — which input pixel to start from
delta = val - index * scale_n  // fractional part, scaled to [0, scale_n)

delta is the interpolation weight toward the next pixel. The bilinear formula
(integer path) is:

topAcc    = pixel[y0,x0] * (scale_x - dx) + pixel[y0,x1] * dx
bottomAcc = pixel[y1,x0] * (scale_x - dx) + pixel[y1,x1] * dx
result    = topAcc * (scale_y - dy) + bottomAcc * dy

For this to be a valid convex combination (interpolation, not extrapolation),
dx and dy must be in [0, scale_n].
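
As a rough illustration of the formula above, here is a minimal C++ sketch. The names bilinearInt and isConvex are hypothetical helpers introduced for this note, not functions from the MLIR sources, and the sketch assumes plain 32-bit accumulation.

```cpp
#include <cassert>
#include <cstdint>

// Integer bilinear blend, term for term as in the formula above.
// The weights are not normalized, so the result carries a factor of
// scaleX * scaleY. Hypothetical helper, for illustration only.
int32_t bilinearInt(int32_t p00, int32_t p01, int32_t p10, int32_t p11,
                    int32_t dx, int32_t dy, int32_t scaleX, int32_t scaleY) {
  assert(scaleX > 0 && scaleY > 0);
  int32_t topAcc = p00 * (scaleX - dx) + p01 * dx;
  int32_t bottomAcc = p10 * (scaleX - dx) + p11 * dx;
  return topAcc * (scaleY - dy) + bottomAcc * dy;
}

// True when both weights lie in [0, scale], i.e. the blend is a convex
// combination (interpolation rather than extrapolation).
bool isConvex(int32_t dx, int32_t dy, int32_t scaleX, int32_t scaleY) {
  return dx >= 0 && dx <= scaleX && dy >= 0 && dy <= scaleY;
}
```

With dx = dy = 0 the call bilinearInt(100, 0, 0, 0, 0, 0, 4, 4) returns 100 * 4 * 4 = 1600, and isConvex(-1, -1, 4, 4) is false because a negative weight pushes the blend outside the convex range.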

The pixel indices y0, y1, x0, x1 are computed by getClampedIdxs:

y0 = clamp(iy,   0, H-1)
y1 = clamp(iy+1, 0, H-1)
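
The clamping step can be sketched in plain C++ as follows. clampedIdxs is a hypothetical stand-in for what getClampedIdxs computes according to the two lines above; the real helper builds arith ops rather than evaluating integers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>

// Returns (clamp(i, 0, size-1), clamp(i+1, 0, size-1)).
// Hypothetical scalar model of the clamping described above.
std::pair<int64_t, int64_t> clampedIdxs(int64_t i, int64_t size) {
  assert(size > 0);
  auto clampToImage = [size](int64_t v) {
    return std::max<int64_t>(0, std::min<int64_t>(v, size - 1));
  };
  return {clampToImage(i), clampToImage(i + 1)};
}
```

For H = 2, an index of -1 yields (0, 0), an index of 0 yields (0, 1), and an index of 1 yields (1, 1), so positions outside the image collapse onto a single boundary row.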

The Bug

The integer path uses DivSIOp (truncation toward zero):

// getIndexAndDeltaInt — mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
index = arith::DivSIOp::create(b, val, scaleN);   // truncates toward zero
delta = arith::MulIOp::create(b, index, scaleN);
delta = arith::SubIOp::create(b, val, delta);      // = val - (val/scaleN)*scaleN

When val < 0 (which happens at boundary output pixels when offset is
negative), DivSIOp truncates toward zero instead of toward -∞. This produces
a negative remainder, i.e. a negative delta, which causes extrapolation.

Note: the code comment on line 2058 already says // ix = floor(x / scale_n),
but the code uses truncation — this mismatch is the root cause of the bug.
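
The difference between the two roundings can be reproduced outside MLIR. In the sketch below, C++ integer division stands in for arith.divsi (both round toward zero), and a small correction models arith.floordivsi. The struct and function names are hypothetical, and scaleN is assumed positive.

```cpp
#include <cassert>
#include <cstdint>

struct IndexDelta {
  int32_t index;
  int32_t delta;
};

// Models the divsi-based computation: '/' truncates toward zero in C++.
IndexDelta indexAndDeltaTrunc(int32_t val, int32_t scaleN) {
  assert(scaleN > 0);
  int32_t index = val / scaleN;
  return {index, val - index * scaleN};
}

// Models the floordivsi-based computation: rounds toward negative infinity.
IndexDelta indexAndDeltaFloor(int32_t val, int32_t scaleN) {
  assert(scaleN > 0);
  int32_t index = val / scaleN;
  // Truncation rounded up for a negative, inexact quotient; step back down.
  if (val < 0 && val % scaleN != 0)
    --index;
  return {index, val - index * scaleN};
}
```

For val = -1 and scaleN = 4, the truncating version gives index 0 and delta -1, while the flooring version gives index -1 and delta 3. For non-negative val the two agree.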

The float path (getIndexAndDeltaFp) uses FloorDivSIOp and is unaffected:
with floor division, r = val - floor(val/scaleN)*scaleN is always in
[0, scaleN-1].
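
The bound follows directly from the definition of floor. A short derivation, writing v for val and n for scaleN and assuming n > 0:

```latex
q = \left\lfloor \frac{v}{n} \right\rfloor
\;\Longrightarrow\;
q \le \frac{v}{n} < q + 1
\;\Longrightarrow\;
0 \le v - q\,n < n
```

Since v, q, and n are integers, r = v - q n takes one of the values 0, 1, ..., n - 1.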

Concrete Example

Setup: 2×2 input upsampled to 4×4, scale=[4,2,4,2], offset=[-1,-1]

scale_y_n=4, scale_y_d=2, scale_x_n=4, scale_x_d=2
offset_y=-1, offset_x=-1
Input: tensor<1x2x2x1xi8> with input[0,0,0,0]=100, all others 0

At output pixel (0,0):

val  = 0 * 2 + (-1) = -1

Without any fix (DivSIOp, buggy)

iy  = DivSIOp(-1, 4) = 0     // truncates -0.25 toward zero
dy  = -1 - 0*4 = -1          // OUT OF RANGE: should be in [0, 4]

y0 = clamp(0,   0, 1) = 0
y1 = clamp(0+1, 0, 1) = 1    // different rows

ix  = DivSIOp(-1, 4) = 0
dx  = -1                      // same issue

x0 = clamp(0,   0, 1) = 0
x1 = clamp(0+1, 0, 1) = 1

// pixels:
y0x0 = input[0,0,0,0] = 100
y0x1 = input[0,0,1,0] = 0
y1x0 = input[0,1,0,0] = 0
y1x1 = input[0,1,1,0] = 0

topAcc    = 100*(4-(-1)) + 0*(-1) = 500   // EXTRAPOLATION
bottomAcc = 0*(4-(-1))   + 0*(-1) = 0
result    = 500*(4-(-1)) + 0*(-1) = 2500  // WRONG

Fix w/ FloorDivSIOp

iy  = FloorDivSIOp(-1, 4) = -1   // floors -0.25 toward -∞
dy  = -1 - (-1)*4 = 3            // naturally in [0, scale_n-1]

y0 = clamp(-1,  0, 1) = 0
y1 = clamp(-1+1, 0, 1) = 0      // SAME row — both snap to boundary

dx  = 3 (same by symmetry)

// all four neighbors collapse to the same boundary pixel:
y0x0 = y0x1 = y1x0 = y1x1 = input[0,0,0,0] = 100

topAcc    = 100*(4-3) + 100*3 = 400
bottomAcc = 100*(4-3) + 100*3 = 400
result    = 400*(4-3) + 400*3 = 1600  // correct

y0=y1=0 means boundary replication is enforced by getClampedIdxs, making
dy irrelevant.
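
The whole example can be replayed with a small self-contained C++ model. This is only a sketch under the setup above (scale_n = 4, scale_d = 2, offset = -1, 2×2 input); the names are hypothetical and the lowering itself emits arith ops rather than running this code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Parameters taken from the example setup above.
constexpr int32_t kScaleN = 4, kScaleD = 2, kOffset = -1, kSize = 2;

// Truncating division by default; floor division when useFloor is set.
int32_t divide(int32_t val, int32_t n, bool useFloor) {
  assert(n > 0);
  int32_t q = val / n; // truncates toward zero, like arith.divsi
  if (useFloor && val < 0 && val % n != 0)
    --q; // round toward negative infinity, like arith.floordivsi
  return q;
}

int32_t clampIdx(int32_t i) {
  return std::max<int32_t>(0, std::min<int32_t>(i, kSize - 1));
}

// Computes one output pixel of the integer bilinear resize, with in[y][x]
// standing for input[0, y, x, 0].
int32_t resizePixel(const int32_t in[kSize][kSize], int32_t oy, int32_t ox,
                    bool useFloor) {
  int32_t y = oy * kScaleD + kOffset;
  int32_t x = ox * kScaleD + kOffset;
  int32_t iy = divide(y, kScaleN, useFloor);
  int32_t ix = divide(x, kScaleN, useFloor);
  int32_t dy = y - iy * kScaleN;
  int32_t dx = x - ix * kScaleN;
  int32_t y0 = clampIdx(iy), y1 = clampIdx(iy + 1);
  int32_t x0 = clampIdx(ix), x1 = clampIdx(ix + 1);
  int32_t topAcc = in[y0][x0] * (kScaleN - dx) + in[y0][x1] * dx;
  int32_t bottomAcc = in[y1][x0] * (kScaleN - dx) + in[y1][x1] * dx;
  return topAcc * (kScaleN - dy) + bottomAcc * dy;
}
```

With the input {{100, 0}, {0, 0}}, resizePixel at output (0,0) returns 2500 with truncation and 1600 with floor division. At output (1,1), where val = 1 is non-negative, both modes return 900.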

Semantic Analysis of the fix

iy=-1 correctly signals "this position is before the first input pixel."
getClampedIdxs does its intended job: both y0 and y1 snap to the
boundary pixel, enforcing replication explicitly.

dy=3 appears valid (it is in [0, scale_n-1]) but is semantically
meaningless: the true position is -0.25, which is outside the image, so there
is no "3/4 toward the next pixel" to interpolate toward. It is harmless only
because y0=y1.

The same analysis applies to dx=3 by symmetry.

The change fixes the root cause (the wrong division op), matching the
existing code comment and mirroring the float path, but delta still carries
a misleading value at out-of-bounds positions.

…or index computation

The integer path of tosa.resize bilinear lowering used divsi (truncation toward zero) to compute the input pixel index, but the code comment says "ix = floor(x / scale_n)" and the floating-point path already uses floordivsi. With divsi, negative dividends (caused by negative offsets) produce negative remainders, making the interpolation weights fall outside [0, scale] and causing extrapolation instead of boundary replication.

Fix by replacing DivSIOp with FloorDivSIOp in getIndexAndDeltaInt. With floor division, the remainder is always in [0, scaleN), so interpolation weights are naturally non-negative and no post-hoc clamping is needed.

llvmbot · 2026-04-23T18:57:36Z

@llvm/pr-subscribers-mlir
@llvm/pr-subscribers-mlir-tosa

@llvm/pr-subscribers-mlir-linalg

Author: Henry Wu (HanchengWu)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/193821.diff

2 Files Affected:

(modified) mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp (+1-1)
(modified) mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-resize.mlir (+5-4)

diff --git a/mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp b/mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
index 11b3aabcbfeb4..e9c9e17fe6274 100644
--- a/mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
+++ b/mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
@@ -2059,7 +2059,7 @@ class GenericResizeConverter : public OpRewritePattern<tosa::ResizeOp> {
         //  dx = x - ix * scale_n;
         Value val = arith::MulIOp::create(b, in, scaleD);
         val = arith::AddIOp::create(b, val, offset);
-        index = arith::DivSIOp::create(b, val, scaleN);
+        index = arith::FloorDivSIOp::create(b, val, scaleN);
         delta = arith::MulIOp::create(b, index, scaleN);
         delta = arith::SubIOp::create(b, val, delta);
       };
diff --git a/mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-resize.mlir b/mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-resize.mlir
index 4900476b25dc5..2959cf59e953a 100644
--- a/mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-resize.mlir
+++ b/mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-resize.mlir
@@ -218,13 +218,13 @@ func.func @resize_nearest_int(%arg0: tensor<1x15x13x1xi8>) -> () {
 
   // CHECK: %[[TEMP_Y:.*]] = arith.muli %[[Y]], %[[SCALE_Y_D]]
   // CHECK: %[[Y:.*]] = arith.addi %[[TEMP_Y]], %[[OFFSET_Y]]
-  // CHECK: %[[I_Y:.*]] = arith.divsi %[[Y]], %[[SCALE_Y_N]]
+  // CHECK: %[[I_Y:.*]] = arith.floordivsi %[[Y]], %[[SCALE_Y_N]]
   // CHECK: %[[TEMP_Y:.*]] = arith.muli %[[I_Y]], %[[SCALE_Y_N]]
   // CHECK: %[[D_Y:.*]] = arith.subi %[[Y]], %[[TEMP_Y]]
 
   // CHECK: %[[TEMP_X:.*]] = arith.muli %[[X]], %[[SCALE_X_D]]
   // CHECK: %[[X:.*]] = arith.addi %[[TEMP_X]], %[[OFFSET_X]]
-  // CHECK: %[[I_X:.*]] = arith.divsi %[[X]], %[[SCALE_X_N]]
+  // CHECK: %[[I_X:.*]] = arith.floordivsi %[[X]], %[[SCALE_X_N]]
   // CHECK: %[[TEMP_X:.*]] = arith.muli %[[I_X]], %[[SCALE_X_N]]
   // CHECK: %[[D_X:.*]] = arith.subi %[[X]], %[[TEMP_X]]
 
@@ -285,13 +285,13 @@ func.func @resize_bilinear_int(%arg0: tensor<1x19x20x1xi8>) {
 
   // CHECK: %[[TEMP_Y:.*]] = arith.muli %[[Y]], %[[SCALE_Y_D]]
   // CHECK: %[[Y:.*]] = arith.addi %[[TEMP_Y]], %[[OFFSET_Y]]
-  // CHECK: %[[I_Y:.*]] = arith.divsi %[[Y]], %[[SCALE_Y_N]]
+  // CHECK: %[[I_Y:.*]] = arith.floordivsi %[[Y]], %[[SCALE_Y_N]]
   // CHECK: %[[TEMP_Y:.*]] = arith.muli %[[I_Y]], %[[SCALE_Y_N]]
   // CHECK: %[[D_Y:.*]] = arith.subi %[[Y]], %[[TEMP_Y]]
 
   // CHECK: %[[TEMP_X:.*]] = arith.muli %[[X]], %[[SCALE_X_D]]
   // CHECK: %[[X:.*]] = arith.addi %[[TEMP_X]], %[[OFFSET_X]]
-  // CHECK: %[[I_X:.*]] = arith.divsi %[[X]], %[[SCALE_X_N]]
+  // CHECK: %[[I_X:.*]] = arith.floordivsi %[[X]], %[[SCALE_X_N]]
   // CHECK: %[[TEMP_X:.*]] = arith.muli %[[I_X]], %[[SCALE_X_N]]
   // CHECK: %[[D_X:.*]] = arith.subi %[[X]], %[[TEMP_X]]
 
@@ -605,3 +605,4 @@ func.func @skip_interpolate_bilinear_f32(%arg0 : tensor<3x1x2x7xf32>) -> tensor<
   // CHECK:  return %[[GENERIC]]
   return %resize : tensor<3x1x4x7xf32>
 }
+

llvmbot added mlir:linalg mlir mlir:tosa labels Apr 23, 2026

[mlir][tosa] Fix integer bilinear (quantized) tosa.resize lowering to use floordivsi #193821
HanchengWu wants to merge 1 commit into llvm:main from HanchengWu:upsampling2D
