
iree-3.10.0rc20260131

[LinalgExt] Add OuterReduction tiling strategy for ArgCompareOp (iree-org#23102)

This PR extends ArgCompareOp's PartialReductionOpInterface to support
the OuterReduction tiling strategy in addition to the existing
OuterParallel (Split-Reduction) strategy.

## Example
For `arg_compare` on `tensor<64x4096xf32>` with reduction `dim=1` and
`tile_size=128`:

**OuterParallel** (existing) - each chunk writes to a separate slot:
```mlir
// Partial results: tensor<64x32xf32>, tensor<64x32xi32> (32 chunks)
%results:2 = scf.forall (%chunk_idx) = (0) to (4096) step (128)
    shared_outs(%val = %init_val, %idx = %init_idx) {
  %slice = tensor.extract_slice %input[0, %chunk_idx] [64, 128] [1, 1]
  %partial:2 = iree_linalg_ext.arg_compare dim(1) ins(%slice) outs(...)
  scf.forall.in_parallel {
    tensor.parallel_insert_slice %partial#0 into %val[0, %chunk_idx] [64, 1]
    tensor.parallel_insert_slice %partial#1 into %idx[0, %chunk_idx] [64, 1]
  }
}
%final:2 = linalg.reduce ins(%results#0, %results#1) dims=[1]
```
**OuterReduction** (this PR) - accumulates in place each iteration:
```mlir
// Partial results: tensor<64x128xf32>, tensor<64x128xi32> (tile shape)
%results:2 = scf.for %iv = 0 to 4096 step 128
    iter_args(%val = %init_val, %idx = %init_idx) {
  %slice = tensor.extract_slice %input[0, %iv] [64, 128] [1, 1]
  %updated:2 = linalg.generic ins(%slice) outs(%val, %idx) {
    ^bb0(%new: f32, %acc_val: f32, %acc_idx: i32):
      %global_idx = arith.addi %iv, %local_idx  // track position
      %cmp = arith.cmpf ogt, %new, %acc_val
      %sel_val = arith.select %cmp, %new, %acc_val
      %sel_idx = arith.select %cmp, %global_idx, %acc_idx
      linalg.yield %sel_val, %sel_idx
  }
  scf.yield %updated#0, %updated#1
}
%final:2 = linalg.reduce ins(%results#0, %results#1) dims=[1]
```
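
The difference between the two strategies can also be modeled outside MLIR. The following is a minimal pure-Python sketch, not the IREE lowering: the helper names (`merge`, `outer_parallel`, `outer_reduction`) are hypothetical, the comparator is fixed to "greater than" to mirror `arith.cmpf ogt`, the row length is assumed to be a multiple of the tile size, and the final merge breaks ties toward the smaller index, which is an assumption of this sketch rather than something the PR text states.

```python
def merge(partials):
    """Final reduce over (value, index) pairs.

    Assumption of this sketch: ties go to the smaller global index.
    """
    best_val, best_idx = partials[0]
    for val, idx in partials[1:]:
        if val > best_val or (val == best_val and idx < best_idx):
            best_val, best_idx = val, idx
    return best_val, best_idx


def outer_parallel(row, tile):
    """One partial result per chunk: len(row) // tile slots in total."""
    partials = []
    for start in range(0, len(row), tile):
        chunk = row[start:start + tile]
        # Reduce the chunk on its own, then convert to a global index.
        local_idx = max(range(len(chunk)), key=lambda i: (chunk[i], -i))
        partials.append((chunk[local_idx], start + local_idx))
    return merge(partials)


def outer_reduction(row, tile):
    """A tile-shaped accumulator updated in place on every iteration."""
    # Seed each lane from the first tile, then walk the remaining tiles.
    acc = [(row[lane], lane) for lane in range(tile)]
    for start in range(tile, len(row), tile):
        for lane in range(tile):
            new = row[start + lane]
            if new > acc[lane][0]:
                acc[lane] = (new, start + lane)  # track global position
    return merge(acc)


row = [3, 9, 1, 9, 4, 7, 9, 2]
parallel_result = outer_parallel(row, tile=4)
reduction_result = outer_reduction(row, tile=4)
```

In this model the partial result has one slot per chunk under `outer_parallel` and one slot per tile lane under `outer_reduction`, which matches the `64x32` versus `64x128` partial shapes in the comments above.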

This is a necessary step toward plumbing ArgCompare through the
VectorDistribute pipeline.

Issue: iree-org#23005

---------

Signed-off-by: Bangtian Liu <[email protected]>

Jan 31, 2026
0d71d25

iree-3.10.0rc20260130

iree-bazel-* improvements for handling multiple targets + options. (iree-org#23330)

iree-bazel-try:
- supports --features, useful for --features=thin_lto
- workaround for thin_lto + Wno-unused-command-line-argument
- --copt and --linkopt are applied in a way that ensures the entire build
  is configured with the options (previously only the try target was,
  which was useful in isolated testing but not when benchmarking/etc)
- uses a new output base (so features/copt/linkopt don't pollute the
  normal build, good for concurrent try + build/test)
- fixed caching of files passed in by path
- fixed files passed in by path to have original source locations

iree-bazel-test/build/fuzz:
- support multiple targets (`iree-bazel-test //:a //:b`)
- this allows multiple fuzzers to run in batched mode

iree-bazel-cquery:
- added to match iree-bazel-query so we have the pair

misc:
- `target_compatible_with`/platform `select` support in bazel-to-cmake
- fixing benchmark warnings about missing unit
- fixed a bug in cc_benchmark dropping extra args on benchmark tests

---------

Co-authored-by: Claude <[email protected]>

Jan 30, 2026
3824de7

iree-3.10.0rc20260129

[GPU] MmaSchedule configuration crashes when lacking PerfTflops (iree-org#23303)

getPerfTflops may return a null dictionary. In these cases we should
treat it as empty.
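
The "null means empty" handling described above can be sketched as follows. This is a pure-Python illustration only: `lookup_peak_tflops`, its parameters, and the `"f16"` key are hypothetical and do not reflect the actual C++ code or attribute layout in IREE.

```python
def lookup_peak_tflops(perf_tflops, element_type):
    """Look up a peak-TFLOPs entry, tolerating a missing table.

    `perf_tflops` is either a dict or None (hypothetical stand-in for a
    possibly null dictionary). None is treated as an empty table.
    """
    table = perf_tflops if perf_tflops is not None else {}
    # A missing entry yields None instead of raising.
    return table.get(element_type)


missing_table = lookup_peak_tflops(None, "f16")
present_entry = lookup_peak_tflops({"f16": 100.0}, "f16")
```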

Signed-off-by: Rob Suderman <[email protected]>

Jan 29, 2026
e968799

iree-3.10.0rc20260128

[Codegen] Use safer hoisting in OptimizeTensorInsertExtractSlices (iree-org#23280)

Use the `moveLoopInvariantCodeFromGuaranteedLoops` transform instead of
the `moveLoopInvariantCode` transform in the
OptimizeTensorInsertExtractSlices pass. This transform is safer, because
it validates that loops will be executed at least once before hoisting
loop invariant code. Hoisting from loops that may not execute is not an
optimization, so this is a better version of the transformation.
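
To see why the trip count matters, here is a small pure-Python model. It is not the MLIR transform: the function names are hypothetical, and a bound of `None` stands in for a value that is unknown at compile time.

```python
def trip_count(lower, upper, step):
    """Number of iterations of `for i in range(lower, upper, step)`, step > 0."""
    return max(0, -(-(upper - lower) // step))  # ceiling division, clamped at 0


def provably_executes(lower, upper, step):
    """True only when all bounds are known and give at least one iteration."""
    if lower is None or upper is None or step is None:
        return False  # unknown bounds: the loop may not run at all
    return step > 0 and lower < upper


def invariant_evaluations(lower, upper, step, hoisted):
    """How many times a loop-invariant op runs, with or without hoisting."""
    return 1 if hoisted else trip_count(lower, upper, step)


# A loop that runs 32 times: hoisting saves 31 evaluations.
saved = invariant_evaluations(0, 4096, 128, hoisted=False) - \
    invariant_evaluations(0, 4096, 128, hoisted=True)

# A zero-trip loop: hoisting adds work that would never have run.
added = invariant_evaluations(0, 0, 1, hoisted=True) - \
    invariant_evaluations(0, 0, 1, hoisted=False)
```

Under this model a hoist is only applied when `provably_executes` returns true, so the zero-trip case never pays for an operation it would otherwise have skipped.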

The new safer transform also hoists from linalg.generic ops, so the
`moveLoopInvariantCodeFromGenericOps` transform is removed, since it is
no longer used.

This PR also removes the
`_batch_matmul_narrow_n_2_dispatch_4_unpack_i32` test, which was doing
nothing but checking that a tensor.empty op gets hoisted from an scf.for
loop (which cannot be guaranteed to execute). Hoisting empty tensors is
not the job of this pass, and the test is verbose, so the test is simply
removed.

Signed-off-by: Max Dawkins <[email protected]>

Jan 27, 2026
789859e

iree-3.10.0rc20260127

[e2e] Increase test timeout for gfx1250 (iree-org#23286)

Expose timeout as an optional argument in `iree_native_test` in matmul
tests. Regression tests already know how to translate the bazel timeout
parameter to seconds.

Assisted-by: claude

Jan 26, 2026
a413305

iree-3.10.0rc20260126

[NFC] Make status test macros take ownership of iree_status_t. (iree-org#23276)

Adds `ConsumeForTest` overloads that wrap raw `iree_status_t` in
`iree::Status` RAII wrappers, ensuring automatic cleanup on test
failure. The macros `IREE_EXPECT_OK`, `IREE_ASSERT_OK`,
`IREE_EXPECT_STATUS_IS`, and `IREE_ASSERT_STATUS_IS` now consume the
status they test.

For lvalue status variables, the source is cleared to a code-only value
so any existing `iree_status_ignore`/`iree_status_free` calls become
harmless no-ops. This allows incremental migration without breaking
existing tests.
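
The ownership hand-off described above can be modeled in a few lines. This is a language-neutral toy in Python, not the IREE C/C++ API: the `Status` class, `free_status`, and this `consume_for_test` are hypothetical stand-ins, and the "payload" represents whatever heap storage a failing status carries.

```python
class Status:
    """Toy stand-in for a status value: a code plus an optional heap payload."""

    def __init__(self, code, payload=None):
        self.code = code
        self.payload = payload


def free_status(status, freed_log):
    """Release the payload once; on a code-only status this is a no-op."""
    if status.payload is not None:
        freed_log.append(status.payload)
        status.payload = None


def consume_for_test(status, freed_log):
    """Take ownership of the payload, check the code, and always clean up."""
    owned = Status(status.code, status.payload)
    status.payload = None  # the caller's variable is now code-only
    try:
        return owned.code == "OK"
    finally:
        free_status(owned, freed_log)  # runs whether the check passes or fails


freed = []
failing = Status("INVALID_ARGUMENT", payload="bad shape")
passed = consume_for_test(failing, freed)
free_status(failing, freed)  # legacy cleanup call: harmless, frees nothing
```

After the check, the payload has been released exactly once and the caller's variable still reports its code, which is why leftover cleanup calls in unmigrated tests do no harm in this model.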

Most tests (outside of tokenizer) have been updated. tokenizer is being
reworked and the next feature branch merge will adopt this behavior.

---------

Co-authored-by: Claude <[email protected]>

Jan 25, 2026
1d89835

iree-3.10.0rc20260125

[CPU][NFC] Fix incorrect mmt4d dimension names in comments. (iree-org#23234)

The comments in KernelDispatch.cpp had the mmt4d dimension naming
backwards. The six dimensions are M1, N1, K1, M0, N0, K0.

- Result shape: BxM0xN0xM1xN1 → BxM1xN1xM0xN0
- getMmt4dInnerTileSizes returns M0/N0, not M1/N1
- Iteration domain: m0, n0, k0, m1, n1, k1 → M1, N1, K1, M0, N0, K0
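
As a concrete reading of the corrected naming (outer tile counts M1/N1/K1, inner tile sizes M0/N0/K0), the sketch below computes the packed shapes for a plain matmul. It is illustrative only: `mmt4d_shapes` is a hypothetical helper, rounding the outer dimensions up with ceiling division assumes padding to whole tiles, and the operand layouts are assumed to follow the upstream `linalg.mmt4d` convention. A batch dimension B, where present, would be prepended to each shape.

```python
def ceil_div(a, b):
    """Integer ceiling division for positive b."""
    return -(-a // b)


def mmt4d_shapes(m, n, k, m0, n0, k0):
    """Packed shapes for an MxK by KxN matmul with inner tiles M0, N0, K0."""
    # Outer dimensions count tiles; inner dimensions are the tile sizes.
    m1, n1, k1 = ceil_div(m, m0), ceil_div(n, n0), ceil_div(k, k0)
    return {
        "lhs": (m1, k1, m0, k0),
        "rhs": (n1, k1, n0, k0),
        "result": (m1, n1, m0, n0),
        "iteration_domain": ("M1", "N1", "K1", "M0", "N0", "K0"),
    }


shapes = mmt4d_shapes(m=100, n=64, k=32, m0=8, n0=16, k0=4)
```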

Signed-off-by: hanhanW <[email protected]>

Jan 24, 2026
1a912be

iree-3.10.0rc20260124

[CPU][NFC] Fix incorrect mmt4d dimension names in comments. (iree-org#23234)

The comments in KernelDispatch.cpp had the mmt4d dimension naming
backwards. The six dimensions are M1, N1, K1, M0, N0, K0.

- Result shape: BxM0xN0xM1xN1 → BxM1xN1xM0xN0
- getMmt4dInnerTileSizes returns M0/N0, not M1/N1
- Iteration domain: m0, n0, k0, m1, n1, k1 → M1, N1, K1, M0, N0, K0

Signed-off-by: hanhanW <[email protected]>

Jan 24, 2026
1a912be

iree-3.10.0rc20260123

Integrate LLVM@5c35af8f1e6ebc7c32 (iree-org#23252)

Reverts carried forward:
* Local revert of llvm/llvm-project#169614 due
to iree-org#22649

Other changes:
* Fixes lit tests to account for
llvm/llvm-project#174452

Jan 22, 2026
818f45f

iree-3.10.0rc20260122

Reapply "LLVM Integrate@6cc18a8e4338 (iree-org#23226)" (iree-org#23236)

This reverts commit 8ca6c8f13398c5bbe961e9bc874d6b3de398e5e8.

Also uses the new `visitNonControlFlowArguments` API, available since
llvm/llvm-project#175815

Jan 22, 2026
5aa6453

Tags: xgupta/iree