Tags: xgupta/iree

iree-3.10.0rc20260131

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
[LinalgExt] Add OuterReduction tiling strategy for ArgCompareOp (iree-org#23102)

This PR extends ArgCompareOp's PartialReductionOpInterface to support
the OuterReduction tiling strategy in addition to the existing
OuterParallel (Split-Reduction) strategy.

## Example
For `arg_compare` on `tensor<64x4096xf32>` with reduction `dim=1` and
`tile_size=128`:

**OuterParallel** (existing) - each chunk writes to a separate slot:
```mlir
// Partial results: tensor<64x32xf32>, tensor<64x32xi32> (32 chunks)
%results:2 = scf.forall (%chunk_idx) = (0) to (4096) step (128)
    shared_outs(%val = %init_val, %idx = %init_idx) {
  %slice = tensor.extract_slice %input[0, %chunk_idx] [64, 128] [1, 1]
  %partial:2 = iree_linalg_ext.arg_compare dim(1) ins(%slice) outs(...)
  scf.forall.in_parallel {
    tensor.parallel_insert_slice %partial#0 into %val[0, %chunk_idx] [64, 1]
    tensor.parallel_insert_slice %partial#1 into %idx[0, %chunk_idx] [64, 1]
  }
}
%final:2 = linalg.reduce ins(%results#0, %results#1) dims=[1]
```
**OuterReduction** (this PR) - accumulates in place each iteration:
```mlir
// Partial results: tensor<64x128xf32>, tensor<64x128xi32> (tile shape)
%results:2 = scf.for %iv = 0 to 4096 step 128
    iter_args(%val = %init_val, %idx = %init_idx) {
  %slice = tensor.extract_slice %input[0, %iv] [64, 128] [1, 1]
  %updated:2 = linalg.generic ins(%slice) outs(%val, %idx) {
    ^bb0(%new: f32, %acc_val: f32, %acc_idx: i32):
      %global_idx = arith.addi %iv, %local_idx  // track position
      %cmp = arith.cmpf ogt, %new, %acc_val
      %sel_val = arith.select %cmp, %new, %acc_val
      %sel_idx = arith.select %cmp, %global_idx, %acc_idx
      linalg.yield %sel_val, %sel_idx
  }
  scf.yield %updated#0, %updated#1
}
%final:2 = linalg.reduce ins(%results#0, %results#1) dims=[1]
```
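
As a rough illustration of the two strategies above, the following Python sketch models both on plain lists and checks that they agree with a direct row-wise arg-max. It is not IREE code: the function names, the tie-breaking rule in `merge` (smaller index wins), seeding the accumulator from the first tile, and the requirement that the row length be a multiple of the tile size are all assumptions made for the sketch.

```python
def arg_compare_reference(rows):
    """Row-wise arg-max using a strict '>' comparison, so the first maximum wins."""
    out = []
    for row in rows:
        best_val, best_idx = row[0], 0
        for i, v in enumerate(row):
            if v > best_val:
                best_val, best_idx = v, i
        out.append((best_val, best_idx))
    return out


def merge(partials):
    """Final reduction over (value, index) pairs.

    Assumption for this sketch: on equal values the smaller index wins.
    """
    best_val, best_idx = partials[0]
    for v, i in partials[1:]:
        if v > best_val or (v == best_val and i < best_idx):
            best_val, best_idx = v, i
    return best_val, best_idx


def outer_parallel(rows, tile):
    """Each chunk is reduced independently and written to its own slot."""
    out = []
    for row in rows:
        slots = []
        for start in range(0, len(row), tile):
            chunk = row[start:start + tile]
            # Largest value in the chunk; ties go to the smallest local index.
            local = max(range(len(chunk)), key=lambda j: (chunk[j], -j))
            slots.append((chunk[local], start + local))
        out.append(merge(slots))
    return out


def outer_reduction(rows, tile):
    """A tile-shaped accumulator is updated elementwise on every iteration."""
    out = []
    for row in rows:
        # Assumption: seed the accumulator from the first tile.
        acc = [(row[lane], lane) for lane in range(tile)]
        for start in range(tile, len(row), tile):
            for lane in range(tile):
                new = row[start + lane]
                if new > acc[lane][0]:
                    acc[lane] = (new, start + lane)  # track the global position
        out.append(merge(acc))
    return out
```

In this model the OuterParallel intermediate has one slot per chunk, while the OuterReduction intermediate has one slot per lane of the tile, mirroring the `64x32` versus `64x128` partial-result shapes in the example.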

This is a necessary step toward plumbing ArgCompare through the
VectorDistribute pipeline.

Issue: iree-org#23005

---------

Signed-off-by: Bangtian Liu <[email protected]>

iree-3.10.0rc20260130

iree-bazel-* improvements for handling multiple targets + options. (iree-org#23330)

iree-bazel-try:
- supports --features, useful for --features=thin_lto
- workaround for thin_lto + Wno-unused-command-line-argument
- passes --copt and --linkopt so that the entire build is configured
  with the options (previously only the try target was, which was
  useful in isolated testing but not when benchmarking, etc.)
- uses a new output base (so features/copt/linkopt don't pollute the
  normal build, good for concurrent try + build/test)
- fixed caching of files passed in by path
- fixed files passed in by path to have original source locations

iree-bazel-test/build/fuzz:
- support multiple targets (`iree-bazel-test //:a //:b`)
- this allows multiple fuzzers to run in batched mode

iree-bazel-cquery:
- added to match iree-bazel-query so we have the pair

misc:
- `target_compatible_with`/platform `select` support in bazel-to-cmake
- fixing benchmark warnings about missing unit
- fixed a bug in cc_benchmark dropping extra args on benchmark tests

---------

Co-authored-by: Claude <[email protected]>

iree-3.10.0rc20260129

[GPU] MmaSchedule configuration crashes when lacking PerfTflops (iree-org#23303)

getPerfTflops may return a null dictionary. In these cases we should
treat it as empty.
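
The "treat null as empty" pattern can be sketched in Python as follows. This only models the idea; `peak_tflops` and its parameters are hypothetical names, and the actual fix operates on an MLIR dictionary attribute rather than a Python `dict`.

```python
def peak_tflops(perf_tflops, compute_type, default=None):
    """Look up a peak-TFLOPS entry, treating a missing table as an empty one.

    `perf_tflops` may be None (modelling a null dictionary) or a dict.
    """
    table = perf_tflops if perf_tflops is not None else {}
    return table.get(compute_type, default)
```

With this shape, callers no longer need a separate null check before the lookup: a null table and an empty table both yield the default.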

Signed-off-by: Rob Suderman <[email protected]>

iree-3.10.0rc20260128

[Codegen] Use safer hoisting in OptimizeTensorInsertExtractSlices (iree-org#23280)

Use the `moveLoopInvariantCodeFromGuaranteedLoops` transform instead of
the `moveLoopInvariantCode` transform in the
OptimizeTensorInsertExtractSlices pass. This transform is safer, because
it validates that loops will be executed at least once before hoisting
loop invariant code. Hoisting from loops that may not execute is not an
optimization, so this is a better version of the transformation.
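
To make the argument concrete, here is a small Python model (not the MLIR transform itself) that counts how often a loop-invariant computation runs. All names are hypothetical, and `known_at_compile_time` stands in for whatever analysis proves that the loop executes at least once.

```python
class Counter:
    """Counts how often the loop-invariant computation is evaluated."""

    def __init__(self):
        self.calls = 0

    def invariant(self):
        self.calls += 1
        return 7


def run_unhoisted(counter, trip_count):
    total = 0
    for _ in range(trip_count):
        total += counter.invariant()  # recomputed on every iteration
    return total


def run_hoisted(counter, trip_count):
    value = counter.invariant()  # runs even when trip_count == 0
    total = 0
    for _ in range(trip_count):
        total += value
    return total


def run_guarded(counter, trip_count, known_at_compile_time):
    """Hoist only when the loop is proven to execute at least once."""
    if known_at_compile_time and trip_count > 0:
        return run_hoisted(counter, trip_count)
    return run_unhoisted(counter, trip_count)
```

For a zero-trip loop, unconditional hoisting evaluates the invariant once where the original evaluated it zero times, which is strictly more work. The guarded variant never does more work than the original loop.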

The new, safer transform also hoists from linalg.generic ops, so the
`moveLoopInvariantCodeFromGenericOps` transform is removed, since it is
no longer used.

This PR also removes the
`_batch_matmul_narrow_n_2_dispatch_4_unpack_i32` test, which was doing
nothing but checking that a tensor.empty op gets hoisted from an scf.for
loop (which cannot be guaranteed to execute). Hoisting empty tensors is
not the job of this pass, and the test is verbose, so the test is simply
removed.

Signed-off-by: Max Dawkins <[email protected]>

iree-3.10.0rc20260127

[e2e] Increase test timeout for gfx1250 (iree-org#23286)

Expose timeout as an optional argument in `iree_native_test` in matmul
tests. Regression tests already know how to translate the bazel timeout
parameter to seconds.
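
For reference, translating a Bazel timeout label to seconds might look like the sketch below. The four labels and their durations are Bazel's documented test timeouts; the function name and the `moderate` fallback are assumptions, and the real helper in the IREE build files may be shaped differently.

```python
# Bazel's documented test timeout labels and their durations in seconds.
BAZEL_TIMEOUT_SECONDS = {
    "short": 60,
    "moderate": 300,
    "long": 900,
    "eternal": 3600,
}


def timeout_to_seconds(timeout=None, default="moderate"):
    """Translate an optional Bazel timeout label to seconds.

    Assumption: an unset timeout falls back to `default`.
    """
    label = default if timeout is None else timeout
    if label not in BAZEL_TIMEOUT_SECONDS:
        raise ValueError(f"unknown Bazel timeout label: {label!r}")
    return BAZEL_TIMEOUT_SECONDS[label]
```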

Assisted-by: claude

iree-3.10.0rc20260126

[NFC] Make status test macros take ownership of iree_status_t. (iree-org#23276)

Adds `ConsumeForTest` overloads that wrap raw `iree_status_t` in
`iree::Status` RAII wrappers, ensuring automatic cleanup on test
failure. The macros `IREE_EXPECT_OK`, `IREE_ASSERT_OK`,
`IREE_EXPECT_STATUS_IS`, and `IREE_ASSERT_STATUS_IS` now consume the
status they test.

For lvalue status variables, the source is cleared to a code-only value
so any existing `iree_status_ignore`/`iree_status_free` calls become
harmless no-ops. This allows incremental migration without breaking
existing tests.
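
The lvalue behaviour can be pictured with a toy Python model. This is not the C/C++ implementation: `Heap`, `RawStatus`, `OwnedStatus`, `consume_for_test`, and `status_free` are hypothetical stand-ins, and a context manager plays the role of the RAII wrapper.

```python
class Heap:
    """Tracks live status payloads so leaks are observable."""

    def __init__(self):
        self.live = 0

    def alloc(self, message):
        self.live += 1
        return {"message": message}

    def release(self, payload):
        self.live -= 1


class RawStatus:
    """Stand-in for a raw status: a code plus an optional heap payload."""

    def __init__(self, code, payload=None):
        self.code = code
        self.payload = payload


class OwnedStatus:
    """Stand-in for the owning wrapper; releases the payload on scope exit."""

    def __init__(self, heap, code, payload):
        self.heap, self.code, self.payload = heap, code, payload

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        if self.payload is not None:
            self.heap.release(self.payload)
            self.payload = None
        return False


def consume_for_test(heap, status):
    """Move the payload into an owner, leaving `status` code-only."""
    owned = OwnedStatus(heap, status.code, status.payload)
    status.payload = None
    return owned


def status_free(heap, status):
    """Stand-in for freeing a raw status; a no-op for code-only values."""
    if status.payload is not None:
        heap.release(status.payload)
        status.payload = None
```

After `consume_for_test`, the source keeps its code but owns nothing, so a later `status_free` on it neither leaks nor double-frees.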

Most tests (outside of tokenizer) have been updated. tokenizer is being
reworked and the next feature branch merge will adopt this behavior.

---------

Co-authored-by: Claude <[email protected]>

iree-3.10.0rc20260125

[CPU][NFC] Fix incorrect mmt4d dimension names in comments. (iree-org#23234)

The comments in KernelDispatch.cpp had the mmt4d dimension naming
backwards. The six dimensions are M1, N1, K1, M0, N0, K0.

- Result shape: BxM0xN0xM1xN1 → BxM1xN1xM0xN0
- getMmt4dInnerTileSizes returns M0/N0, not M1/N1
- Iteration domain: m0, n0, k0, m1, n1, k1 → M1, N1, K1, M0, N0, K0
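
A plain-Python reference for the non-batch case may help keep the naming straight: outer tile counts are M1/N1/K1 and inner tile sizes are M0/N0/K0. The `pack` helper and the use of nested lists are assumptions made for this sketch; it follows the usual mmt4d convention where the RHS is stored transposed.

```python
def pack(mat, r0, c0):
    """Tile a 2-D list into [R1][C1][r0][c0]; dimensions must divide evenly."""
    rows, cols = len(mat), len(mat[0])
    return [[[[mat[r1 * r0 + i][c1 * c0 + j] for j in range(c0)]
              for i in range(r0)]
             for c1 in range(cols // c0)]
            for r1 in range(rows // r0)]


def mmt4d(lhs, rhs):
    """lhs: M1 x K1 x M0 x K0, rhs: N1 x K1 x N0 x K0 -> acc: M1 x N1 x M0 x N0."""
    M1, K1, M0, K0 = len(lhs), len(lhs[0]), len(lhs[0][0]), len(lhs[0][0][0])
    N1, N0 = len(rhs), len(rhs[0][0])
    acc = [[[[0] * N0 for _ in range(M0)] for _ in range(N1)] for _ in range(M1)]
    # Iteration domain in the corrected order: M1, N1, K1, M0, N0, K0.
    for m1 in range(M1):
        for n1 in range(N1):
            for k1 in range(K1):
                for m0 in range(M0):
                    for n0 in range(N0):
                        for k0 in range(K0):
                            acc[m1][n1][m0][n0] += (
                                lhs[m1][k1][m0][k0] * rhs[n1][k1][n0][k0])
    return acc
```

Element `(i, j)` of the unpacked result lives at `acc[i // M0][j // N0][i % M0][j % N0]`, which is why the inner tile sizes are the 0-suffixed dimensions.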

Signed-off-by: hanhanW <[email protected]>

iree-3.10.0rc20260124

[CPU][NFC] Fix incorrect mmt4d dimension names in comments. (iree-org#23234)

The comments in KernelDispatch.cpp had the mmt4d dimension naming
backwards. The six dimensions are M1, N1, K1, M0, N0, K0.

- Result shape: BxM0xN0xM1xN1 → BxM1xN1xM0xN0
- getMmt4dInnerTileSizes returns M0/N0, not M1/N1
- Iteration domain: m0, n0, k0, m1, n1, k1 → M1, N1, K1, M0, N0, K0

Signed-off-by: hanhanW <[email protected]>

iree-3.10.0rc20260123

Integrate LLVM@5c35af8f1e6ebc7c32 (iree-org#23252)

Reverts carried forward:
* Local revert of llvm/llvm-project#169614 due
to iree-org#22649

Other changes:
* Fixes lit tests to account for
llvm/llvm-project#174452

iree-3.10.0rc20260122

Reapply "LLVM Integrate@6cc18a8e4338 (iree-org#23226)" (iree-org#23236)

This reverts commit 8ca6c8f13398c5bbe961e9bc874d6b3de398e5e8.

Also uses the new `visitNonControlFlowArguments` API, available since
llvm/llvm-project#175815.