This was referenced Dec 16, 2020
erman-gurses pushed a commit to erman-gurses/iree that referenced this pull request on May 6, 2022
Adding tuned Apple M1 config for SPIR-V kernels
Co-authored-by: nodlabs <[email protected]>
qedawkins pushed a commit to qedawkins/iree that referenced this pull request on Feb 10, 2023
qcolombet added a commit to qcolombet/iree that referenced this pull request on Mar 15, 2023
Add a pass to extract address computation from memref.load and nvgpu.ldmatrix.

Plumb the affine.apply decomposition through a new pass: decompose-affine-ops.

Rework the lowering pipeline to connect all the pieces together:
1. extract-address-computation turns address computation into subviews.
2. expand-strided-metadata turns subviews into affine.apply.
3. licm hoists the code introduced by step 2 into the right scf.for loop.
4. decompose-affine-ops breaks down the `affine.apply`s so that the resulting subexpressions can be hoisted into the right loops.
5. licm hoists the code introduced by step 4.
6. lower-affine materializes the decomposed `affine.apply`s. We do that early so that canonicalization does not undo this work.

Phases 3-5 need to run on `scf.for`, so the whole process has to run before scf to cf.

Missing bits:
- More comments
- Add tests
- Fix the subview sizes for non-unary loads (although it doesn't break anything, this is technically incorrect).
- LLVM reassociate undoes some of the things we improve here. Need to file a bug for that, investigate, and fix.

Note: extract-address-computation could be moved to LLVM open source, but we need to figure out where it could live since it depends on both memref and nvgpu. We probably want to come up with an interface like `isAddressComputationExtractable` to push it upstream.
qcolombet added a commit to qcolombet/iree that referenced this pull request on Mar 17, 2023
qcolombet added a commit to qcolombet/iree that referenced this pull request on Mar 21, 2023
Add a pass to extract address computation from memref.load and nvgpu.ldmatrix.

Plumb the affine.apply decomposition through a new pass: decompose-affine-ops.

Rework the lowering pipeline to connect all the pieces together:
1. extract-address-computation turns address computation into subviews.
2. expand-strided-metadata turns subviews into affine.apply.
3. licm hoists the code introduced by step 2 into the right scf.for loop.
4. decompose-affine-ops breaks down the `affine.apply`s so that the resulting subexpressions can be hoisted into the right loops.
5. licm hoists the code introduced by step 4.
6. lower-affine materializes the decomposed `affine.apply`s. We do that early so that canonicalization does not undo this work.

Phases 3-5 need to run on `scf.for`, so the whole process has to run before scf to cf.

TODO:
- Add support for memref.store, vector.transfer_xxx

Note: extract-address-computation could be moved to LLVM open source, but we need to figure out where it could live since it depends on both memref and nvgpu. We probably want to come up with an interface like `isAddressComputationExtractable` to push it upstream.
qcolombet added a commit to qcolombet/iree that referenced this pull request on Mar 24, 2023
qcolombet added a commit to qcolombet/iree that referenced this pull request on Mar 24, 2023
Add a pass to extract address computation from memref.load and nvgpu.ldmatrix.

Plumb the affine.apply decomposition through a new pass: decompose-affine-ops.

Rework the lowering pipeline to connect all the pieces together:
1. extract-address-computation turns address computation into subviews.
2. expand-strided-metadata turns subviews into affine.apply.
3. licm hoists the code introduced by step 2 into the right scf.for loop.
4. decompose-affine-ops breaks down the `affine.apply`s so that the resulting subexpressions can be hoisted into the right loops.
5. licm hoists the code introduced by step 4.
6. lower-affine materializes the decomposed `affine.apply`s. We do that early so that canonicalization does not undo this work.

Phases 3-5 need to run on `scf.for`, so the whole process has to run before scf to cf.

TODO:
- Add support for vector.transfer_xxx

Note: extract-address-computation could be moved to LLVM open source, but we need to figure out where it could live since it depends on both memref and nvgpu. We probably want to come up with an interface like `isAddressComputationExtractable` to push it upstream.
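The effect that the decompose-then-hoist steps in the commit message are after can be illustrated with a small Python model. This is only a sketch under assumptions, not IREE or MLIR code: the function names, the 2-D row-major layout, and the stride and offset values are all hypothetical. It compares recomputing the full affine address expression in the innermost loop against splitting it into subexpressions and hoisting the part that is invariant in the inner loop.

```python
def addresses_naive(rows, cols, stride0, stride1, offset):
    """Recompute the whole affine expression in the innermost loop."""
    addrs, muls = [], 0
    for i in range(rows):
        for j in range(cols):
            # Both products are evaluated on every inner iteration.
            addrs.append(offset + i * stride0 + j * stride1)
            muls += 2
    return addrs, muls


def addresses_hoisted(rows, cols, stride0, stride1, offset):
    """Split the expression and hoist the j-invariant part out of the j loop."""
    addrs, muls = [], 0
    for i in range(rows):
        # This subexpression only depends on i, so it is computed once per row.
        row_base = offset + i * stride0
        muls += 1
        for j in range(cols):
            addrs.append(row_base + j * stride1)
            muls += 1
    return addrs, muls


if __name__ == "__main__":
    # Hypothetical shape and strides chosen only for illustration.
    naive, naive_muls = addresses_naive(4, 8, 8, 1, 16)
    hoisted, hoisted_muls = addresses_hoisted(4, 8, 8, 1, 16)
    assert naive == hoisted  # same addresses, fewer multiplications
    print("multiplications:", naive_muls, "vs", hoisted_muls)
```

In this model the hoisted form yields the same addresses with fewer multiplications, which is roughly the benefit the commit message attributes to running licm after the `affine.apply`s have been broken into smaller subexpressions.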
ScottTodd added a commit that referenced this pull request on Aug 22, 2023
Caught by ASan:
```
370: =================================================================
370: ==3911909==ERROR: LeakSanitizer: detected memory leaks
370:
370: Direct leak of 376 byte(s) in 1 object(s) allocated from:
370:     #0 0x6a9b022 in calloc (iree-build/tools/iree-run-mlir+0x6a9b022)
370:     #1 0x6ad5d47 in iree_allocator_system_alloc iree/runtime/src/iree/base/allocator.c:104:17
370:     #2 0x6ad5d47 in iree_allocator_system_ctl iree/runtime/src/iree/base/allocator.c:144:14
370:     #3 0x6ad56ad in iree_allocator_issue_alloc iree/runtime/src/iree/base/allocator.c:27:10
370:     #4 0x6ad56ad in iree_allocator_malloc iree/runtime/src/iree/base/allocator.c:32:10
370:     #5 0x1acf2486 in iree_vm_bytecode_module_create iree/runtime/src/iree/vm/bytecode/module.c:836:3
370:     #6 0x6afdf31 in iree_tooling_create_run_context iree/runtime/src/iree/tooling/run_module.c:107:9
370:     #7 0x6afdf31 in iree_tooling_run_module_with_data iree/runtime/src/iree/tooling/run_module.c:340:3
370:     #8 0x6ad2a24 in iree::(anonymous namespace)::CompileAndRunFile(iree_compiler_session_t*, char const*) iree/tools/iree-run-mlir-main.cc:359:3
370:     #9 0x6ad2a24 in main iree/tools/iree-run-mlir-main.cc:520:20
370:     #10 0x7fce3bc456c9 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
```
stellaraccident pushed a commit that referenced this pull request on Sep 24, 2023
* Presently schedules for 7 hours after IREE's nightly release is cut (which should be ample time to build).
This was referenced Apr 1, 2025
This was referenced Apr 9, 2025
egebeysel pushed a commit to egebeysel/iree that referenced this pull request on Jul 1, 2025
One that seems to get on average better results with resnet
ziereis added a commit to ziereis/iree that referenced this pull request on Aug 5, 2025
lkeller-synaptics referenced this pull request in synaptics-torq/iree on Oct 22, 2025