Fix a few typos in SPIRV translation comments. #4

Closed
ScottTodd wants to merge 1 commit into master from typo-fix

Conversation

@ScottTodd
Member

No description provided.

ScottTodd deleted the typo-fix branch September 25, 2019 22:22
This was referenced Dec 16, 2020
not-jenni mentioned this pull request Feb 4, 2021
GMNGeoffrey mentioned this pull request Feb 5, 2021
erman-gurses pushed a commit to erman-gurses/iree that referenced this pull request May 6, 2022
- Adding tuned Apple M1 config for SPIRV kernels

Co-authored-by: nodlabs <[email protected]>
@iree-github-actions-bot
Contributor

iree-github-actions-bot commented Feb 9, 2023

[misconfigured testing infra]

qedawkins pushed a commit to qedawkins/iree that referenced this pull request Feb 10, 2023
qcolombet added a commit to qcolombet/iree that referenced this pull request Mar 15, 2023
Add a pass to extract address computation from memref.load and nvgpu.ldmatrix.
Plumb the affine.apply decomposition through a new pass: decompose-affine-ops.
Rework the lowering pipeline to connect all the pieces together:
1. extract-address-computation turns address computation into subviews
2. expand-strided-metadata turns subviews into affine.apply
3. licm hoists the code introduced by step 2 into the right scf.for loop
4. decompose-affine-ops breaks down the `affine.apply`s so that the resulting
   subexpressions can be hoisted into the right loops.
5. licm hoists the code introduced by step 4
6. lower-affine materializes the decomposed `affine.apply`s. We do that early
   to keep canonicalization from undoing this work.

Phases 3-5 need to run on `scf.for`, so the whole process has to run before
the scf-to-cf conversion.

Missing bits:
- More comments
- Add tests
- Fix the subview sizes for non-unary loads (although it doesn't break
  anything, this is technically incorrect.)
- LLVM's reassociate pass undoes some of the things we improve here. Need to
  file a bug for that, investigate and fix.

Note: extract-address-computation could be moved upstream to LLVM,
but we need to figure out where it could live since it depends on both
memref and nvgpu. We probably want to come up with an interface like
`isAddressComputationExtractable` to push it upstream.
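
To make the effect of steps 3-5 concrete, here is a minimal sketch in plain Python, not MLIR and not the actual IREE passes. It assumes a 2-D strided access whose address is `base + i * row_stride + j * col_stride`; the function names and the multiplication counter are hypothetical and exist only for illustration. Once the address expression is split into subexpressions, the part that depends only on the outer induction variable can be hoisted out of the inner loop.

```python
def naive_addresses(base, rows, cols, row_stride, col_stride):
    """Recompute the whole address expression on every inner iteration."""
    addrs = []
    mults = 0  # number of multiplications performed (illustrative metric)
    for i in range(rows):
        for j in range(cols):
            addrs.append(base + i * row_stride + j * col_stride)
            mults += 2
    return addrs, mults


def hoisted_addresses(base, rows, cols, row_stride, col_stride):
    """Same addresses, with the i-dependent subexpression hoisted."""
    addrs = []
    mults = 0
    for i in range(rows):
        # Invariant with respect to j, so it is computed once per outer
        # iteration instead of once per inner iteration.
        row_offset = base + i * row_stride
        mults += 1
        for j in range(cols):
            addrs.append(row_offset + j * col_stride)
            mults += 1
    return addrs, mults
```

Both functions return the same address list; only the amount of work inside the inner loop differs. This is the kind of saving the decompose-then-hoist sequence is after, under the assumptions stated above.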
qcolombet added a commit to qcolombet/iree that referenced this pull request Mar 17, 2023
qcolombet added a commit to qcolombet/iree that referenced this pull request Mar 21, 2023
Add a pass to extract address computation from memref.load and nvgpu.ldmatrix.
Plumb the affine.apply decomposition through a new pass: decompose-affine-ops.
Rework the lowering pipeline to connect all the pieces together:
1. extract-address-computation turns address computation into subviews
2. expand-strided-metadata turns subviews into affine.apply
3. licm hoists the code introduced by step 2 into the right scf.for loop
4. decompose-affine-ops breaks down the `affine.apply`s so that the resulting
   subexpressions can be hoisted into the right loops.
5. licm hoists the code introduced by step 4
6. lower-affine materializes the decomposed `affine.apply`s. We do that early
   to keep canonicalization from undoing this work.

Phases 3-5 need to run on `scf.for`, so the whole process has to run before
the scf-to-cf conversion.

TODO:
- Add support for memref.store, vector.transfer_xxx

Note: extract-address-computation could be moved upstream to LLVM,
but we need to figure out where it could live since it depends on both
memref and nvgpu. We probably want to come up with an interface like
`isAddressComputationExtractable` to push it upstream.
qcolombet added a commit to qcolombet/iree that referenced this pull request Mar 24, 2023
qcolombet added a commit to qcolombet/iree that referenced this pull request Mar 24, 2023
Add a pass to extract address computation from memref.load and nvgpu.ldmatrix.
Plumb the affine.apply decomposition through a new pass: decompose-affine-ops.
Rework the lowering pipeline to connect all the pieces together:
1. extract-address-computation turns address computation into subviews
2. expand-strided-metadata turns subviews into affine.apply
3. licm hoists the code introduced by step 2 into the right scf.for loop
4. decompose-affine-ops breaks down the `affine.apply`s so that the resulting
   subexpressions can be hoisted into the right loops.
5. licm hoists the code introduced by step 4
6. lower-affine materializes the decomposed `affine.apply`s. We do that early
   to keep canonicalization from undoing this work.

Phases 3-5 need to run on `scf.for`, so the whole process has to run before
the scf-to-cf conversion.

TODO:
- Add support for vector.transfer_xxx

Note: extract-address-computation could be moved upstream to LLVM,
but we need to figure out where it could live since it depends on both
memref and nvgpu. We probably want to come up with an interface like
`isAddressComputationExtractable` to push it upstream.
ScottTodd added a commit that referenced this pull request Aug 22, 2023
Caught by ASan:

```
370: =================================================================
370: ==3911909==ERROR: LeakSanitizer: detected memory leaks
370: 
370: Direct leak of 376 byte(s) in 1 object(s) allocated from:
370:     #0 0x6a9b022 in calloc (iree-build/tools/iree-run-mlir+0x6a9b022)
370:     #1 0x6ad5d47 in iree_allocator_system_alloc iree/runtime/src/iree/base/allocator.c:104:17
370:     #2 0x6ad5d47 in iree_allocator_system_ctl iree/runtime/src/iree/base/allocator.c:144:14
370:     #3 0x6ad56ad in iree_allocator_issue_alloc iree/runtime/src/iree/base/allocator.c:27:10
370:     #4 0x6ad56ad in iree_allocator_malloc iree/runtime/src/iree/base/allocator.c:32:10
370:     #5 0x1acf2486 in iree_vm_bytecode_module_create iree/runtime/src/iree/vm/bytecode/module.c:836:3
370:     #6 0x6afdf31 in iree_tooling_create_run_context iree/runtime/src/iree/tooling/run_module.c:107:9
370:     #7 0x6afdf31 in iree_tooling_run_module_with_data iree/runtime/src/iree/tooling/run_module.c:340:3
370:     #8 0x6ad2a24 in iree::(anonymous namespace)::CompileAndRunFile(iree_compiler_session_t*, char const*) iree/tools/iree-run-mlir-main.cc:359:3
370:     #9 0x6ad2a24 in main iree/tools/iree-run-mlir-main.cc:520:20
370:     #10 0x7fce3bc456c9 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
```
stellaraccident pushed a commit that referenced this pull request Sep 24, 2023
* Presently schedules for 7 hours after IREE's nightly release is cut
(which should be ample time to build).
egebeysel pushed a commit to egebeysel/iree that referenced this pull request Jul 1, 2025
One that seems to get on average better results with resnet
ziereis added a commit to ziereis/iree that referenced this pull request Aug 5, 2025
3 participants