@mshockwave
Member

There are many cases where we want to run LLVM tools like opt and llc on the LLVM IR files dumped by --iree-hal-dump-executable-intermediates-to. Today that requires spelling out the target on the command line:

$ llc -mcpu=foo -mattr='+a,+b,+c,+d,...' dump/foo.codegen.ll

This is annoying, especially for RISC-V, which can have a (very) long feature list. It would be nice if we could just run:

$ llc dump/foo.codegen.ll

This patch makes that possible by propagating the target features and CPU name from the module / TargetMachine to the target-features and target-cpu attributes of individual functions.

It is also worth noting that many LLVM frontends, such as Clang and Flang, do the same attribute propagation.
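
To make the propagation concrete, here is a minimal standalone sketch in plain C++ with no LLVM or MLIR dependency. `FunctionModel` and `propagateTargetInfo` are hypothetical names invented for illustration, not IREE or LLVM APIs; the sketch only models the idea that module-level CPU and feature settings are copied onto functions that do not already carry their own.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a function that may carry target attributes.
struct FunctionModel {
  std::string name;
  std::optional<std::string> targetCpu;       // models "target-cpu"
  std::optional<std::string> targetFeatures;  // models "target-features"
};

// Copies the module-level defaults onto every function that lacks its own
// value. Existing per-function values are left untouched.
// Returns the number of functions that were modified.
inline int propagateTargetInfo(std::vector<FunctionModel> &functions,
                               const std::string &cpu,
                               const std::string &features) {
  int changed = 0;
  for (FunctionModel &fn : functions) {
    bool touched = false;
    if (!fn.targetFeatures.has_value()) {
      fn.targetFeatures = features;
      touched = true;
    }
    // An empty CPU name carries no information, so it is not attached.
    if (!fn.targetCpu.has_value() && !cpu.empty()) {
      fn.targetCpu = cpu;
      touched = true;
    }
    if (touched) {
      ++changed;
    }
  }
  return changed;
}
```

Once every function carries these attributes, a tool such as llc can pick up the CPU and features from the IR itself, which is what makes the -mcpu and -mattr flags unnecessary.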

@mshockwave
Member Author

cc @benvanik @kuhar @MaheshRavishankar @hanhanW as I don't have permission to add reviewers

Collaborator

@benvanik benvanik left a comment

good idea! supportive of the change but we'll need to tighten up the test and it'd be very useful to factor this so we can consistently use it on all targets (GPU, etc). putting this in Dialect/HAL/Utils/LLVMCodegenUtils.h or something (next to LLVMLinkerUtils.h) would let all targets using LLVM share the same behavior.

@@ -0,0 +1,17 @@
// RUN: mkdir -p %t-dump
// RUN: iree-compile %s -output-format=vm-asm -o %t.mlir --iree-hal-target-backends=llvm-cpu -iree-llvmcpu-target-triple=riscv64-linux-gnu \

Collaborator

we avoid top-level invocations like this unless testing e2e - since this is something you're now doing unconditionally you can test at a tighter scope (unit test, instead of an integration test that this is). specifically, you can use iree-hal-serialize-target-executables with its dumpIntermediatesPath option to get the same behavior.

the bigger issue here is that we'd have to condition this test on the inclusion of the specific backends: riscv is optional and may not be on for everyone. generally x86-64 is available but even that is optional.

Member Author

> the bigger issue here is that we'd have to condition this test on the inclusion of the specific backends: riscv is optional and may not be on for everyone. generally x86-64 is available but even that is optional.

Is there a REQUIRES: <target name>-registered-target in IREE?
Or is there a CTest tag that we can use to filter out LLVM targets that are not built?

Member Author

> we avoid top-level invocations like this unless testing e2e - since this is something you're now doing unconditionally you can test at a tighter scope (unit test, instead of an integration test that this is). specifically, you can use iree-hal-serialize-target-executables with its dumpIntermediatesPath option to get the same behavior.

I just fixed the test, please take a look.

@mshockwave
Member Author

> good idea! supportive of the change but we'll need to tighten up the test and it'd be very useful to factor this so we can consistently use it on all targets (GPU, etc). putting this in Dialect/HAL/Utils/LLVMCodegenUtils.h or something (next to LLVMLinkerUtils.h) would let all targets using LLVM share the same behavior.

thank you. I have factored out the feature into a shared function and fixed the test. As you pointed out in one of the inline comments, I still don't know the best way to disable the test if the RISC-V LLVM backend is not built. Should we also port the REQUIRES mechanism from LLVM's (LIT) test suite here?

@mshockwave
Member Author

ping -- any ideas on disabling the test when the LLVM is not built with a specific target?

@hanhanW
Contributor

hanhanW commented Sep 24, 2025

> ping -- any ideas on disabling the test when the LLVM is not built with a specific target?

I don't see failures on CI, what tests do you want to disable for RISC-V?

For lit tests, you can add the noriscv tag. E.g.,

iree_check_single_backend_test_suite(
    name = "check_index_llvm-cpu_local-task",
    srcs = INDEX_SRCS,
    compiler_flags = ["--iree-llvmcpu-target-cpu=generic"],
    driver = "local-task",
    tags = [
        # indexing math generates illegal instructions for riscv
        "noriscv",
    ],
    target_backend = "llvm-cpu",
)

For other cases, we may check IREE_ARCH in cmake. I'm not familiar with setup here, so it may not work.

if(IREE_ARCH STREQUAL "riscv_64" AND

Comment on lines 16 to 19
    mlir::MLIRContext *context) {
  if (!context) {
    context = moduleOp.getContext();
  }

Contributor

Do we need context as a function argument? Can we just do this?

Collaborator

yeah, if you have an op/attr/etc then you can always get context from that - only time to have a context arg is when there's no mandatory other arg that has it (for example, an array of attrs that may be empty)

Member Author

I somehow thought the usage in LLVMCPUTarget.cpp used two different MLIRContexts, which turned out to be untrue. It's fixed now.
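
The point about the context argument can be illustrated with a small standalone model. `ContextModel`, `ModuleModel`, and both functions below are hypothetical stand-ins, not MLIR types; the sketch assumes only that, as in MLIR, an op can always report the context that owns it.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for mlir::MLIRContext.
struct ContextModel {
  std::string name;
};

// Hypothetical stand-in for an op: it always knows its owning context.
struct ModuleModel {
  ContextModel *context;
  ContextModel *getContext() const { return context; }
};

// Preferred shape: no context parameter, because the mandatory module
// argument already provides one.
inline std::string describeModule(const ModuleModel &module) {
  return module.getContext()->name;
}

// An explicit context parameter is justified when the only other argument
// is a list that may be empty, so there is nothing to query.
inline std::string describeList(const std::vector<ModuleModel> &modules,
                                const ContextModel &context) {
  return context.name + ":" + std::to_string(modules.size());
}
```

Dropping the redundant parameter also removes the null check and the possibility of a caller passing a context that does not match the module.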

Comment on lines 20 to 49

  llvm::MCSubtargetInfo const *subTargetInfo =
      targetMachine.getMCSubtargetInfo();

  const std::vector<llvm::SubtargetFeatureKV> enabledFeatures =
      subTargetInfo->getEnabledProcessorFeatures();

  auto plussedFeatures = llvm::to_vector(
      llvm::map_range(enabledFeatures, [](llvm::SubtargetFeatureKV feature) {
        return std::string("+") + feature.Key;
      }));

  auto plussedFeaturesRefs = llvm::to_vector(llvm::map_range(
      plussedFeatures, [](auto &it) { return StringRef(it.c_str()); }));

  auto fullTargetFeaturesAttr =
      LLVM::TargetFeaturesAttr::get(context, plussedFeaturesRefs);

  StringRef targetCPU = targetMachine.getTargetCPU();

  Block &bodyBlock = moduleOp.getBodyRegion().front();
  for (auto funcOp : bodyBlock.getOps<LLVM::LLVMFuncOp>()) {
    if (!funcOp.getTargetFeatures().has_value()) {
      funcOp.setTargetFeaturesAttr(fullTargetFeaturesAttr);
    }
    if (!funcOp.getTargetCpu().has_value() && !targetCPU.empty()) {
      funcOp.setTargetCpuAttr(StringAttr::get(context, targetCPU));
    }
  }
}

Contributor

style nit: I'd remove the blank lines. We don't add a blank line right after each statement.

> Use vertical whitespace sparingly; unnecessary blank lines make it harder to see overall code structure. Use blank lines only where they aid the reader in understanding the structure.

https://google.github.io/styleguide/cppguide.html#Vertical_Whitespace

Member Author

Good point, I've removed most of the newlines in this block.
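
The feature-string construction in the snippet under review can be sketched without LLVM. This is a minimal standalone version, assuming the enabled feature names are already available as plain strings; `joinEnabledFeatures` is a hypothetical helper, not part of LLVM or IREE. It produces the comma-separated, '+'-prefixed form that llc's -mattr flag and the target-features function attribute use.

```cpp
#include <string>
#include <vector>

// Prefixes each enabled feature name with '+' and joins the results with
// commas, e.g. {"m", "a"} becomes "+m,+a". An empty list yields "".
inline std::string joinEnabledFeatures(
    const std::vector<std::string> &enabled) {
  std::string result;
  for (const std::string &name : enabled) {
    if (!result.empty()) {
      result += ',';
    }
    result += '+';
    result += name;
  }
  return result;
}
```

The reviewed code keeps the features as a list and hands them to the attribute builder instead of joining them; the joined string is shown here only because it matches what would otherwise be typed after -mattr.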

@mshockwave
Member Author

mshockwave commented Sep 24, 2025

> I don't see failures on CI, what tests do you want to disable for RISC-V?

oh my question was actually the opposite: since the target triple in the test is riscv64, it'll use the LLVM RISC-V backend to generate RISC-V assembly code. But the RISC-V LLVM backend might not always be built -- in fact, any of the targets, including X86, is optional. So I'm wondering if there is a way to enable the test only when the RISC-V LLVM backend was built.

@mshockwave
Member Author

> For other cases, we may check IREE_ARCH in cmake.

I think IREE_ARCH defaults to the host architecture. While it is true that the RISC-V LLVM backend is definitely enabled when the host is RISC-V, we also want to run this test as if we were using iree-compile to cross-compile a RISC-V binary.

@mshockwave mshockwave force-pushed the patch/iree/llvmcpu-attach-features branch from 227a2ec to db2e6bf Compare September 26, 2025 21:20
@mshockwave
Member Author

I decided to enable the test, serialize_module.mlir, only on the matching IREE_ARCH -- it's not pretty but at least it won't break if the RISC-V LLVM backend is not built.

@ScottTodd ScottTodd removed their request for review September 26, 2025 23:20
Member

@kuhar kuhar left a comment

You can see something similar we did for rocm in the AnnotateKernelForTranslation pass: https://github.com/iree-org/iree/blob/33b75b996bfa66547108768b4fe70294c5915e9c/compiler/src/iree/compiler/Codegen/LLVMGPU/ROCDLAnnotateKernelForTranslation.cpp These attributes can be set at the llvm dialect level -- it would be much easier to test this as a pass instead of a utility function inside the target code: https://github.com/iree-org/iree/blob/main/compiler/src/iree/compiler/Codegen/LLVMGPU/test/ROCDL/annotate_kernel_for_translation.mlir

Member

@kuhar kuhar left a comment

This is a good idea, would it also work for amdgpu?

R"TXT(; To reproduce the .rocmasm from .optimized.ll, run:
; llc -mtriple={} -mcpu={} -mattr='{}' -O3 <.optimized.ll> -o <out.rocmasm>

@mshockwave
Member Author

> These attributes can be set at the llvm dialect level -- it would be much easier to test this as a pass instead of a utility function inside the target code: https://github.com/iree-org/iree/blob/main/compiler/src/iree/compiler/Codegen/LLVMGPU/test/ROCDL/annotate_kernel_for_translation.mlir

Good point. The annotation in this patch is also done on the LLVM dialect, but I guess the problem here is that the user -- LLVMCPUTarget::serializeModule -- generates LLVM IR and assembly code in one go, right after the module is annotated.
Let me try to do the annotation a bit earlier.

@mshockwave
Member Author

> This is a good idea, would it also work for amdgpu?
>
> R"TXT(; To reproduce the .rocmasm from .optimized.ll, run:
> ; llc -mtriple={} -mcpu={} -mattr='{}' -O3 <.optimized.ll> -o <out.rocmasm>

Thank you, and yes, I think it works for any target that lowers to LLVM.

@mshockwave
Member Author

Alright, I've made another major revision and created a new Pass to test the newly added utility function. Since this Pass runs on the LLVMIR dialect, it should be independent of the list of enabled LLVM backends. I think this is a cleaner solution.

@mshockwave mshockwave requested a review from kuhar October 2, 2025 19:47
Member

@kuhar kuhar left a comment

Looks good overall, just some minor coding style issues

Member

@kuhar kuhar left a comment

LGTM % nit

@mshockwave
Member Author

Could you help me to launch the CI again? I pushed a new commit to address review feedback and the linter error.
Thanks

@mshockwave
Member Author

The bazel build error is now fixed.

Rather than relying on the transitive dependency from HAL::HALDialect

Signed-off-by: Min Hsu <[email protected]>
@mshockwave
Member Author

It took me some time to fix all the bazel build failures, but I believe all the CIs have passed now. I'm not sure what those PkgCI checks are, though. Could anyone help me merge this patch if we can skip those PkgCI checks?
Thanks!

@kuhar
Member

kuhar commented Oct 6, 2025

Yeah the remaining tests are unrelated. We have one blocking review from @benvanik that prevents merging.

@mshockwave
Member Author

ping @benvanik do you have any other comments?

@kuhar kuhar dismissed benvanik’s stale review October 14, 2025 16:20

comments addressed

@kuhar kuhar merged commit 71b4396 into iree-org:main Oct 14, 2025
59 of 69 checks passed
@kuhar
Member

kuhar commented Oct 14, 2025

I clicked the merge button because it's been approved for a week, but if Ben comes back with some comments we can fix forward.

@mshockwave
Member Author

> I clicked the merge button because it's been approved for a week, but if Ben comes back with some comments we can fix forward.

thank you

weidel-p pushed a commit to weidel-p/iree that referenced this pull request Oct 21, 2025
…ncOp (iree-org#22036)

There are many cases where we want to run LLVM tools like `opt` and
`llc` through the LLVM IR files from
`--iree-hal-dump-executable-intermediates-to`. Instead of using this
command:
```
$ llc -mcpu=foo -mattr='+a,+b,+c,+d,...' dump/foo.codegen.ll
```
This is annoying, especially for RISC-V, which can have a
(very) long feature list. It would be nice if we could just run
```
$ llc dump/foo.codegen.ll
```
This patch does this by propagating the target features and CPU name
from module / TargetMachine to the `target-features` and `target-cpu`
attributes of individual functions.

It is also worth noting that many LLVM frontends, such as Clang and
Flang, do the same attribute propagation.

---------

Signed-off-by: Min Hsu <[email protected]>
Co-authored-by: Jakub Kuderski <[email protected]>
Signed-off-by: Philipp <[email protected]>
benvanik pushed a commit that referenced this pull request Oct 30, 2025
…l LLVMFuncOp" (#22488)

Reverts #22036

The commit seems to be the root cause of the illegal CPU instructions.
We produce different codegen artifacts with and without the change.

To repro:

```
iree-compile ~/repro.mlir -o /tmp/z.vmfb --iree-hal-dump-executable-binaries-to=/tmp/z
md5sum /tmp/z/module_jit_eval_dispatch_0_embedded_elf_x86_64.so
```

`repro.mlir`:
https://gist.github.com/hanhanW/43ffdd9a84144ec4fc8e4b5a5b450f7e

Without the change: the md5sum is `d5462d72f7ddc97c1a1c28e65cf614ff`.

With the change, the md5sum is `8edeadc38d332b514bd36102186a5b2a`.

The author agreed on reverting the change; will help investigate it
later:
https://discord.com/channels/689900678990135345/689957613152239638/1433570268307132589
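
The md5sum comparison above is a check that two builds emit byte-identical artifacts. Below is a minimal sketch of the same check as a standalone C++ helper, assuming both artifacts are regular files on disk. `filesIdentical` is a hypothetical name, and the function compares the bytes directly instead of hashing them.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Returns true only when both files can be opened and contain exactly the
// same bytes. A file that cannot be opened counts as a mismatch.
inline bool filesIdentical(const std::string &pathA,
                           const std::string &pathB) {
  std::ifstream a(pathA, std::ios::binary);
  std::ifstream b(pathB, std::ios::binary);
  if (!a || !b) {
    return false;
  }
  std::istreambuf_iterator<char> itA(a), itB(b), end;
  while (itA != end && itB != end) {
    if (*itA != *itB) {
      return false;
    }
    ++itA;
    ++itB;
  }
  // Both streams must be exhausted; otherwise one file is a strict prefix
  // of the other.
  return itA == end && itB == end;
}
```

Running it on the dumped .so from a build with the change and one without would give the same answer as comparing the two md5sum values.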
bangtianliu pushed a commit to bangtianliu/iree that referenced this pull request Nov 19, 2025
…l LLVMFuncOp" (iree-org#22488)

Reverts iree-org#22036

pstarkcdpr pushed a commit to pstarkcdpr/iree that referenced this pull request Nov 28, 2025
…ncOp (iree-org#22036)

pstarkcdpr pushed a commit to pstarkcdpr/iree that referenced this pull request Nov 28, 2025
…l LLVMFuncOp" (iree-org#22488)

Reverts iree-org#22036
