[LLVMCPU] Propagate target features and CPU name to individual LLVMFuncOp #22036
cc @benvanik @kuhar @MaheshRavishankar @hanhanW as I don't have permission to add reviewers |
benvanik
left a comment
good idea! supportive of the change but we'll need to tighten up the test and it'd be very useful to factor this so we can consistently use it on all targets (GPU, etc). putting this in Dialect/HAL/Utils/LLVMCodegenUtils.h or something (next to LLVMLinkerUtils.h) would let all targets using LLVM share the same behavior.
| @@ -0,0 +1,17 @@ | |||
| // RUN: mkdir -p %t-dump | |||
| // RUN: iree-compile %s -output-format=vm-asm -o %t.mlir --iree-hal-target-backends=llvm-cpu -iree-llvmcpu-target-triple=riscv64-linux-gnu \ | |||
we avoid top-level invocations like this unless testing e2e - since this is something you're now doing unconditionally you can test at a tighter scope (unit test, instead of an integration test that this is). specifically, you can use iree-hal-serialize-target-executables with its dumpIntermediatesPath option to get the same behavior.
the bigger issue here is that we'd have to condition this test on the inclusion of the specific backends: riscv is optional and may not be on for everyone. generally x86-64 is available but even that is optional.
> the bigger issue here is that we'd have to condition this test on the inclusion of the specific backends: riscv is optional and may not be on for everyone. generally x86-64 is available but even that is optional.
Is there a REQUIRES: <target name>-registered-target in IREE?
Or is there a CTest tag that we can use to filter out LLVM targets that are not built?
> we avoid top-level invocations like this unless testing e2e - since this is something you're now doing unconditionally you can test at a tighter scope (unit test, instead of an integration test that this is). specifically, you can use iree-hal-serialize-target-executables with its dumpIntermediatesPath option to get the same behavior.
I just fixed the test, please take a look.
thank you. I have factored out the feature into a shared function and fixed the test. As you pointed out in one of the inline comments, I still don't know the best way to disable the test if the RISC-V LLVM backend is not built. Should we also port the |
|
ping -- any ideas on disabling the test when the LLVM is not built with a specific target? |
I don't see failures on CI, what tests do you want to disable for RISC-V? For lit tests, you can add iree/tests/e2e/linalg/BUILD.bazel Lines 251 to 261 in 6c30926
For other cases, we may check IREE_ARCH in cmake. I'm not familiar with setup here, so it may not work. iree/build_tools/cmake/iree_macros.cmake Line 682 in 6c30926
|
| mlir::MLIRContext *context) { | ||
| if (!context) { | ||
| context = moduleOp.getContext(); | ||
| } |
Do we need context as a function argument? Can we just do this?
yeah, if you have an op/attr/etc then you can always get context from that - only time to have a context arg is when there's no mandatory other arg that has it (for example, an array of attrs that may be empty)
I somehow thought the usage in LLVMCPUTarget.cpp uses two different MLIRContext, which turned out to be untrue. It's fixed now.
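To illustrate the rule of thumb from the review above, here is a minimal sketch in plain C++. `Context`, `ModuleOp`, `Attr`, and `contextFor` are hypothetical stand-ins for illustration only; they are not the MLIR types or anything from this patch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for mlir::MLIRContext, an op, and an attribute.
struct Context {
  int id;
};
struct ModuleOp {
  Context *ctx;
  Context *getContext() const { return ctx; }
};
struct Attr {
  Context *ctx;
  Context *getContext() const { return ctx; }
};

// A mandatory op argument already carries its context, so no extra
// context parameter is needed.
Context *contextFor(const ModuleOp &moduleOp) { return moduleOp.getContext(); }

// A list that may be empty cannot always supply a context, so here the
// caller passes one explicitly and it is used when the list is empty.
Context *contextFor(const std::vector<Attr> &attrs, Context *explicitContext) {
  return attrs.empty() ? explicitContext : attrs.front().getContext();
}
```

In this sketch only the second overload needs a context parameter, which matches the "array of attrs that may be empty" case mentioned in the review.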
|
|
||
| llvm::MCSubtargetInfo const *subTargetInfo = | ||
| targetMachine.getMCSubtargetInfo(); | ||
|
|
||
| const std::vector<llvm::SubtargetFeatureKV> enabledFeatures = | ||
| subTargetInfo->getEnabledProcessorFeatures(); | ||
|
|
||
| auto plussedFeatures = llvm::to_vector( | ||
| llvm::map_range(enabledFeatures, [](llvm::SubtargetFeatureKV feature) { | ||
| return std::string("+") + feature.Key; | ||
| })); | ||
|
|
||
| auto plussedFeaturesRefs = llvm::to_vector(llvm::map_range( | ||
| plussedFeatures, [](auto &it) { return StringRef(it.c_str()); })); | ||
|
|
||
| auto fullTargetFeaturesAttr = | ||
| LLVM::TargetFeaturesAttr::get(context, plussedFeaturesRefs); | ||
|
|
||
| StringRef targetCPU = targetMachine.getTargetCPU(); | ||
|
|
||
| Block &bodyBlock = moduleOp.getBodyRegion().front(); | ||
| for (auto funcOp : bodyBlock.getOps<LLVM::LLVMFuncOp>()) { | ||
| if (!funcOp.getTargetFeatures().has_value()) { | ||
| funcOp.setTargetFeaturesAttr(fullTargetFeaturesAttr); | ||
| } | ||
| if (!funcOp.getTargetCpu().has_value() && !targetCPU.empty()) { | ||
| funcOp.setTargetCpuAttr(StringAttr::get(context, targetCPU)); | ||
| } | ||
| } | ||
| } |
style nit: I'd remove the blank lines. We don't add blank line right after each statement.
Use vertical whitespace sparingly; unnecessary blank lines make it harder to see overall code structure. Use blank lines only where they aid the reader in understanding the structure.
https://google.github.io/styleguide/cppguide.html#Vertical_Whitespace
Good point, I've removed most of the newlines in this block.
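For readers skimming the thread, the logic of the snippet under review can be sketched without any MLIR or LLVM dependency. This is only an approximation under stated assumptions: `FuncSketch`, `joinPlussedFeatures`, and `propagateTargetAttrs` are hypothetical names, and the features are stored as one comma-joined string (the form `llc -mattr=` takes) rather than as an `LLVM::TargetFeaturesAttr`.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for an LLVM dialect function and its two attributes.
struct FuncSketch {
  std::optional<std::string> targetFeatures;
  std::optional<std::string> targetCpu;
};

// Prefixes every enabled feature with '+' and joins them with commas.
std::string joinPlussedFeatures(const std::vector<std::string> &enabled) {
  std::string result;
  for (const std::string &feature : enabled) {
    if (!result.empty()) result += ',';
    result += '+';
    result += feature;
  }
  return result;
}

// Fills in module-level defaults but leaves any attribute a function
// already carries untouched, mirroring the has_value() checks in the diff.
void propagateTargetAttrs(std::vector<FuncSketch> &funcs,
                          const std::vector<std::string> &enabledFeatures,
                          const std::string &targetCpu) {
  const std::string features = joinPlussedFeatures(enabledFeatures);
  for (FuncSketch &func : funcs) {
    if (!func.targetFeatures.has_value()) func.targetFeatures = features;
    if (!func.targetCpu.has_value() && !targetCpu.empty())
      func.targetCpu = targetCpu;
  }
}
```

The point of the "only if missing" checks is that a function which already specifies its own CPU or features keeps them, while the rest inherit the module-wide values.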
oh my question was actually the opposite: since the target triple in the test is riscv64, it'll use the LLVM RISC-V backend to generate RISC-V assembly code. But the RISC-V LLVM backend might not always be built -- in fact, every target, including X86, is optional. So I'm wondering if there is a way to enable the test only when the RISC-V LLVM backend was built |
I think IREE_ARCH defaults to the host architecture. While it is true that the RISC-V LLVM backend is definitely enabled when the host is RISC-V, we also want to run this test as if we were using iree-compile to cross-compile a RISC-V binary. |
Signed-off-by: Min Hsu <[email protected]>
Force-pushed from 227a2ec to db2e6bf
|
I decided to enable the test, |
You can see something similar that we did for ROCm in the AnnotateKernelForTranslation pass: https://github.com/iree-org/iree/blob/33b75b996bfa66547108768b4fe70294c5915e9c/compiler/src/iree/compiler/Codegen/LLVMGPU/ROCDLAnnotateKernelForTranslation.cpp These attributes can be set at the LLVM dialect level -- it would be much easier to test this as a pass instead of a utility function inside the target code: https://github.com/iree-org/iree/blob/main/compiler/src/iree/compiler/Codegen/LLVMGPU/test/ROCDL/annotate_kernel_for_translation.mlir
kuhar
left a comment
This is a good idea, would it also work for amdgpu?
iree/compiler/plugins/target/ROCM/ROCMTarget.cpp
Lines 792 to 793 in 33b75b9
| R"TXT(; To reproduce the .rocmasm from .optimized.ll, run: | |
| ; llc -mtriple={} -mcpu={} -mattr='{}' -O3 <.optimized.ll> -o <out.rocmasm> |
Good point. The annotation in this patch is also done on LLVM dialect, but I guess the problem here is that the user -- |
Signed-off-by: Min Hsu <[email protected]>
Signed-off-by: Min Hsu <[email protected]> Co-Authored-By: Jakub Kuderski <[email protected]>
Thank you, and yes I think it works for any target that lowers to LLVM. |
|
Alright, I've made another major revision and created a new pass to test the newly added utility function. Since this pass runs on the LLVMIR dialect, it should be independent of the list of enabled LLVM backends. I think this is a cleaner solution. |
Signed-off-by: Min Hsu <[email protected]>
kuhar
left a comment
Looks good overall, just some minor coding style issues
Signed-off-by: Min Hsu <[email protected]>
LGTM % nit
Signed-off-by: Min Hsu <[email protected]>
|
Could you help me to launch the CI again? I pushed a new commit to address review feedback and the linter error. |
Signed-off-by: Min Hsu <[email protected]>
|
The bazel build error is now fixed. |
Signed-off-by: Min Hsu <[email protected]>
Rather than relying on the transitive dependency from HAL::HALDialect Signed-off-by: Min Hsu <[email protected]>
Signed-off-by: Min Hsu <[email protected]>
|
It took me some time to fix all the bazel build failures but I believe all the CI checks have passed now. I'm not sure what those PkgCI checks are though. Could anyone help me merge this patch if we can skip those PkgCI checks? |
|
Yeah the remaining tests are unrelated. We have one blocking review from @benvanik that prevents merging. |
|
ping @benvanik do you have any other comments? |
|
I clicked the merge button because it's been approved for a week, but if Ben comes back with some comments we can fix forward. |
thank you |
…ncOp (iree-org#22036)

There are many cases where we want to run LLVM tools like `opt` and `llc` on the LLVM IR files from `--iree-hal-dump-executable-intermediates-to`. Instead of using this command:

```
$ llc -mcpu=foo -mattr='+a,+b,+c,+d,...' dump/foo.codegen.ll
```

which can be annoying, especially for RISC-V because it can have a (very) long feature list, it would be nice if we could just run

```
$ llc dump/foo.codegen.ll
```

This patch does this by propagating the target features and CPU name from the module / TargetMachine to the `target-features` and `target-cpu` attributes of individual functions.

It is also worth noting that many LLVM frontends like Clang and Flang do the same attribute propagation.

---------

Signed-off-by: Min Hsu <[email protected]>
Co-authored-by: Jakub Kuderski <[email protected]>
Signed-off-by: Philipp <[email protected]>
…l LLVMFuncOp" (#22488)

Reverts #22036

The commit seems to be the root cause of the illegal CPU instructions. We produce different codegen artifacts with and without the change. To repro:

```
iree-compile ~/repro.mlir -o /tmp/z.vmfb --iree-hal-dump-executable-binaries-to=/tmp/z
md5sum /tmp/z/module_jit_eval_dispatch_0_embedded_elf_x86_64.so
```

`repro.mlir`: https://gist.github.com/hanhanW/43ffdd9a84144ec4fc8e4b5a5b450f7e

Without the change, the md5sum is `d5462d72f7ddc97c1a1c28e65cf614ff`. With the change, the md5sum is `8edeadc38d332b514bd36102186a5b2a`.

The author agreed on reverting the change and will help investigate it later: https://discord.com/channels/689900678990135345/689957613152239638/1433570268307132589
There are many cases where we want to run LLVM tools like `opt` and `llc` on the LLVM IR files from `--iree-hal-dump-executable-intermediates-to`. Instead of using a command like `llc -mcpu=foo -mattr='+a,+b,+c,+d,...' dump/foo.codegen.ll`, which can be annoying, especially for RISC-V because it can have a (very) long feature list, it would be nice if we could just run `llc dump/foo.codegen.ll`.

This patch does this by propagating the target features and CPU name from the module / TargetMachine to the `target-features` and `target-cpu` attributes of individual functions.

It is also worth noting that many LLVM frontends like Clang and Flang do the same attribute propagation.