[flang] Inherit target specific code for BIND(C) types on Windows #129579

Draft: mmuetzel wants to merge 2 commits into main

mmuetzel
Contributor

@mmuetzel mmuetzel commented Mar 3, 2025

Inherit target-specific code for Windows i386 and x86_64 from the classes that define that code for the respective processors on non-Windows operating systems.
Only override the parts that differ.

That allows the existing implementation for BIND(C) types on non-Windows x86_64 to be reused for Windows x86_64 targets as well.

Fixes #114035.

@llvmbot added the flang, flang:fir-hlfir, and flang:codegen labels on Mar 3, 2025
@llvmbot
Member

llvmbot commented Mar 3, 2025

@llvm/pr-subscribers-flang-fir-hlfir

@llvm/pr-subscribers-flang-codegen

Author: Markus Mützel (mmuetzel)

Full diff: https://github.com/llvm/llvm-project/pull/129579.diff

1 file affected:

  • (modified) flang/lib/Optimizer/CodeGen/Target.cpp (+4-8)
diff --git a/flang/lib/Optimizer/CodeGen/Target.cpp b/flang/lib/Optimizer/CodeGen/Target.cpp
index 2a1eb0bc33f5c..b03c1dd492ab0 100644
--- a/flang/lib/Optimizer/CodeGen/Target.cpp
+++ b/flang/lib/Optimizer/CodeGen/Target.cpp
@@ -199,10 +199,8 @@ struct TargetI386 : public GenericTarget<TargetI386> {
 //===----------------------------------------------------------------------===//
 
 namespace {
-struct TargetI386Win : public GenericTarget<TargetI386Win> {
-  using GenericTarget::GenericTarget;
-
-  static constexpr int defaultWidth = 32;
+struct TargetI386Win : public TargetI386 {
+  using TargetI386::TargetI386;
 
   CodeGenSpecifics::Marshalling
   complexArgumentType(mlir::Location loc, mlir::Type eleTy) const override {
@@ -718,10 +716,8 @@ struct TargetX86_64 : public GenericTarget<TargetX86_64> {
 //===----------------------------------------------------------------------===//
 
 namespace {
-struct TargetX86_64Win : public GenericTarget<TargetX86_64Win> {
-  using GenericTarget::GenericTarget;
-
-  static constexpr int defaultWidth = 64;
+struct TargetX86_64Win : public TargetX86_64 {
+  using TargetX86_64::TargetX86_64;
 
   CodeGenSpecifics::Marshalling
   complexArgumentType(mlir::Location loc, mlir::Type eleTy) const override {

@kiranchandramohan
Contributor

If there is no functional change, can you add NFC to the patch title?

Should we have an AArch64Win as well?

@DavidTruby
Member

As far as I am aware, the BIND(C) ABI for AArch64 is the same on Windows and Linux (at least as far as Fortran is concerned; it is subtly different in C++ but in ways that don't affect Fortran) so I don't believe there's any reason to have an AArch64Win here.

@mmuetzel
Contributor Author

mmuetzel commented Mar 6, 2025

Thank you for checking this PR.

> If there is no functional change, can you add NFC to the patch title?

While this is not a functional change for most targets, it is one for Windows x86_64 targets.
Without this change, trying to use BIND(C) types targeting Windows x86_64 leads to a compilation error like the following with Flang (see #114035):

not yet implemented: passing VALUE BIND(C) derived type for this target

With this change, the existing implementation for x86_64 is also used when targeting Windows x86_64.

This is essentially a follow-up to 774703e. Back when I originally proposed those changes, the TargetX86_64Win and TargetI386Win classes provided implementations for the same set of member functions as the TargetX86_64 and TargetI386 classes, respectively.

Since then, a few more member functions have been added to TargetX86_64. There is no equivalent for these member functions in the TargetX86_64Win class now.

Instead of essentially copying most of the source code that already exists in the TargetX86_64 class to the TargetX86_64Win class, I figured that it would make sense now to inherit from the TargetX86_64 class and only keep overrides for the member functions that need to be different on Windows.

This change doesn't add any new implementation though. It just re-uses the existing implementation. Does that mean that NFC should still be added to the patch title?
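The effect described above can be sketched with a small stand-alone C++ model. All class and function names here are hypothetical simplifications and not the actual flang classes (the real ones use a CRTP `GenericTarget<...>` base and return `Marshalling` objects); the sketch assumes that the generic base reports "not yet implemented" for struct arguments, which is what the error message quoted above suggests.

```cpp
#include <cassert>
#include <string>

// Hypothetical, heavily simplified stand-in for the generic target base.
struct GenericTargetSketch {
  virtual ~GenericTargetSketch() = default;
  // Assumed default: targets without ABI-specific handling report an error.
  virtual std::string structArgumentType() const {
    return "not yet implemented";
  }
  virtual std::string complexArgumentType() const { return "generic complex"; }
};

// Stand-in for the non-Windows x86_64 target, which implements both hooks.
struct X86_64Sketch : GenericTargetSketch {
  std::string structArgumentType() const override { return "x86_64 struct"; }
  std::string complexArgumentType() const override { return "x86_64 complex"; }
};

// Before the PR: derives from the generic base, so struct handling is missing.
struct X86_64WinBefore : GenericTargetSketch {
  std::string complexArgumentType() const override { return "win64 complex"; }
};

// After the PR: derives from the x86_64 target and overrides only what differs.
struct X86_64WinAfter : X86_64Sketch {
  std::string complexArgumentType() const override { return "win64 complex"; }
};

// Dispatches through the base interface, as a lowering pass would.
inline std::string lowerStructArg(const GenericTargetSketch &target) {
  return target.structArgumentType();
}
```

In this model, `lowerStructArg(X86_64WinBefore{})` yields the "not yet implemented" string, while `lowerStructArg(X86_64WinAfter{})` picks up the inherited x86_64 behavior and the Windows-specific complex handling stays in place. Whether the inherited behavior is actually correct for Windows is a separate question.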

In any case, this is the first time I'm trying to contribute to LLVM since it moved to PRs on GitHub. Please, let me know if I'm doing something wrong.

@kiranchandramohan
Contributor

kiranchandramohan commented Mar 6, 2025

If there is a functional change please add a test.

Thanks for contributing. Always welcome. The convention is to add a test if there is a functional change otherwise if there is no functional change then add NFC to the title.

@mmuetzel
Contributor Author

mmuetzel commented Mar 6, 2025

Are there tests for BIND(C) on non-Windows that I could use as an "inspiration"?

@mmuetzel
Contributor Author

mmuetzel commented Mar 6, 2025

Maybe, I could just "piggyback" the tests added in 011ba72 and/or 367c3c9 and add the following line close to their top:

// RUN: fir-opt -target-rewrite="target=x86_64-w64-windows-gnu" %s -o - | FileCheck %s

Would that work?

@mmuetzel changed the title from "[flang] Inherit target specific code for BIND(C) types on Windows (#114035)" to "[flang] Inherit target specific code for BIND(C) types on Windows" on Mar 6, 2025
@mmuetzel
Contributor Author

mmuetzel commented Mar 6, 2025

It looks like the pre-commit CI is still passing with those tests running for a Windows x86_64 target.

Is that good enough?

Contributor

@jeanPerier jeanPerier left a comment

Looking at C examples on godbolt, Windows x64 does not seem to have the same ABI when it comes to passing structs by value. It seems they must be passed in memory in more cases on Windows: https://godbolt.org/z/sYMP7T3rr

Can you point to the ABI specification document? I am not familiar enough with the Windows ABI to double-check this against the documentation.

@mmuetzel
Contributor Author

mmuetzel commented Mar 6, 2025

Oof. I didn't think about that.
That probably means that it isn't ok to just use the existing implementation.

Windows x64 ABI conventions are documented here (I guess): https://learn.microsoft.com/en-us/cpp/build/x64-software-conventions?view=msvc-170

@mmuetzel
Contributor Author

mmuetzel commented Mar 6, 2025

Maybe, a relevant part:

> __m128 types, arrays, and strings are never passed by immediate value. Instead, a pointer is passed to memory allocated by the caller. Structs and unions of size 8, 16, 32, or 64 bits, and __m64 types, are passed as if they were integers of the same size. Structs or unions of other sizes are passed as a pointer to memory allocated by the caller. For these aggregate types passed as a pointer, including __m128, the caller-allocated temporary memory must be 16-byte aligned.
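A literal reading of the quoted rule can be written down as a small classification function. This is only a sketch: `PassKind`, `ArgClass`, and `classifyWin64Aggregate` are hypothetical names invented for illustration, not flang or clang APIs, and the function models nothing beyond the size rule in the quoted paragraph.

```cpp
#include <cassert>
#include <cstdint>

// How a by-value struct or union argument travels under the quoted rule.
enum class PassKind { AsInteger, ByPointerToCallerCopy };

struct ArgClass {
  PassKind kind;
  unsigned integerBits;    // meaningful only for AsInteger
  unsigned copyAlignBytes; // meaningful only for ByPointerToCallerCopy
};

// Hypothetical helper: classify an aggregate by its size in bytes.
constexpr ArgClass classifyWin64Aggregate(std::uint64_t sizeInBytes) {
  switch (sizeInBytes) {
  case 1:
  case 2:
  case 4:
  case 8:
    // 8, 16, 32, or 64 bits: passed as an integer of the same size.
    return {PassKind::AsInteger, static_cast<unsigned>(sizeInBytes * 8), 0};
  default:
    // Any other size: pointer to a 16-byte aligned, caller-allocated copy.
    return {PassKind::ByPointerToCallerCopy, 0, 16};
  }
}
```

Under this reading, an 8-byte struct is classified as a 64-bit integer, while both a 3-byte and a 16-byte struct go through caller-allocated memory.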

I don't think I'll be able to implement that (at least not without a lot of hand-holding).

I'd very much appreciate it if someone else could take that on. In which case, feel free to close this.

@mmuetzel mmuetzel marked this pull request as draft March 6, 2025 13:50
@jeanPerier
Contributor

> Structs and unions of size 8, 16, 32, or 64 bits, and __m64 types, are passed as if they were integers of the same size. Structs or unions of other sizes are passed as a pointer to memory allocated by the caller. For these aggregate types passed as a pointer, including __m128, the caller-allocated temporary memory must be 16-byte aligned.

It actually looks like a pretty straightforward ABI. I am reluctant to implement it myself because I cannot test end-to-end and I am not familiar with the Windows environment, but the diff below shows what the implementation could look like.

Two things need to be clarified by someone familiar with the Windows x64 ABI and clang:

  • why clang does not set the byval attribute when passing the struct in memory (like it is done for linux)?
  • should the alignment of struct in memory be 16 like required when passing arguments, or the default alignment for the type?
diff --git a/flang/lib/Optimizer/CodeGen/Target.cpp b/flang/lib/Optimizer/CodeGen/Target.cpp
index e2f8fb9d239a..a22c82316a4d 100644
--- a/flang/lib/Optimizer/CodeGen/Target.cpp
+++ b/flang/lib/Optimizer/CodeGen/Target.cpp
@@ -780,6 +780,45 @@ struct TargetX86_64Win : public GenericTarget<TargetX86_64Win> {
     }
     return marshal;
   }
+
+  CodeGenSpecifics::Marshalling
+  structArgumentType(mlir::Location loc, fir::RecordType type,
+                     const Marshalling &) const override {
+    std::uint64_t size =
+        fir::getTypeSizeAndAlignmentOrCrash(loc, type, getDataLayout(), kindMap)
+            .first;
+    CodeGenSpecifics::Marshalling marshal;
+    if (size <= 8)
+      marshal.emplace_back(mlir::IntegerType::get(type.getContext(), size * 8),
+                           AT{});
+    else
+      // TODO: investigate: clang is not setting the byval attribute, it is not
+      // clear why. Currently, this needs to be set here for flang so that the
+      // target-rewrite pass allocates memory as expected. Is it OK to set byval
+      // when clang does not?
+      marshal.emplace_back(fir::ReferenceType::get(type),
+                           AT{16, /*byval=*/true, /*sret=*/false});
+    return marshal;
+  }
+
+  CodeGenSpecifics::Marshalling
+  structReturnType(mlir::Location loc, fir::RecordType type) const override {
+    std::uint64_t size =
+        fir::getTypeSizeAndAlignmentOrCrash(loc, type, getDataLayout(), kindMap)
+            .first;
+    CodeGenSpecifics::Marshalling marshal;
+    if (size <= 8)
+      marshal.emplace_back(mlir::IntegerType::get(type.getContext(), size * 8),
+                           AT{});
+    else
+      // TODO: investigate: the ABI is not very clear about the alignment for
+      // the return in memory (while it was explicit about 16 for the argument
+      // passing case). Should it be the default alignment for that type
+      // instead?
+      marshal.emplace_back(fir::ReferenceType::get(type),
+                           AT{16, /*byval=*/false, /*sret=*/true});
+    return marshal;
+  }
 };
 } // namespace
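What the `structReturnType` hook in the diff above models can be illustrated at the source level. The following is a hand-written sketch and not compiler output: `Small`, `Large`, and the `...Lowered` functions are hypothetical. It assumes the documented Windows x64 convention that a result too large for a register is written to caller-provided storage whose address arrives as a hidden first argument and is handed back to the caller.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct Small { std::int32_t a; std::int32_t b; };                 // 8 bytes
struct Large { std::int32_t a; std::int32_t b; std::int32_t c; }; // 12 bytes

static_assert(sizeof(Small) == 8, "layout assumed by this sketch");
static_assert(sizeof(Large) == 12, "layout assumed by this sketch");

// Source-level view: a function returning a 12-byte struct by value.
inline Large makeLarge() { return {1, 2, 3}; }

// Model of the lowered form: the caller supplies the result storage through
// a hidden first parameter (sret) and gets the same pointer back.
inline Large *makeLargeLowered(Large *result) {
  *result = Large{1, 2, 3};
  return result;
}

// Model of an 8-byte struct coming back as a 64-bit integer.
inline std::uint64_t makeSmallLowered() {
  Small s{7, 9};
  std::uint64_t bits;
  std::memcpy(&bits, &s, sizeof bits);
  return bits;
}
```

One detail may deserve a check against clang or MSVC output: the `size <= 8` test in the diff treats every size up to 8 bytes as an integer, whereas a literal reading of the ABI text quoted earlier would route sizes such as 3, 5, 6, or 7 bytes through memory.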

I validated on a simple example that it does what I would expect from reading the ABI:

module attributes {fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", llvm.data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"} {

// Cases where the type must be passed/returned in register.

func.func @test_t1(%0 : !t1) -> () {
  return
}

func.func @test_call_t1(%0 : !fir.ref<!t1>) {
  %1 = fir.load %0 : !fir.ref<!t1>
  fir.call @test_t1(%1)  : (!t1) -> ()
  return
}

func.func @test_return_t1(%0 : !fir.ref<!t1>) -> !t1 {
  %1 = fir.load %0 : !fir.ref<!t1>
  return %1 : !t1
}

func.func @test_call_return_t1(%0 : !fir.ref<!t1>) -> () {
  %1 = fir.call @test_return_t1(%0)  : (!fir.ref<!t1>) -> !t1
  fir.store %1 to %0 : !fir.ref<!t1>
  return
}

// Cases where the type must be passed/returned on the stack.

func.func @test_call_t2(%0 : !fir.ref<!t2>) {
  %1 = fir.load %0 : !fir.ref<!t2>
  fir.call @test_t2(%1)  : (!t2) -> ()
  return
}
func.func @test_t2(%0 : !t2) -> () {
  return
}

func.func @test_return_t2(%0 : !fir.ref<!t2>) -> !t2 {
  %1 = fir.load %0 : !fir.ref<!t2>
  return %1 : !t2
}

func.func @test_call_return_t2(%0 : !fir.ref<!t2>) -> () {
  %1 = fir.call @test_return_t2(%0)  : (!fir.ref<!t2>) -> !t2
  fir.store %1 to %0 : !fir.ref<!t2>
  return
}
}

@mmuetzel
Contributor Author

mmuetzel commented Mar 6, 2025

Thank you very much.

Would it be ok to include the diff (with the TODO comments) in this PR? Maybe listing you as the author?
I'll be away from a keyboard for a week. So, I won't be very responsive in that time. So, please take over if that is ok for you.
I could help testing on Windows after that. But I'm by far no Windows ABI expert.

I hope it is ok if I ping @mstorsjo because I think he might be able to answer your questions regarding argument passing.

@jeanPerier
Contributor

> Would it be ok to include the diff (with the TODO comments) in this PR? Maybe listing you as the author?

Sure, no problem with that.

@mmuetzel
Contributor Author

mmuetzel commented Mar 6, 2025

Thank you.

I haven't updated the tests yet. (So, they'll probably fail now that the implementation is "more correct".)
I can try to update them in a week or so if it is clear by then what should actually happen. (But I wouldn't mind if someone else would like to do that before.)

I figured that it could still be helpful to have these changes in the PR to make it easier to discuss them.

@mstorsjo
Member

mstorsjo commented Mar 6, 2025

> Thank you very much.
>
> Would it be ok to include the diff (with the TODO comments) in this PR? Maybe listing you as the author? I'll be away from a keyboard for a week. So, I won't be very responsive in that time. So, please take over if that is ok for you. I could help testing on Windows after that. But I'm by far no Windows ABI expert.
>
> I hope it is ok if I ping @mstorsjo because I think he might be able to answer your questions regarding argument passing.

Sorry, I don't really know much about this level of how these aspects are modelled in the LLVM IR. I think @rnk or maybe @efriedma-quic may know better (answers to the questions above in #129579 (comment)).

@efriedma-quic
Collaborator

> why clang does not set the byval attribute when passing the struct in memory (like it is done for linux)?

"byval" is weird: it actually allocates memory on the stack, as part of the argument list, then copies the argument into that memory. Old targets sometimes pass memory like this. The designers for more modern ABIs have recognized that passing around large values like that isn't a good idea, so they don't do that; they pass a pointer to a temporary allocated on the stack. Try looking at the assembly on x86_64-linux-gnu vs. x86_64-windows-msvc for struct S { int x[100]; }; void f(struct S); void g() { f((struct S){}); } if you're confused.

> should the alignment of struct in memory be 16 like required when passing arguments, or the default alignment for the type?

The "caller-allocated temporary memory must be 16-byte aligned" specifically only applies to calls; in other contexts, normal alignment rules apply.
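The pointer-to-temporary scheme described in this comment can be modeled by hand in C++. This is a sketch of the idea, not actual compiler output: `calleeLowered` and `callerLowered` are hypothetical names standing in for `f` and `g` from the example above, and the `alignas(16)` reflects the 16-byte requirement for the call temporary from the quoted ABI text.

```cpp
#include <cassert>
#include <cstdint>

struct S { int x[100]; };

// Model of the callee after lowering: it receives an ordinary pointer to the
// caller's temporary and may modify that temporary freely.
inline int calleeLowered(S *arg) {
  arg->x[0] += 1; // affects only the temporary, never the caller's object
  return arg->x[0];
}

// Model of the caller for a by-value call `f(original)`.
inline int callerLowered(const S &original) {
  // Caller-allocated copy; 16-byte aligned because it is a call temporary,
  // even though the natural alignment of S is only that of int.
  alignas(16) S temp = original;
  assert(reinterpret_cast<std::uintptr_t>(&temp) % 16 == 0);
  return calleeLowered(&temp);
}
```

Because the callee only ever sees the temporary, by-value semantics are preserved: after `callerLowered(s)` returns, `s` is unchanged. With `byval`, by contrast, the copy would live in the outgoing argument area of the stack rather than behind a pointer passed like any other pointer argument.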

@jeanPerier
Contributor

> why clang does not set the byval attribute when passing the struct in memory (like it is done for linux)?
>
> "byval" is weird: it actually allocates memory on the stack, as part of the argument list, then copies the argument into that memory. Old targets sometimes pass memory like this. The designers for more modern ABIs have recognized that passing around large values like that isn't a good idea, so they don't do that; they pass a pointer to a temporary allocated on the stack. Try looking at the assembly on x86_64-linux-gnu vs. x86_64-windows-msvc for struct S { int x[100]; }; void f(struct S); void g() { f((struct S){}); } if you're confused.

Thanks @efriedma-quic, that clarified the difference for me! Looking at the callee side, I actually see now that for x86_64-linux-gnu, the callee is looking for the argument in memory at specific %rsp + offset, while windows fetches the memory from the argument registers like for normal pointer arguments.

> The "caller-allocated temporary memory must be 16-byte aligned" specifically only applies to calls; in other contexts, normal alignment rules apply.

My question was more specific to returned struct (sret). Looking at the ABI spec, I do not see a specific requirement for the alignment of the return value https://learn.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-170#return-values, does that mean normal alignment rules apply for the hidden argument for the result, while it is always 16 for struct value argument passed in memory?

Also, looking at LLVM IR emitted by clang for x86_64-windows-msvc, I see that memory allocated for struct value argument is not always 16 byte aligned (see the %2 = alloca %struct.S, align 4 in the godbolt LLVM IR for your example). Is this a bug on clang side, or am I misreading https://learn.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-170#parameter-passing requirements?

@efriedma-quic
Collaborator

> does that mean normal alignment rules apply for the hidden argument for the result, while it is always 16 for struct value argument passed in memory?

The spec also sometimes has holes, so probably we'd want to investigate MSVC code generation a bit. But that's my reading.

> Also, looking at LLVM IR emitted by clang for x86_64-windows-msvc, I see that memory allocated for struct value argument is not always 16 byte aligned (see the %2 = alloca %struct.S, align 4 in the godbolt LLVM IR for your example). Is this a bug on clang side

That's probably a bug? At least, that's my reading of the spec; not sure what happens in practice.

@MehdiChinoune
Contributor

Any progress?

Successfully merging this pull request may close these issues.

[Flang] Support for "passing VALUE BIND(C) derived type for this target" on the road map?