[llvm][IR] Treat memcmp and bcmp as libcalls #135706
base: users/ilovepi/bcmp-libcall-precommit
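@llvm/pr-subscribers-llvm-ir @llvm/pr-subscribers-lto
Author: Paul Kirth (ilovepi)
Changes
Since the backend may emit calls to these functions, they should be treated like other libcalls. If we don't, then it is possible to have their definitions removed during LTO because they are dead, only to have a later transform introduce calls to them.
See https://discourse.llvm.org/t/rfc-addressing-deficiencies-in-llvm-s-lto-implementation/84999 for more information.
Full diff: https://github.com/llvm/llvm-project/pull/135706.diff
2 Files Affected: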
diff --git a/llvm/include/llvm/IR/RuntimeLibcalls.def b/llvm/include/llvm/IR/RuntimeLibcalls.def
index 2545aebc73391..2c72bc8c012cc 100644
--- a/llvm/include/llvm/IR/RuntimeLibcalls.def
+++ b/llvm/include/llvm/IR/RuntimeLibcalls.def
@@ -513,6 +513,8 @@ HANDLE_LIBCALL(UO_PPCF128, "__gcc_qunord")
HANDLE_LIBCALL(MEMCPY, "memcpy")
HANDLE_LIBCALL(MEMMOVE, "memmove")
HANDLE_LIBCALL(MEMSET, "memset")
+HANDLE_LIBCALL(MEMCMP, "memcmp")
+HANDLE_LIBCALL(BCMP, "bcmp")
// DSEPass can emit calloc if it finds a pair of malloc/memset
HANDLE_LIBCALL(CALLOC, "calloc")
HANDLE_LIBCALL(BZERO, nullptr)
diff --git a/llvm/test/LTO/Resolution/RISCV/bcmp-libcall.ll b/llvm/test/LTO/Resolution/RISCV/bcmp-libcall.ll
index 4c6bebf69a074..80421cd9350c8 100644
--- a/llvm/test/LTO/Resolution/RISCV/bcmp-libcall.ll
+++ b/llvm/test/LTO/Resolution/RISCV/bcmp-libcall.ll
@@ -29,8 +29,7 @@ define i1 @foo(ptr %0, [2 x i32] %1) {
declare i32 @memcmp(ptr, ptr, i32)
;; Ensure bcmp is removed from module. Follow up patches can address this.
-; INTERNALIZE-NOT: declare{{.*}}i32 @bcmp
-; INTERNALIZE-NOT: define{{.*}}i32 @bcmp
+; INTERNALIZE: define{{.*}}i32 @bcmp
define i32 @bcmp(ptr %0, ptr %1, i32 %2) {
ret i32 0
}
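For readers unfamiliar with the file being patched: RuntimeLibcalls.def is an X-macro list, and each HANDLE_LIBCALL(code, name) line contributes one entry wherever the file is included. The following is a minimal, self-contained C++ sketch of that pattern, not the actual LLVM code. The macro SKETCH_LIBCALLS, the namespace sketch, and the helper isLibcallName are hypothetical names introduced for illustration; only the entries themselves are copied from the hunk above.

```cpp
#include <cassert>
#include <cstring>

// Illustrative stand-in for a slice of RuntimeLibcalls.def (hypothetical macro
// name; the entries mirror the hunk in this PR).
#define SKETCH_LIBCALLS(X) \
  X(MEMCPY, "memcpy")      \
  X(MEMMOVE, "memmove")    \
  X(MEMSET, "memset")      \
  X(MEMCMP, "memcmp")      \
  X(BCMP, "bcmp")          \
  X(CALLOC, "calloc")      \
  X(BZERO, nullptr)

namespace sketch {

// First expansion: one enumerator per list entry.
enum Libcall {
#define X(code, name) code,
  SKETCH_LIBCALLS(X)
#undef X
  UNKNOWN_LIBCALL
};

// Second expansion: a parallel table of default symbol names.
// A nullptr entry means "no default name".
static const char *const LibcallNames[] = {
#define X(code, name) name,
    SKETCH_LIBCALLS(X)
#undef X
};

// Returns true if `symbol` is the default name of some entry in the list.
inline bool isLibcallName(const char *symbol) {
  assert(symbol && "symbol must be non-null");
  for (const char *name : LibcallNames)
    if (name && std::strcmp(name, symbol) == 0)
      return true;
  return false;
}

} // namespace sketch
```

In this sketch, adding the two MEMCMP and BCMP lines is all it takes for a query like sketch::isLibcallName("bcmp") to start returning true, which is roughly the effect the patch relies on to keep the bcmp definition alive in the test.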
I'm surprised these libcalls have been missing for so long without getting fixed.
I don't really want to start adding functions to RuntimeLibcalls.def piecemeal without documented criteria for what, exactly, should be added. Do we need to add every single function that any transformation can generate under any circumstances? Or are there some criteria we can use to restrict this? I mean, we do memcmp->bcmp, yes, but we also touch a bunch of other math and I/O functions. Do we need to add all the functions from BuildLibCalls.h? Certain libcalls are special because we can generate calls to them even with -fno-builtins: there is no alternative implementation. Like for memcpy, or __stack_chk_fail, or floating-point arithmetic on soft-float targets. memcmp isn't special like this.
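To make the memcmp->bcmp rewrite mentioned above concrete, here is a small C++ sketch with two hypothetical helpers (sameBytes and orderBytes are names made up for this note). In the first, the memcmp result is only compared against zero, which is the pattern the optimizer may turn into a bcmp call on targets that provide bcmp. In the second, the sign of the result matters, so memcmp has to stay.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Only equality with zero is observed, so a bcmp call would be an acceptable
// substitute for this memcmp call.
bool sameBytes(const void *a, const void *b, std::size_t n) {
  assert((n == 0 || (a && b)) && "non-empty ranges need valid pointers");
  return std::memcmp(a, b, n) == 0;
}

// The sign of the result is observed, so bcmp (which only reports
// equal / not equal) cannot replace memcmp here.
int orderBytes(const void *a, const void *b, std::size_t n) {
  assert((n == 0 || (a && b)) && "non-empty ranges need valid pointers");
  return std::memcmp(a, b, n);
}
```

Whether the rewrite actually fires depends on the target and optimization level; the source never mentions bcmp either way, which is why a dead-looking bcmp definition can turn out to be needed late in the pipeline.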
I think that any function that can get added after you've potentially deleted its definition needs to be handled the same way, otherwise you can end up w/ the same kind of bugs. Adding all the functions from BuildLibCalls.h seems roughly correct, since I don't recall running into anything that fails this way that isn't either on that list or in the list of RuntimeLibcalls.
Since the backend may emit calls to these functions, they should be treated like other libcalls. If we don't, then it is possible to have their definitions removed during LTO because they are dead, only to have a later transform introduce calls to them. See https://discourse.llvm.org/t/rfc-addressing-deficiencies-in-llvm-s-lto-implementation/84999 for more information.
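The failure mode described in this commit message can be modeled with a toy pipeline. The sketch below is not LLVM code: the namespace lto_sketch and the functions stripDead, lateRewrite, undefinedAfter, and runPipeline are hypothetical, and symbol tables are reduced to sets of names. It only illustrates the ordering problem: a definition is dropped as dead, then a later step introduces a reference to it.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

namespace lto_sketch {

using SymbolSet = std::set<std::string>;

// Toy dead stripping: a definition survives only if something references it
// or it is on the explicit "preserve" list (the role libcall lists play).
SymbolSet stripDead(const SymbolSet &defined, const SymbolSet &referenced,
                    const SymbolSet &preserved) {
  SymbolSet kept;
  for (const auto &sym : defined)
    if (referenced.count(sym) || preserved.count(sym))
      kept.insert(sym);
  return kept;
}

// Toy late transform: every memcmp reference is rewritten to bcmp.
SymbolSet lateRewrite(SymbolSet referenced) {
  if (referenced.erase("memcmp"))
    referenced.insert("bcmp");
  return referenced;
}

// Symbols that are referenced but no longer have a definition.
std::vector<std::string> undefinedAfter(const SymbolSet &defined,
                                        const SymbolSet &referenced) {
  std::vector<std::string> missing;
  for (const auto &sym : referenced)
    if (!defined.count(sym))
      missing.push_back(sym);
  return missing;
}

// Runs strip-then-rewrite on a module that defines foo, memcmp and bcmp,
// where only foo and memcmp are referenced before the late rewrite.
std::vector<std::string> runPipeline(const SymbolSet &preserved) {
  const SymbolSet defined = {"foo", "memcmp", "bcmp"};
  const SymbolSet referenced = {"foo", "memcmp"};
  const SymbolSet kept = stripDead(defined, referenced, preserved);
  assert(kept.count("foo") && "referenced definitions must survive");
  return undefinedAfter(kept, lateRewrite(referenced));
}

} // namespace lto_sketch
```

With an empty preserve list the toy pipeline ends with bcmp referenced but undefined; with memcmp and bcmp on the preserve list nothing goes missing. The real mechanics in LTO are considerably more involved, so treat this purely as a picture of the phase-ordering issue.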
Force-pushed from af02216 to 4b8f422.
@efriedma-quic I guess I should ask if you're opposed to us adding these to the RuntimeLibcalls? I agree that we should have some criteria spelled out, but I'm not sure we have that pinned down well enough just yet. Also, I don't see much spelled out either in our docs or comments about the mechanisms here. Do we have anything, or maybe a thread from Discourse/mailing list that hashed some of this out? I'd like to make sure we have that kind of thing written down somewhere.
I think few people are doing LTO w/ things that provide bcmp/memcmp, like libc. Typically, what I see is that even when they're statically linked, like for embedded code, they're not built w/ LTO, so it's just a normal library, and not participating in LTO.
There are, currently, basically three different ways to supply libc which we support:
To do "LTO of libc", you want something else... but you need to define what "something else" is. There needs to be a clearly delineated point at which libc implementation bits transition from an abstraction to concrete code. If your libc cooperates, that point can be different for different interfaces, but there still needs to be a specific point. You can't just blindly throw everything at the existing optimizer and hope for the best. If you say memcmp stays an abstraction past codegen, you're getting basically zero benefit from LTO'ing it, as far as I can tell: the caller is opaque to the callee, and the callee is opaque to the caller. At best. At worst, your code explodes because your memcmp implementation depends on runtime CPU detection code that runs at startup, and LTO can't understand the dependency between the detection code and memcmp. So in essence, I feel like I can't review this patch without a proposal that addresses where we're going overall.
First, thanks for the context. I don't see anything like this written down, so I plan to find some place in our docs to put those details. I'll be sure to CC you and other folks I think will have thoughts on the precise verbiage. The compiler's contract with libc is, from what I can tell, complicated, underspecified, and mostly undocumented. Having spoken w/ some libc folks about libc semantics in the past, I don't think it will be easy to pin down all the details to the extent we want. I think writing down what you put above is just the first step. Maybe part of the issue is that I don't see a fundamental reason why libc is special beyond a few key things:
I'm probably neglecting something obvious in that short list, but for most things, I don't think anything special needs to happen. What shouldn't happen though, is that the compiler deletes a function definition, and then reintroduces a call to that function ... maybe that's what you mean by "staying an abstraction past codegen"? I didn't initially read it that way, but I guess in that light I see where you're coming from. Put another way, I think it's strictly a bug in our phase ordering to allow functions to be deleted if they may have calls introduced again. Since memcmp/bcmp are special this way (as are the existing libcalls), I guess maybe that's part of the problem. I was kind of under the impression that RuntimeLibcalls was our mechanism for handling that, though. As for making a libc cooperate w/ the compiler, perhaps there is a set of attributes we could use (or introduce?). We already have a few of these (attribute So, I guess let me try to explain my expectations for how we'd like the compiler to behave when LTOing a program along w/ libc. Mostly, we don't want the compiler to change its default behavior. So when it sees a call to
We need to enter the "-fno-builtins" world to make interprocedural optimizations with libc safe. Most optimizations you care about can be preserved in other ways. For example, if malloc called some intrinsic "llvm.allocate_memory"/"llvm.free_memory" to create/destroy provenance, we can preserve most aliasing-related optimizations. If your libc does runtime CPU detection, we can come up with some way to accurately model aliasing on those globals. But we need a different IR representation to make this work; we can't just treat the implementations as opaque. If you want to run certain optimizations before we enter the "-fno-builtins" world, you need some pass that transitions IR from the "builtins" world to the "nobuiltins" world. It might be possible for us to invent a "partial-builtin" mode which treats functions which are called as builtins, but doesn't allow generating calls to functions which aren't already used. Which would allow LTO to more accurately compute which libc functions are used. But I'm not sure how useful this would actually be in practice; if you're not LTO'ing libc, the dependencies don't really need to be accurate. There's a smaller set of functions which have more subtle ABI rules: those we call even with -fno-builtins. These are mostly listed in RuntimeLibcalls.def. But memcmp is not one of those functions.
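One way to read the three "worlds" described in this comment is as a policy on when a transform may introduce a call to a named library function. The C++ sketch below is purely illustrative and rests on assumptions: no such mode enum exists in LLVM as far as this thread indicates, and builtin_sketch, Mode, mayRecognize, and mayIntroduce are hypothetical names. The alwaysAvailable set stands in for the functions that are called even with -fno-builtins.

```cpp
#include <cassert>
#include <set>
#include <string>

namespace builtin_sketch {

// Hypothetical modes corresponding to the "builtins", "partial-builtin" and
// "-fno-builtins" worlds discussed in the comment.
enum class Mode { Builtins, PartialBuiltins, NoBuiltins };

using Names = std::set<std::string>;

// May the optimizer assume standard semantics for an existing libc call?
inline bool mayRecognize(Mode mode) { return mode != Mode::NoBuiltins; }

// May a transform introduce a new call to `callee`?
// `usedInProgram` holds libc functions the program already calls;
// `alwaysAvailable` models functions callable even with -fno-builtins.
inline bool mayIntroduce(Mode mode, const std::string &callee,
                         const Names &usedInProgram,
                         const Names &alwaysAvailable) {
  assert(!callee.empty() && "callee name must be non-empty");
  if (alwaysAvailable.count(callee))
    return true;
  switch (mode) {
  case Mode::Builtins:
    return true; // any recognized library function may be called
  case Mode::PartialBuiltins:
    return usedInProgram.count(callee) != 0; // only already-used functions
  case Mode::NoBuiltins:
    return false; // nothing beyond the always-available set
  }
  return false;
}

} // namespace builtin_sketch
```

Under this reading, a program that calls memcmp but never bcmp would not get a memcmp->bcmp rewrite in the partial mode, so the set of libc functions LTO has to keep would be fixed by what the program already uses.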
Hmm, that's an interesting direction. We were discussing this internally, and we were outlining some ideas along these lines, but I think you've articulated this quite a bit better than we have so far. I really like this idea of a "no-builtins" world, and transitioning the IR. I also find the idea of "partial-builtins" to be quite compelling, though I agree the usefulness may be limited to scenarios where you're supplying a libc to LTO. Given that we're often dealing w/ kernel and embedded code, though, I think this is worth exploring more. I plan to discuss this a bit more w/ my team today, and hopefully write up something a bit more cogent than my earlier rambling. @mysterymath and @frobtech may have more to say as well.
@efriedma-quic We've been discussing the topic of LTOing libc quite a bit internally, and are currently sketching out how this could work. Unsurprisingly, there's quite a lot to think about in both the compiler and linker, and how the two combine in our different versions of LTO, and how that may break in new and fun ways. I was wondering if you'd be available to join the libc monthly meeting (5/8 9am PST) to discuss your take on the whole idea? I'm not sure what time zone you're normally in, but I think there will be quite a number of folks who are interested in making LTO work well with libcs, and LLVM libc in particular. I know a few of my team members would like to pick your brain on the subject, as we're sketching our ideas out. I can try to post a short summary of our thoughts either here or on Discourse to make the discussion a bit easier as well.
I can join the libc monthly meeting, sure.
Great. Looking forward to discussing this then. I'll add the topic to the meeting agenda.