[DSE] Mark promise of pre-split coroutine visible to caller #133918
Allocas are destroyed when a function returns. However, this is not the case for pre-split coroutines. Eliminating stores to them prematurely changes program behavior. Fix #123347
Request code review from @nikic and @ChuanqiXu9 |
Can you add some more detail to the issue description why coroutine semantics require this? This hack looks very problematic to me, and seems quite distinct from the existing coroutine workarounds we have (which are about the possibility of the thread identity changing across suspension points).
@llvm/pr-subscribers-llvm-transforms
Author: None (NewSigma)
Changes
Allocas are destroyed when a function returns. However, this is not the case for pre-split coroutines, because the coroutine frame should be visible to the caller. For example, one can write to the coroutine's promise, suspend, and later read it from the caller. Eliminating such stores would change the program's observable behavior.
This commit forces all allocas of pre-split coroutines to remain visible to the caller. While this may miss some optimization opportunities, correctness takes priority. Future work could analyze the lifetimes of allocas if performance regressions become significant.
Fix #123347
Full diff: https://github.com/llvm/llvm-project/pull/133918.diff
2 Files Affected:
diff --git a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
index 935f21fd484f3..780b64e70136f 100644
--- a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
+++ b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
@@ -1194,7 +1194,9 @@ struct DSEState {
bool isInvisibleToCallerAfterRet(const Value *V) {
if (isa<AllocaInst>(V))
- return true;
+ // Defer alloca store elimination, wait for CoroSplit
+ return !F.isPresplitCoroutine();
+
auto I = InvisibleToCallerAfterRet.insert({V, false});
if (I.second) {
if (!isInvisibleToCallerOnUnwind(V)) {
diff --git a/llvm/test/Transforms/DeadStoreElimination/coro-alloca.ll b/llvm/test/Transforms/DeadStoreElimination/coro-alloca.ll
new file mode 100644
index 0000000000000..ec9dc84f2c4ae
--- /dev/null
+++ b/llvm/test/Transforms/DeadStoreElimination/coro-alloca.ll
@@ -0,0 +1,33 @@
+; Test that store-load operation that crosses suspension point will not be eliminated by DSE before CoroSplit
+; RUN: opt < %s -passes='dse' -S | FileCheck %s
+
+define void @fn(ptr align 8 %arg) presplitcoroutine {
+ %promise = alloca ptr, align 8
+ %awaiter = alloca i8, align 1
+ %id = call token @llvm.coro.id(i32 16, ptr %promise, ptr @fn, ptr null)
+ %hdl = call ptr @llvm.coro.begin(token %id, ptr null)
+ %mem = call ptr @malloc(i64 1)
+ call void @llvm.lifetime.start.p0(i64 8, ptr %promise)
+ store ptr %mem, ptr %promise, align 8
+ %save = call token @llvm.coro.save(ptr null)
+ call void @llvm.coro.await.suspend.void(ptr %awaiter, ptr %hdl, ptr @await_suspend_wrapper_void)
+ %sp = call i8 @llvm.coro.suspend(token %save, i1 false)
+ %flag = icmp ule i8 %sp, 1
+ br i1 %flag, label %resume, label %suspend
+
+resume:
+ call void @llvm.lifetime.end.p0(i64 8, ptr %promise)
+ br label %suspend
+
+suspend:
+ call i1 @llvm.coro.end(ptr null, i1 false, token none)
+ %temp = load ptr, ptr %promise, align 8
+ store ptr %temp, ptr %arg, align 8
+; store when suspend, load when resume
+; CHECK: store ptr null, ptr %promise, align 8
+ store ptr null, ptr %promise, align 8
+ ret void
+}
+
+declare ptr @malloc(i64)
+declare void @await_suspend_wrapper_void(ptr, ptr)
|
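The effect of the one-line change in the diff above can be sketched outside of LLVM. The following is a toy model only: `ToyFunction`, `ToyValueKind`, and both helper functions are hypothetical stand-ins, not LLVM classes, and the non-alloca path (which the real `isInvisibleToCallerAfterRet` handles with a cache and capture analysis) is reduced to a conservative `false`.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for llvm::Function (assumption).
struct ToyFunction {
  bool presplit_coroutine;
  bool isPresplitCoroutine() const { return presplit_coroutine; }
};

// Hypothetical stand-in for "is this value an AllocaInst?" (assumption).
enum class ToyValueKind { Alloca, Other };

// Before the patch: every alloca is treated as dead once the function returns.
bool invisibleAfterRetBefore(const ToyFunction &, ToyValueKind kind) {
  return kind == ToyValueKind::Alloca;
}

// With the patch from the diff: allocas of a pre-split coroutine stay
// visible to the caller, so stores to them are not removed at 'ret'.
bool invisibleAfterRetPatched(const ToyFunction &f, ToyValueKind kind) {
  if (kind == ToyValueKind::Alloca)
    return !f.isPresplitCoroutine();
  return false;  // the real analysis of non-alloca values is omitted here
}

// Small self-check of the toy model.
void checkToyModel() {
  ToyFunction coro{true}, plain{false};
  assert(invisibleAfterRetBefore(coro, ToyValueKind::Alloca));
  assert(!invisibleAfterRetPatched(coro, ToyValueKind::Alloca));
  assert(invisibleAfterRetPatched(plain, ToyValueKind::Alloca));
}
```

In this model, ordinary functions keep the old behavior, and only functions flagged as pre-split coroutines lose the "alloca is dead at return" assumption.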
Thanks. I updated my issue description. |
Could you elaborate on this? E.g., give an example to describe why it is problematic. I didn't understand it.
While I didn't understand the problem, my instinctive reaction is that, even if we want to do something like this, maybe we should only do it for special allocas, like the promise alloca. |
✅ With the latest revision this PR passed the C/C++ code formatter. |
It seems that DSE does not recognize that the coro frame is visible to the caller. It incorrectly eliminates stores right before returning to the caller, even though these values will be used upon resumption.
Yes, this is a safer choice. |
I will try to elaborate on my concerns about the issue. Hope you will understand. :) Consider the following example:
resume:
; Do something with %promise
br label %suspend
suspend:
store ptr null, ptr %promise, align 8 ; Do not eliminate
ret void
DSE will treat the function as a normal function, and stores just before 'ret' will be eliminated. This is done by eliminateDeadWritesAtEndOfFunction() in DSE, whose comment reads:
I have two questions:
If we consider that the function ends each time the coroutine suspends, then eliminateDeadWritesAtEndOfFunction() cannot be applied to pre-split coroutines and should be disabled.
suspend:
store ptr null, ptr %promise, align 8 ; Do not eliminate
ret void
resume:
; Do something with %promise
store ptr null, ptr %promise, align 8 ; Should eliminate
ret void
Eager elimination introduces additional compilation costs and will not lead to more optimized code. Personally, I prefer to consider the pre-split coroutine promise visible to the caller, as this is safer than simply disabling eliminateDeadWritesAtEndOfFunction() for pre-split coroutines. However, this depends on your understanding of coroutine semantics.
Since the issue has nothing to do with thread_local storage or readnone functions, I consider it essentially different from the thread-identity-changing problem. |
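The suspend/resume examples above can be modeled in plain C++ by writing the split by hand. This is only a sketch under assumptions: `Frame`, `ramp`, `resume`, and `runModel` are hypothetical names, and the state machine is a rough picture of what exists after CoroSplit, not what the pass actually emits. It shows why a store placed just before the return at a suspension point is still live, while a store at the very end of the resumed part is not.

```cpp
#include <cassert>

// Hypothetical stand-in for a coroutine frame (assumption).
struct Frame {
  int promise = 0;  // models the promise slot
  int state = 0;    // 0 = not started, 1 = suspended, 2 = done
};

// Models the code that runs up to the first suspension. The store to
// f.promise is the last write before control returns to the caller,
// which is where an end-of-function dead-store scan would look.
void ramp(Frame &f) {
  f.promise = 42;  // must be kept: it is read after resumption
  f.state = 1;
}

// Models the resumed part: it reads the value stored before suspension.
int resume(Frame &f) {
  int seen = f.promise;
  f.promise = 0;  // in this model nothing reads the slot afterwards
  f.state = 2;
  return seen;
}

// Drives the model the way a caller would: start, then resume.
int runModel() {
  Frame f;
  ramp(f);
  assert(f.state == 1);  // control is back in the caller; the frame is alive
  return resume(f);
}
```

If the store in `ramp` were removed as "dead at return", `runModel` would return 0 instead of 42, which is the kind of behavior change the discussion is about.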
Out of curiosity, what's the corresponding pattern on the C++ side? I mean, how can we see such a pattern before coro-split? |
A conversion function on the coroutine's result object that attempts to modify the promise would produce the pattern. |
Makes sense. |
LGTM. Please leave a few days to give @nikic a chance to take a look.
I don't have time to look deeply into this right now, but the change looks very concerning to me. Allocas becoming dead at the end of the function is a very core property of allocas. If it does not hold for the promise alloca, it probably should not be an alloca? Is it possible to use a different IR representation? |
It makes sense to use a different IR representation to address the concern. But I don't have a concrete plan, and I feel we lack the human resources right now. Maybe we can mark this as an issue or a bug and ask for volunteers. As for the problem itself, the problem actually may be the ... I don't have a solution in mind now. I feel it is somewhat fundamental. For the patch itself, I feel it might be better to add a FIXME and land it to stop the bleeding. WDYT? |
Perhaps we can come up with a better solution. Let's close it for now. |
Currently, DSE does not recognize that the coro frame is visible to the caller. It incorrectly eliminates stores right before returning to the caller, even though these values will be used upon resumption. This commit marks the promise of a pre-split coroutine as visible to the caller to avoid incorrect elimination.
Fix #105595