-
Notifications
You must be signed in to change notification settings - Fork 13.4k
[clang] Fix nondeterminism in MemberPointerType #137910
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This commit fixes the nondeterminism issue in C++ header module enabled builds which were observed after llvm#132401. The issue was related to the fact that the hash set operation in MemberPointerType::Profile() was triggering getMostRecentDecl(). The root cause seems to be that the latter was leading to the reentrant modification of the hash set, with some probability (likely depending on the actual values of hashes). We haven't been able to come up with a deterministic regression test for this fix.
@llvm/pr-subscribers-clang Author: Maksim Ivanov (emaxx-google) ChangesThis commit fixes the nondeterminism issue in C++ header module enabled builds which were observed after The issue was related to the fact that the hash set operation in MemberPointerType::Profile() was triggering getMostRecentDecl(). The root cause seems to be that the latter was leading to the reentrant modification of the hash set, with some probability (likely depending on the actual values of hashes). We haven't been able to come up with a deterministic regression test for this fix. Full diff: https://github.com/llvm/llvm-project/pull/137910.diff 2 Files Affected:
diff --git a/clang/include/clang/AST/Type.h b/clang/include/clang/AST/Type.h
index 3e1fb05ad537c..c93cbae767db8 100644
--- a/clang/include/clang/AST/Type.h
+++ b/clang/include/clang/AST/Type.h
@@ -3602,6 +3602,7 @@ class MemberPointerType : public Type, public llvm::FoldingSetNode {
}
NestedNameSpecifier *getQualifier() const { return Qualifier; }
+ CXXRecordDecl *getCXXRecordDecl() const;
CXXRecordDecl *getMostRecentCXXRecordDecl() const;
bool isSugared() const;
@@ -3610,7 +3611,7 @@ class MemberPointerType : public Type, public llvm::FoldingSetNode {
}
void Profile(llvm::FoldingSetNodeID &ID) {
- Profile(ID, getPointeeType(), getQualifier(), getMostRecentCXXRecordDecl());
+ Profile(ID, getPointeeType(), getQualifier(), getCXXRecordDecl());
}
static void Profile(llvm::FoldingSetNodeID &ID, QualType Pointee,
diff --git a/clang/lib/AST/Type.cpp b/clang/lib/AST/Type.cpp
index 111a642173418..a149eac7e4555 100644
--- a/clang/lib/AST/Type.cpp
+++ b/clang/lib/AST/Type.cpp
@@ -5275,10 +5275,14 @@ void MemberPointerType::Profile(llvm::FoldingSetNodeID &ID, QualType Pointee,
ID.AddPointer(Cls->getCanonicalDecl());
}
+CXXRecordDecl *MemberPointerType::getCXXRecordDecl() const {
+ return dyn_cast<MemberPointerType>(getCanonicalTypeInternal())
+ ->getQualifier()
+ ->getAsRecordDecl();
+}
+
CXXRecordDecl *MemberPointerType::getMostRecentCXXRecordDecl() const {
- auto *RD = dyn_cast<MemberPointerType>(getCanonicalTypeInternal())
- ->getQualifier()
- ->getAsRecordDecl();
+ auto *RD = getCXXRecordDecl();
if (!RD)
return nullptr;
return RD->getMostRecentNonInjectedDecl();
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not clear to me why this is a specifically a problem when called from Profile(...)
can you elaborate a bit more in the summary.
Does getMostRecentCXXRecordDecl(...)
require a comment that documents this behavior so that future users do not fall into the same issue?
@@ -3610,7 +3611,7 @@ class MemberPointerType : public Type, public llvm::FoldingSetNode { | |||
} | |||
|
|||
void Profile(llvm::FoldingSetNodeID &ID) { | |||
Profile(ID, getPointeeType(), getQualifier(), getMostRecentCXXRecordDecl()); | |||
Profile(ID, getPointeeType(), getQualifier(), getCXXRecordDecl()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the problem that we run this profile in the 'middle' of a TU and the most recent decl changes by the end? If so, this makes sense.
That said, I'd expect this to be the above suggestion in case this member pointer type has a redeclaration that perhaps changes what the actual referenced decl is.
Profile(ID, getPointeeType(), getQualifier(), getCXXRecordDecl()); | |
Profile(ID, getPointeeType(), getQualifier(), getCXXRecordDecl()->getCanonicalDecl()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but this suggestion seems not needed because this getCanonicalDecl()
call is already made inside the Profile()
method:
llvm-project/clang/lib/AST/Type.cpp
Line 5279 in e8ae779
ID.AddPointer(Cls->getCanonicalDecl()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... then I'm surprised that this patch is needed at all for the purposes of non-determinism. getMostRecentCXXRecordDecl()->getCanonicalDecl()
will get the same declaration as getCXXRecordDecl()->getCanonicalDecl()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is transient - the reproducibility rate on the reproducer (https://pastebin.com/6aL6rmBe) is only 1%. Also it's not the pointer we have here that's problematic, but rather the fact of extra AST node loading inside the FoldingSet
hash calculation operation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... then I'm surprised that this patch is needed at all for the purposes of non-determinism
My theory is that non-determinism is caused by side-effects in the deserialization logic that runs behind getMostRecentCXXRecordDecl()
. We probably just get unlucky with rehashing points at different times for the hash table (FoldingSet
) that makes us either call or not call getMostRecentCXXRecordDecl
.
It is not a new issue, just rare and now getting surfaced by the new place where getMostRecentCXXRecordDecl
is called.
I've added a bit more details - it's basically the fact that the "get most recent" operation triggered the load of new nodes from the PCM file. If all this happens while a hash for another node is calculated in At least that's the current theory, based on debugging the "deviating" executions of the compiler and seeing the
Sure, I've added a comment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still not sure I understand the problem this is solving, but I also don't see any harm in it, so I'm leaning towards accept here. Aaron can be the final approval though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I won't have time to review this closely until next week.
I think the 'inside profile' is a red herring here, this looks more like a problem from calling it from within deserialization.
We could try getting to the bottom of it and see if we can assert in the problematic cases, to avoid future repeats of the problem.
But the fix here is fine on its own, we don't need the most recent decl in most cases, and we kept one getter only for simplicity.
Though I think the name of the new getter needs more thought, it needs to avoid confusion with similar methods from the base type class.
Though I won't have time to give input on that until the weekend at least.
@AaronBallman I'd like to proceed with landing this as a short-term fix for the nondeterminism issue we're hitting at Google, but if you have any additional feedback I'm happy to address it in a follow-up. (We're also very interested in investigating the actual underlying issues that lead to these subtle problems, but it'll probably take much more time.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is fine as a temporary hack, but please add a FIXME comment explaining that this is working around a deeper underlying issue.
Sure, done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving to land this and unblock our internal releases.
Obviously, I encourage to address the issues raised by other reviewers in the follow-ups.
In particular, the most easily actionable one is about the naming of the getter from @mizvekov . Let's wait for the comments and fix that quickly if a better name is suggested.
Yes, I'll upload a PR for the rename immediately afterwards. Proceeding with the PR so that the particular nondeterminism issue we've been hitting is mitigated on tip-of-the-tree. |
Thanks everyone! |
This commit fixes the nondeterminism issue in C++ header module enabled builds which were observed after llvm#132401. The issue was related to the fact that the hash set operation in MemberPointerType::Profile() was triggering getMostRecentDecl(). As the latter may trigger the loading of new entities from the external AST source, this was presumably causing reentrant modification of data structure or some other issue that affects compiler's output in a nondeterministic way (likely depending on specific values hashes/pointers have). The change should otherwise be a no-op, because whether we take a "most recent" or "any" Decl shouldn't matter since `getCanonicalDecl()` is called on it anyway inside `MemberPointerType::Profile()`. We haven't been able to come up with a deterministic regression test for this fix.
This commit fixes the nondeterminism issue in C++ header module enabled builds which were observed after llvm#132401. The issue was related to the fact that the hash set operation in MemberPointerType::Profile() was triggering getMostRecentDecl(). As the latter may trigger the loading of new entities from the external AST source, this was presumably causing reentrant modification of data structure or some other issue that affects compiler's output in a nondeterministic way (likely depending on specific values hashes/pointers have). The change should otherwise be a no-op, because whether we take a "most recent" or "any" Decl shouldn't matter since `getCanonicalDecl()` is called on it anyway inside `MemberPointerType::Profile()`. We haven't been able to come up with a deterministic regression test for this fix.
This commit fixes the nondeterminism issue in C++ header module enabled builds which were observed after llvm#132401. The issue was related to the fact that the hash set operation in MemberPointerType::Profile() was triggering getMostRecentDecl(). As the latter may trigger the loading of new entities from the external AST source, this was presumably causing reentrant modification of data structure or some other issue that affects compiler's output in a nondeterministic way (likely depending on specific values hashes/pointers have). The change should otherwise be a no-op, because whether we take a "most recent" or "any" Decl shouldn't matter since `getCanonicalDecl()` is called on it anyway inside `MemberPointerType::Profile()`. We haven't been able to come up with a deterministic regression test for this fix.
This commit fixes the nondeterminism issue in C++ header module enabled builds which were observed after llvm#132401. The issue was related to the fact that the hash set operation in MemberPointerType::Profile() was triggering getMostRecentDecl(). As the latter may trigger the loading of new entities from the external AST source, this was presumably causing reentrant modification of data structure or some other issue that affects compiler's output in a nondeterministic way (likely depending on specific values hashes/pointers have). The change should otherwise be a no-op, because whether we take a "most recent" or "any" Decl shouldn't matter since `getCanonicalDecl()` is called on it anyway inside `MemberPointerType::Profile()`. We haven't been able to come up with a deterministic regression test for this fix.
This commit fixes the nondeterminism issue in C++ header module enabled builds which were observed after
#132401.
The issue was related to the fact that the hash set operation in MemberPointerType::Profile() was triggering getMostRecentDecl(). As the latter may trigger the loading of new entities from the external AST source, this
was presumably causing reentrant modification of data structure or some other issue that affects
compiler's output in a nondeterministic way (likely depending on specific values hashes/pointers have).
The change should otherwise be a no-op, because whether we take a "most recent" or "any" Decl shouldn't
matter since
getCanonicalDecl()
is called on it anyway insideMemberPointerType::Profile()
.We haven't been able to come up with a deterministic regression test for this fix.