-
Notifications
You must be signed in to change notification settings - Fork 13.4k
[clang-tidy] Switch misc-confusable-identifiers check to a faster algorithm. #130369
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[clang-tidy] Switch misc-confusable-identifiers check to a faster algorithm. #130369
Conversation
- Only build the skeleton for each identifier once. - Only compute the contexts in which identifiers are declared for identifiers that actually have collisions. - Only compare pairs of declarations that are declared in related contexts, rather than comparing all pairs of declarations with the same skeleton. This also fixes several bugs: - The old check skipped comparisons of declarations from different contexts unless both declarations were type template parameters. - The old check skipped comparisons of declarations in all base classes other than the first one found by the traversal. This decreases the runtime of this check, especially in the common case where there are few or no skeletons with two or more different identifiers. From a sample invocation: ``` ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name --- 0.1556 ( 1.4%) 0.1513 ( 1.8%) 0.3069 ( 1.5%) 0.3033 ( 1.5%) before 0.0746 ( 0.7%) 0.0751 ( 0.8%) 0.1498 ( 0.8%) 0.1491 ( 0.8%) after ```
@llvm/pr-subscribers-clang-tools-extra @llvm/pr-subscribers-clang-tidy Author: Richard Smith (zygoloid) ChangesOptimizations:
Also simplify by removing the caching of enclosing This also fixes two bugs that together caused the check to fail to find the issues it was looking for in most cases:
This decreases the runtime of this check, especially in the common case where there are few or no skeletons with two or more different identifiers. From a sample invocation:
Full diff: https://github.com/llvm/llvm-project/pull/130369.diff 3 Files Affected:
diff --git a/clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.cpp b/clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.cpp
index 6df565c9a9d69..70b948c0784db 100644
--- a/clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.cpp
+++ b/clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.cpp
@@ -1,5 +1,4 @@
-//===--- ConfusableIdentifierCheck.cpp -
-// clang-tidy--------------------------===//
+//===--- ConfusableIdentifierCheck.cpp - clang-tidy -----------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
@@ -89,90 +88,56 @@ static llvm::SmallString<64U> skeleton(StringRef Name) {
return Skeleton;
}
-static bool mayShadowImpl(const DeclContext *DC0, const DeclContext *DC1) {
- return DC0 && DC0 == DC1;
-}
-
-static bool mayShadowImpl(const NamedDecl *ND0, const NamedDecl *ND1) {
- return isa<TemplateTypeParmDecl>(ND0) || isa<TemplateTypeParmDecl>(ND1);
-}
-
-static bool isMemberOf(const ConfusableIdentifierCheck::ContextInfo *DC0,
- const ConfusableIdentifierCheck::ContextInfo *DC1) {
- return llvm::is_contained(DC1->Bases, DC0->PrimaryContext);
-}
-
-static bool enclosesContext(const ConfusableIdentifierCheck::ContextInfo *DC0,
- const ConfusableIdentifierCheck::ContextInfo *DC1) {
- if (DC0->PrimaryContext == DC1->PrimaryContext)
- return true;
-
- return llvm::is_contained(DC0->PrimaryContexts, DC1->PrimaryContext) ||
- llvm::is_contained(DC1->PrimaryContexts, DC0->PrimaryContext);
+namespace {
+struct Entry {
+ const NamedDecl *ND;
+ bool FromDerivedClass;
+};
}
-static bool mayShadow(const NamedDecl *ND0,
- const ConfusableIdentifierCheck::ContextInfo *DC0,
- const NamedDecl *ND1,
- const ConfusableIdentifierCheck::ContextInfo *DC1) {
-
- if (!DC0->Bases.empty() && !DC1->Bases.empty()) {
- // if any of the declaration is a non-private member of the other
- // declaration, it's shadowed by the former
-
- if (ND1->getAccess() != AS_private && isMemberOf(DC1, DC0))
- return true;
-
- if (ND0->getAccess() != AS_private && isMemberOf(DC0, DC1))
+using DeclsWithinContextMap =
+ llvm::DenseMap<const DeclContext *, llvm::SmallVector<Entry, 1>>;
+
+static bool addToContext(DeclsWithinContextMap &DeclsWithinContext,
+ const DeclContext *DC, Entry E) {
+ auto &Decls = DeclsWithinContext[DC];
+ if (!Decls.empty() &&
+ Decls.back().ND->getIdentifier() == E.ND->getIdentifier()) {
+ // Already have a declaration with this identifier in this context. Don't
+ // track another one. This means that if an outer name is confusable with an
+ // inner name, we'll only diagnose the outer name once, pointing at the
+ // first inner declaration with that name.
+ if (Decls.back().FromDerivedClass && !E.FromDerivedClass) {
+ // Prefer the declaration that's not from the derived class, because that
+ // conflicts with more declarations.
+ Decls.back() = E;
return true;
- }
-
- if (!mayShadowImpl(DC0->NonTransparentContext, DC1->NonTransparentContext) &&
- !mayShadowImpl(ND0, ND1))
+ }
return false;
-
- return enclosesContext(DC0, DC1);
-}
-
-const ConfusableIdentifierCheck::ContextInfo *
-ConfusableIdentifierCheck::getContextInfo(const DeclContext *DC) {
- const DeclContext *PrimaryContext = DC->getPrimaryContext();
- auto [It, Inserted] = ContextInfos.try_emplace(PrimaryContext);
- if (!Inserted)
- return &It->second;
-
- ContextInfo &Info = It->second;
- Info.PrimaryContext = PrimaryContext;
- Info.NonTransparentContext = PrimaryContext;
-
- while (Info.NonTransparentContext->isTransparentContext()) {
- Info.NonTransparentContext = Info.NonTransparentContext->getParent();
- if (!Info.NonTransparentContext)
- break;
}
+ Decls.push_back(E);
+ return true;
+}
- if (Info.NonTransparentContext)
- Info.NonTransparentContext =
- Info.NonTransparentContext->getPrimaryContext();
-
+static void addToEnclosingContexts(DeclsWithinContextMap &DeclsWithinContext,
+ const DeclContext *DC, const NamedDecl *ND) {
while (DC) {
- if (!isa<LinkageSpecDecl>(DC) && !isa<ExportDecl>(DC))
- Info.PrimaryContexts.push_back(DC->getPrimaryContext());
- DC = DC->getParent();
- }
-
- if (const auto *RD = dyn_cast<CXXRecordDecl>(PrimaryContext)) {
- RD = RD->getDefinition();
- if (RD) {
- Info.Bases.push_back(RD);
- RD->forallBases([&](const CXXRecordDecl *Base) {
- Info.Bases.push_back(Base);
- return false;
- });
+ DC = DC->getNonTransparentContext()->getPrimaryContext();
+ if (!addToContext(DeclsWithinContext, DC, {ND, false}))
+ return;
+
+ if (const auto *RD = dyn_cast<CXXRecordDecl>(DC)) {
+ RD = RD->getDefinition();
+ if (RD) {
+ RD->forallBases([&](const CXXRecordDecl *Base) {
+ addToContext(DeclsWithinContext, Base, {ND, true});
+ return true;
+ });
+ }
}
- }
- return &Info;
+ DC = DC->getParent();
+ }
}
void ConfusableIdentifierCheck::check(
@@ -181,7 +146,7 @@ void ConfusableIdentifierCheck::check(
if (!ND)
return;
- IdentifierInfo *NDII = ND->getIdentifier();
+ const IdentifierInfo *NDII = ND->getIdentifier();
if (!NDII)
return;
@@ -189,29 +154,68 @@ void ConfusableIdentifierCheck::check(
if (NDName.empty())
return;
- const ContextInfo *Info = getContextInfo(ND->getDeclContext());
+ NameToDecls[NDII].push_back(ND);
+}
- llvm::SmallVector<Entry> &Mapped = Mapper[skeleton(NDName)];
- for (const Entry &E : Mapped) {
- if (!mayShadow(ND, Info, E.Declaration, E.Info))
- continue;
+void ConfusableIdentifierCheck::onEndOfTranslationUnit() {
+ llvm::StringMap<llvm::SmallVector<const IdentifierInfo*, 1>> SkeletonToNames;
+ // Compute the skeleton for each identifier.
+ for (auto &[Ident, Decls] : NameToDecls) {
+ SkeletonToNames[skeleton(Ident->getName())].push_back(Ident);
+ }
- const IdentifierInfo *ONDII = E.Declaration->getIdentifier();
- StringRef ONDName = ONDII->getName();
- if (ONDName == NDName)
+ // Visit each skeleton with more than one identifier.
+ for (auto &[Skel, Idents] : SkeletonToNames) {
+ if (Idents.size() < 2) {
continue;
+ }
- diag(ND->getLocation(), "%0 is confusable with %1") << ND << E.Declaration;
- diag(E.Declaration->getLocation(), "other declaration found here",
- DiagnosticIDs::Note);
- }
+ // Find the declaration contexts that transitively contain each identifier.
+ DeclsWithinContextMap DeclsWithinContext;
+ for (const IdentifierInfo *II : Idents) {
+ for (const NamedDecl *ND : NameToDecls[II]) {
+ addToEnclosingContexts(DeclsWithinContext, ND->getDeclContext(), ND);
+ }
+ }
- Mapped.push_back({ND, Info});
-}
+ // Check to see if any declaration is declared in a context that
+ // transitively contains another declaration with a different identifier but
+ // the same skeleton.
+ for (const IdentifierInfo *II : Idents) {
+ for (const NamedDecl *OuterND : NameToDecls[II]) {
+ const DeclContext *OuterDC = OuterND->getDeclContext()
+ ->getNonTransparentContext()
+ ->getPrimaryContext();
+ for (Entry Inner : DeclsWithinContext[OuterDC]) {
+ // Don't complain if the identifiers are the same.
+ if (OuterND->getIdentifier() == Inner.ND->getIdentifier())
+ continue;
+
+ // Don't complain about a derived-class name shadowing a base class
+ // private member.
+ if (OuterND->getAccess() == AS_private && Inner.FromDerivedClass)
+ continue;
+
+ // If the declarations are in the same context, only diagnose the
+ // later one.
+ if (OuterDC->Equals(
+ Inner.ND->getDeclContext()->getNonTransparentContext()) &&
+ Inner.ND->getASTContext()
+ .getSourceManager()
+ .isBeforeInTranslationUnit(Inner.ND->getLocation(),
+ OuterND->getLocation()))
+ continue;
+
+ diag(Inner.ND->getLocation(), "%0 is confusable with %1")
+ << Inner.ND << OuterND;
+ diag(OuterND->getLocation(), "other declaration found here",
+ DiagnosticIDs::Note);
+ }
+ }
+ }
+ }
-void ConfusableIdentifierCheck::onEndOfTranslationUnit() {
- Mapper.clear();
- ContextInfos.clear();
+ NameToDecls.clear();
}
void ConfusableIdentifierCheck::registerMatchers(
diff --git a/clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.h b/clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.h
index f3b0c8ed00306..65669fb61961a 100644
--- a/clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.h
+++ b/clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.h
@@ -1,5 +1,4 @@
-//===--- ConfusableIdentifierCheck.h - clang-tidy
-//-------------------------------*- C++ -*-===//
+//===--- ConfusableIdentifierCheck.h - clang-tidy ---------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
@@ -11,7 +10,7 @@
#define LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_MISC_CONFUSABLE_IDENTIFIER_CHECK_H
#include "../ClangTidyCheck.h"
-#include <unordered_map>
+#include "llvm/ADT/DenseMap.h"
namespace clang::tidy::misc {
@@ -31,23 +30,10 @@ class ConfusableIdentifierCheck : public ClangTidyCheck {
return TK_IgnoreUnlessSpelledInSource;
}
- struct ContextInfo {
- const DeclContext *PrimaryContext;
- const DeclContext *NonTransparentContext;
- llvm::SmallVector<const DeclContext *> PrimaryContexts;
- llvm::SmallVector<const CXXRecordDecl *> Bases;
- };
-
private:
- struct Entry {
- const NamedDecl *Declaration;
- const ContextInfo *Info;
- };
-
- const ContextInfo *getContextInfo(const DeclContext *DC);
-
- llvm::StringMap<llvm::SmallVector<Entry>> Mapper;
- std::unordered_map<const DeclContext *, ContextInfo> ContextInfos;
+ llvm::DenseMap<const IdentifierInfo *,
+ llvm::SmallVector<const NamedDecl *, 1>>
+ NameToDecls;
};
} // namespace clang::tidy::misc
diff --git a/clang-tools-extra/test/clang-tidy/checkers/misc/confusable-identifiers.cpp b/clang-tools-extra/test/clang-tidy/checkers/misc/confusable-identifiers.cpp
index cdfed7edb431d..acaf39973961d 100644
--- a/clang-tools-extra/test/clang-tidy/checkers/misc/confusable-identifiers.cpp
+++ b/clang-tools-extra/test/clang-tidy/checkers/misc/confusable-identifiers.cpp
@@ -74,6 +74,19 @@ template <typename t1, typename tl>
// CHECK-MESSAGES: :[[#@LINE-2]]:20: note: other declaration found here
void f9();
+namespace f10 {
+int il;
+namespace inner {
+ int i1;
+ // CHECK-MESSAGES: :[[#@LINE-1]]:7: warning: 'i1' is confusable with 'il' [misc-confusable-identifiers]
+ // CHECK-MESSAGES: :[[#@LINE-4]]:5: note: other declaration found here
+ int j1;
+ // CHECK-MESSAGES: :[[#@LINE-1]]:7: warning: 'j1' is confusable with 'jl' [misc-confusable-identifiers]
+ // CHECK-MESSAGES: :[[#@LINE+2]]:5: note: other declaration found here
+}
+int jl;
+}
+
struct Base0 {
virtual void mO0();
@@ -103,3 +116,21 @@ struct Derived1 : Base1 {
long mI1(); // no warning: mII is private
};
+
+struct Base2 {
+ long nO0;
+
+private:
+ long nII;
+};
+
+struct Mid2 : Base0, Base1, Base2 {
+};
+
+struct Derived2 : Mid2 {
+ long nOO;
+ // CHECK-MESSAGES: :[[#@LINE-1]]:8: warning: 'nOO' is confusable with 'nO0' [misc-confusable-identifiers]
+ // CHECK-MESSAGES: :[[#@LINE-12]]:8: note: other declaration found here
+
+ long nI1(); // no warning: mII is private
+};
|
You can test this locally with the following command:git-clang-format --diff 1b75b9e665ee3c43de85c25f8d5f10d4efb3ca39 b7accb2fa7e66afaa81c5ad00bfe855f4b32a9c4 --extensions h,cpp -- clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.cpp clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.h clang-tools-extra/test/clang-tidy/checkers/misc/confusable-identifiers.cpp View the diff from clang-format here.diff --git a/clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.cpp b/clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.cpp
index 6793135b51..cfdcc62dad 100644
--- a/clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.cpp
+++ b/clang-tools-extra/clang-tidy/misc/ConfusableIdentifierCheck.cpp
@@ -127,8 +127,7 @@ static bool addToContext(DeclsWithinContextMap &DeclsWithinContext,
}
static void addToEnclosingContexts(DeclsWithinContextMap &DeclsWithinContext,
- const Decl *Parent,
- const NamedDecl *ND) {
+ const Decl *Parent, const NamedDecl *ND) {
const Decl *Outer = Parent;
while (Outer) {
if (const auto *NS = dyn_cast<NamespaceDecl>(Outer))
@@ -160,8 +159,8 @@ void ConfusableIdentifierCheck::check(
if (!ND)
return;
- addDeclToCheck(ND, cast<Decl>(ND->getDeclContext()
- ->getNonTransparentContext()));
+ addDeclToCheck(ND,
+ cast<Decl>(ND->getDeclContext()->getNonTransparentContext()));
// Associate template parameters with this declaration of this template.
if (const auto *TD = dyn_cast<TemplateDecl>(ND)) {
@@ -255,10 +254,9 @@ void ConfusableIdentifierCheck::registerMatchers(
ast_matchers::parmVarDecl(), ast_matchers::templateTypeParmDecl(),
ast_matchers::nonTypeTemplateParmDecl(),
ast_matchers::templateTemplateParmDecl());
- Finder->addMatcher(
- ast_matchers::namedDecl(ast_matchers::unless(AnyParamDecl))
- .bind("nameddecl"),
- this);
+ Finder->addMatcher(ast_matchers::namedDecl(ast_matchers::unless(AnyParamDecl))
+ .bind("nameddecl"),
+ this);
}
} // namespace clang::tidy::misc
|
Could you run this check on for example llvm code base, and capture times before and after using build-in clang-tidy profiler ? |
Done and added results to commit message. Wall time 5202.86s -> 3900.90s. |
Perfect, do one more test, compare findings before and after from for example llvm, if they catch same thing. As for a change, I will try to review it this week. |
Looking at the warnings across LLVM I found and fixed a few bugs. With that done: Warnings before: https://gist.github.com/zygoloid/34584fae8789977d2032bd98549aefc8 I turned off header filtering to get the full set of diagnostics here. Note that the before warnings have a lot of false positives, which unfortunately get repeated a lot due to the lack of filtering. |
With default filtering enabled for Before: https://gist.github.com/zygoloid/e06c7d6cb7309a5e3d38e9b2b4e45659 |
I've done some point comparisons between the old and new output. I found:
|
This comment was marked as outdated.
This comment was marked as outdated.
2 similar comments
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
Ping x6 |
Optimizations:
Also simplify by removing the caching of enclosing
DeclContext
sets, because with the above changes we don't even compute the enclosingDeclContext
sets in common cases. Instead, we terminate the traversal to enclosingDeclContext
s immediately if we've already found another declaration in that context with the same identifier. (This optimization is not currently applied to theforallBases
traversal, but could be applied there too if needed.)This also fixes two bugs that together caused the check to fail to find some of the issues it was looking for:
false
rather thantrue
from theforallBases
callback, which terminates traversal.This also fixes an issue where the check would have false positives for template parameters and function parameters in some cases, because those parameters sometimes have a parent
DeclContext
that is the parent of the parameterized entity, or sometimes is the translation unit. In either case, this would cause warnings about declarations that are never visible together in any scope.This decreases the runtime of this check, especially in the common case where there are few or no skeletons with two or more different identifiers. Running this check over LLVM, clang, and clang-tidy, the wall time for the check as reported by clang-tidy's internal profiler is reduced from 5202.86s to 3900.90s.