I am compiling C++ code with Clang. (Apple clang version 12.0.5 (clang-1205.0.22.11)
).
Clang can give tips in case you misspell a variable:
#include <iostream>
int main() {
int my_int;
std::cout << my_it << std::endl;
}
spellcheck-test.cpp:5:18: error: use of undeclared identifier 'my_it'; did you mean 'my_int'?
std::cout << my_it << std::endl;
^~~~~
my_int
spellcheck-test.cpp:4:9: note: 'my_int' declared here
int my_int;
^
1 error generated.
My question is:
What is the criterion Clang uses to determine when to suggest another variable?
My experimentation suggests it is quite sophisticated:
int my_in;
) it does not give a suggestionmy_it.size()
instead) it does not give a suggestionYou will not likely find it documented, but as Clang is open-source you can turn to the source to try to figure it out.
The particular diagnostic (from DiagnosticSemaKinds.td
):
def err_undeclared_var_use_suggest : Error< "use of undeclared identifier %0; did you mean %1?">;
is ever only referred to from clang-tools-extra/clangd/IncludeFixer.cpp
:
// Try to fix unresolved name caused by missing declaration. // E.g. // clang::SourceManager SM; // ~~~~~~~~~~~~~ // UnresolvedName // or // namespace clang { SourceManager SM; } // ~~~~~~~~~~~~~ // UnresolvedName // We only attempt to recover a diagnostic if it has the same location as // the last seen unresolved name. if (DiagLevel >= DiagnosticsEngine::Error && LastUnresolvedName->Loc == Info.getLocation()) return fixUnresolvedName();
Now, clangd is a language server and t.b.h. I don't know how whether this is actually used by the Clang compiler frontend to yield certain diagnostics, but you're free to continue down the rabbit hole to tie together these details. The fixUnresolvedName
above eventually performs a fuzzy search:
if (llvm::Optional<const SymbolSlab *> Syms = fuzzyFindCached(Req)) return fixesForSymbols(**Syms);
If you want to dig into the details, I would recommend starting with the fuzzyFindCached
function:
llvm::Optional<const SymbolSlab *> IncludeFixer::fuzzyFindCached(const FuzzyFindRequest &Req) const { auto ReqStr = llvm::formatv("{0}", toJSON(Req)).str(); auto I = FuzzyFindCache.find(ReqStr); if (I != FuzzyFindCache.end()) return &I->second; if (IndexRequestCount >= IndexRequestLimit) return llvm::None; IndexRequestCount++; SymbolSlab::Builder Matches; Index.fuzzyFind(Req, [&](const Symbol &Sym) { if (Sym.Name != Req.Query) return; if (!Sym.IncludeHeaders.empty()) Matches.insert(Sym); }); auto Syms = std::move(Matches).build(); auto E = FuzzyFindCache.try_emplace(ReqStr, std::move(Syms)); return &E.first->second; }
along with the type of its single function parameter, FuzzyFindRequest
in clang/index/Index.h
:
struct FuzzyFindRequest { /// A query string for the fuzzy find. This is matched against symbols' /// un-qualified identifiers and should not contain qualifiers like "::". std::string Query; /// If this is non-empty, symbols must be in at least one of the scopes /// (e.g. namespaces) excluding nested scopes. For example, if a scope "xyz::" /// is provided, the matched symbols must be defined in namespace xyz but not /// namespace xyz::abc. /// /// The global scope is "", a top level scope is "foo::", etc. std::vector<std::string> Scopes; /// If set to true, allow symbols from any scope. Scopes explicitly listed /// above will be ranked higher. bool AnyScope = false; /// The number of top candidates to return. The index may choose to /// return more than this, e.g. if it doesn't know which candidates are best. llvm::Optional<uint32_t> Limit; /// If set to true, only symbols for completion support will be considered. bool RestrictForCodeCompletion = false; /// Contextually relevant files (e.g. the file we're code-completing in). /// Paths should be absolute. std::vector<std::string> ProximityPaths; /// Preferred types of symbols. These are raw representation of `OpaqueType`. std::vector<std::string> PreferredTypes; bool operator==(const FuzzyFindRequest &Req) const { return std::tie(Query, Scopes, Limit, RestrictForCodeCompletion, ProximityPaths, PreferredTypes) == std::tie(Req.Query, Req.Scopes, Req.Limit, Req.RestrictForCodeCompletion, Req.ProximityPaths, Req.PreferredTypes); } bool operator!=(const FuzzyFindRequest &Req) const { return !(*this == Req); } };
The following commit may be another leg to start from:
[Frontend] Allow attaching an external sema source to compiler instance and extra diags to TypoCorrections
This can be used to append alternative typo corrections to an existing diag. include-fixer can use it to suggest includes to be added.
Differential Revision: https://reviews.llvm.org/D26745
from which we may end up in clang/include/clang/Sema/TypoCorrection.h
, which sounds like a more reasonably used feature by the compiler frontend than that of the (clang extra tool) clangd. E.g.:
/// Gets the "edit distance" of the typo correction from the typo. /// If Normalized is true, scale the distance down by the CharDistanceWeight /// to return the edit distance in terms of single-character edits. unsigned getEditDistance(bool Normalized = true) const { if (CharDistance > MaximumDistance || QualifierDistance > MaximumDistance || CallbackDistance > MaximumDistance) return InvalidDistance; unsigned ED = CharDistance * CharDistanceWeight + QualifierDistance * QualifierDistanceWeight + CallbackDistance * CallbackDistanceWeight; if (ED > MaximumDistance) return InvalidDistance; // Half the CharDistanceWeight is added to ED to simulate rounding since // integer division truncates the value (i.e. round-to-nearest-int instead // of round-to-zero). return Normalized ? NormalizeEditDistance(ED) : ED; }
used in clang/lib/Sema/SemaDecl.cpp
:
// Callback to only accept typo corrections that have a non-zero edit distance. // Also only accept corrections that have the same parent decl. class DifferentNameValidatorCCC final : public CorrectionCandidateCallback { public: DifferentNameValidatorCCC(ASTContext &Context, FunctionDecl *TypoFD, CXXRecordDecl *Parent) : Context(Context), OriginalFD(TypoFD), ExpectedParent(Parent ? Parent->getCanonicalDecl() : nullptr) {} bool ValidateCandidate(const TypoCorrection &candidate) override { if (candidate.getEditDistance() == 0) return false; // ... } // ... };
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With