I'm trying to find an assignment in a C++ source file:
x = 10;
I'm using libclang to parse it and traverse the AST. There is a CXCursor_BinaryOperator cursor kind that represents binary operators. Is there a way to determine whether it is an assignment or some other binary operator (like + or <= or !=)? If not, how can I determine whether the expression is an assignment?
Thanks in advance.
As of Clang 17, one can use clang_getCursorBinaryOperatorKind() to get this information.
Historical information: The issue was first filed as issue 29138 in 2016. It was fixed in commit 7fbc9de455, which flowed into Clang 17.0.1, which was released on 2023-09-09.
(This was the original answer, and is still applicable in Clang 17 as a demonstration of how to get tokens, although it is not needed anymore for the specific task of getting the operator of a binary expression.)
The answer by @notetau simply searches for any token with text =, but that fails when that token appears somewhere in the expression other than at the top level.
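To make the failure concrete, here is a small libclang-free sketch that works on plain token strings. The helper names containsEqualsToken and operatorAfterLhs are invented for this illustration. For the expression a[i = 0] + b, a search for any = token reports a match even though the top-level operator is +, whereas taking the first token after the left-hand operand gives the right answer.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

using Tokens = std::vector<std::string>;

// Naive check: true if any token in the expression is "=",
// regardless of where in the expression it appears.
bool containsEqualsToken(const Tokens &expr)
{
    return std::find(expr.begin(), expr.end(), "=") != expr.end();
}

// Positional check: the operator is the first token that follows
// the tokens of the left-hand operand. Returns "" if there is none.
std::string operatorAfterLhs(const Tokens &expr, std::size_t numLhsTokens)
{
    return numLhsTokens < expr.size() ? expr[numLhsTokens] : std::string();
}
```

With the tokens {"a", "[", "i", "=", "0", "]", "+", "b"} and a six-token left-hand operand (a[i = 0]), containsEqualsToken returns true while operatorAfterLhs returns "+".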
Here is a version that gets the text of the first token after all tokens in the left-hand operand:
#include <clang-c/Index.h>
#include <cassert>
#include <string>

// Get the first child of 'cxNode'.
static CXCursor getFirstChild(CXCursor cxNode)
{
    struct Result {
        CXCursor child;
        bool found;
    } result;
    result.found = false;

    // Record the first child visited, then stop the traversal.
    clang_visitChildren(cxNode,
        [](CXCursor c, CXCursor parent, CXClientData client_data) {
            Result *r = static_cast<Result *>(client_data);
            r->found = true;
            r->child = c;
            return CXChildVisit_Break;
        },
        &result);

    assert(result.found);
    return result.child;
}

// Get the operator of binary expression 'cxExpr' as a string.
std::string getBinaryOperator(CXTranslationUnit cxTU, CXCursor cxExpr)
{
    // Get tokens in 'cxExpr'.
    CXToken *exprTokens;
    unsigned numExprTokens;
    clang_tokenize(cxTU, clang_getCursorExtent(cxExpr),
                   &exprTokens, &numExprTokens);

    // Get tokens in its left-hand side.
    CXCursor cxLHS = getFirstChild(cxExpr);
    CXToken *lhsTokens;
    unsigned numLHSTokens;
    clang_tokenize(cxTU, clang_getCursorExtent(cxLHS),
                   &lhsTokens, &numLHSTokens);

    // Get the spelling of the first token not in the LHS.
    assert(numLHSTokens < numExprTokens);
    CXString cxString = clang_getTokenSpelling(cxTU,
                                               exprTokens[numLHSTokens]);
    std::string ret(clang_getCString(cxString));

    // Clean up.
    clang_disposeString(cxString);
    clang_disposeTokens(cxTU, lhsTokens, numLHSTokens);
    clang_disposeTokens(cxTU, exprTokens, numExprTokens);

    return ret;
}
However, even this fails in some cases where macros are involved, for example:
#define MINUS -
int f(int a, int b)
{
    return a MINUS b;
}
For this code, getBinaryOperator will return MINUS, and I haven't found any solution to that problem other than to do preprocessing first, as a separate step, and then pass the preprocessed output to clang for further analysis.
The following code may work for you:
// 'tu' is the CXTranslationUnit and 'cursor' the CXCursor_BinaryOperator
// cursor being inspected; strcmp requires <cstring>.
CXToken *tokens;
unsigned numTokens;
CXSourceRange range = clang_getCursorExtent(cursor);
clang_tokenize(tu, range, &tokens, &numTokens);
for (unsigned i = 0; i < numTokens; i++) {
    CXString s = clang_getTokenSpelling(tu, tokens[i]);
    const char *str = clang_getCString(s);
    if (strcmp(str, "=") == 0) {
        /* found */
    }
    clang_disposeString(s);
}
clang_disposeTokens(tu, tokens, numTokens);