
Get operator type for CXCursor_BinaryOperator

I'm trying to find an assignment in a C++ source file:

x = 10;

I'm using libclang to parse it and traverse the AST. There is a CXCursor_BinaryOperator cursor kind that represents binary operators. Is there a way to determine whether such a cursor is an assignment or some other binary operator (like + or <= or !=)? If not, how can I determine whether the expression is an assignment?

Thanks in advance.

maverik asked Oct 26 '25 06:10


2 Answers

For Clang 17 or later

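As of Clang 17, one can call clang_getCursorBinaryOperatorKind() on the cursor to get this information. It returns a CXBinaryOperatorKind value, and a plain assignment is reported as CXBinaryOperator_Assign.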

Historical information: The issue was first filed as issue 29138 in 2016. It was fixed in commit 7fbc9de455, which flowed into Clang 17.0.1, which was released on 2023-09-19.

For Clang 16 and earlier

(This was the original answer, and is still applicable in Clang 17 as a demonstration of how to get tokens, although it is not needed anymore for the specific task of getting the operator of a binary expression.)

The answer by @notetau simply searches for any token with text =, but that fails when that token appears somewhere in the expression other than at the top level.
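A minimal, libclang-free sketch of that failure mode, assuming the token spellings have already been extracted into a vector of strings. The helper name hasEqualsToken and the sample expression are hypothetical illustrations, not part of either answer's code:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: mimics the "search every token for =" strategy on
// token spellings that are assumed to have been extracted already.
bool hasEqualsToken(const std::vector<std::string> &tokens)
{
  return std::find(tokens.begin(), tokens.end(), std::string("=")) !=
         tokens.end();
}

// Token spellings for `a[i = 0] + b`: the top-level operator is +, and the
// only = is nested inside the left-hand operand.
const std::vector<std::string> nestedAssignment =
  {"a", "[", "i", "=", "0", "]", "+", "b"};
```

Here hasEqualsToken(nestedAssignment) is true even though the expression as a whole is an addition, which is the false positive described above.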

Here is a version that gets the text of the first token after all tokens in the left-hand operand:

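#include <clang-c/Index.h>  // libclang C API
#include <cassert>          // assert
#include <string>           // std::string

// Get the first child of 'cxNode'.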
static CXCursor getFirstChild(CXCursor cxNode)
{
  struct Result {
    CXCursor child;
    bool found;
  } result;
  result.found = false;

  clang_visitChildren(cxNode,
    [](CXCursor c, CXCursor parent, CXClientData client_data) {
      Result *r = (Result*)client_data;
      r->found = true;
      r->child = c;
      return CXChildVisit_Break;
    },
    &result);

  assert(result.found);
  return result.child;
}

// Get the operator of binary expression 'cxExpr' as a string.
std::string getBinaryOperator(CXTranslationUnit cxTU, CXCursor cxExpr)
{
  // Get tokens in 'cxExpr'.
  CXToken *exprTokens;
  unsigned numExprTokens;
  clang_tokenize(cxTU, clang_getCursorExtent(cxExpr),
    &exprTokens, &numExprTokens);

  // Get tokens in its left-hand side.
  CXCursor cxLHS = getFirstChild(cxExpr);
  CXToken *lhsTokens;
  unsigned numLHSTokens;
  clang_tokenize(cxTU, clang_getCursorExtent(cxLHS),
    &lhsTokens, &numLHSTokens);

  // Get the spelling of the first token not in the LHS.
  assert(numLHSTokens < numExprTokens);
  CXString cxString = clang_getTokenSpelling(cxTU,
    exprTokens[numLHSTokens]);
  std::string ret(clang_getCString(cxString));

  // Clean up.
  clang_disposeString(cxString);
  clang_disposeTokens(cxTU, lhsTokens, numLHSTokens);
  clang_disposeTokens(cxTU, exprTokens, numExprTokens);

  return ret;
}
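The indexing step in getBinaryOperator can be tried in isolation. The sketch below assumes the token spellings are already available as plain strings. The name operatorAfterLhs is hypothetical, and the token lists in the usage note are hand-written rather than produced by clang_tokenize:

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper mirroring the indexing in getBinaryOperator: the
// operator is the first token of the whole expression that lies past the
// tokens of the left-hand operand.
std::string operatorAfterLhs(const std::vector<std::string> &exprTokens,
                             std::size_t numLhsTokens)
{
  // Same precondition as the assert in getBinaryOperator, reported as an
  // exception here so that the sketch stays safe in release builds.
  if (numLhsTokens >= exprTokens.size()) {
    throw std::out_of_range("left-hand side covers the whole expression");
  }
  return exprTokens[numLhsTokens];
}
```

For `a[i = 0] + b`, the left-hand operand `a[i = 0]` spans six tokens, so operatorAfterLhs(tokens, 6) picks the + at the top level and ignores the nested =. For `x = 10`, the left-hand operand is one token, so index 1 picks the =.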

However, even this fails in some cases where macros are involved, for example:

#define MINUS -
int f(int a, int b)
{
  return a MINUS b;
}

For this code, getBinaryOperator will return MINUS, and I haven't found any solution to that problem other than to do preprocessing first, as a separate step, and then pass the preprocessed output to clang for further analysis.

Scott McPeak answered Oct 28 '25 21:10


The following code may work for you:

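  // Needs <clang-c/Index.h> and <cstring> (for strcmp); 'tu' is the
  // translation unit and 'cursor' is the binary-operator cursor.
  CXToken *tokens;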
  unsigned numTokens;
  CXSourceRange range = clang_getCursorExtent(cursor);
  clang_tokenize(tu, range, &tokens, &numTokens);
  for(unsigned i=0; i<numTokens; i++) {
    CXString s = clang_getTokenSpelling(tu, tokens[i]);
    const char* str = clang_getCString(s);
    if( strcmp(str, "=") == 0 ) {
      /* found */
    }
    clang_disposeString(s);
  }
  clang_disposeTokens(tu, tokens, numTokens);
notetau answered Oct 28 '25 21:10