Question

I'm trying to find an assignment in a C++ source file:

x = 10;

I'm using libclang to parse it and traverse the AST. There is a CXCursor_BinaryOperator cursor kind that represents binary operators. Is there a way to determine whether a given one is an assignment or some other binary operator (like + or <= or !=)? If not, how can I determine whether the expression is an assignment?

Thanks in advance.

Solution

The following code may work for you. It tokenizes the source range covered by the cursor and checks whether any token is spelled =:

  // Assumes 'tu' (CXTranslationUnit) and 'cursor' (CXCursor) are in scope,
  // and that <clang-c/Index.h> and <cstring> are included.
  CXToken *tokens;
  unsigned numTokens;
  // Tokenize the source range covered by the cursor.
  CXSourceRange range = clang_getCursorExtent(cursor);
  clang_tokenize(tu, range, &tokens, &numTokens);
  for(unsigned i=0; i<numTokens; i++) {
    CXString s = clang_getTokenSpelling(tu, tokens[i]);
    const char* str = clang_getCString(s);
    if( strcmp(str, "=") == 0 ) {
      /* found */
    }
    clang_disposeString(s);
  }
  clang_disposeTokens(tu, tokens, numTokens);

OTHER TIPS

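The answer by @notetau simply searches for any token with the text =, but that gives a false positive when such a token appears somewhere in the expression other than at the top level. For example, in a[i = 0] + b the top-level operator is +, yet the expression still contains an = token.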

Here is a version that gets the text of the first token after all tokens in the left-hand operand:

#include <clang-c/Index.h>
#include <cassert>
#include <string>

// Get the first child of 'cxNode'.
static CXCursor getFirstChild(CXCursor cxNode)
{
  struct Result {
    CXCursor child;
    bool found;
  } result;
  result.found = false;

  clang_visitChildren(cxNode,
    [](CXCursor c, CXCursor parent, CXClientData client_data) {
      Result *r = (Result*)client_data;
      r->found = true;
      r->child = c;
      return CXChildVisit_Break;
    },
    &result);

  assert(result.found);
  return result.child;
}

// Get the operator of binary expression 'cxExpr' as a string.
std::string getBinaryOperator(CXTranslationUnit cxTU, CXCursor cxExpr)
{
  // Get tokens in 'cxExpr'.
  CXToken *exprTokens;
  unsigned numExprTokens;
  clang_tokenize(cxTU, clang_getCursorExtent(cxExpr),
    &exprTokens, &numExprTokens);

  // Get tokens in its left-hand side.
  CXCursor cxLHS = getFirstChild(cxExpr);
  CXToken *lhsTokens;
  unsigned numLHSTokens;
  clang_tokenize(cxTU, clang_getCursorExtent(cxLHS),
    &lhsTokens, &numLHSTokens);

  // Get the spelling of the first token not in the LHS.
  assert(numLHSTokens < numExprTokens);
  CXString cxString = clang_getTokenSpelling(cxTU,
    exprTokens[numLHSTokens]);
  std::string ret(clang_getCString(cxString));

  // Clean up.
  clang_disposeString(cxString);
  clang_disposeTokens(cxTU, lhsTokens, numLHSTokens);
  clang_disposeTokens(cxTU, exprTokens, numExprTokens);

  return ret;
}
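
To make the difference between the two approaches concrete, here is a small model that does not use libclang at all. Token lists are plain vectors of strings, and containsToken and operatorAfterLhs are hypothetical stand-ins (not libclang functions) for "search for any = token" and "take the first token after the left-hand operand". Treat it as a sketch of the idea, not as a replacement for the code above.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

using Tokens = std::vector<std::string>;

// Naive approach: is there a token with this spelling anywhere
// in the expression, at any nesting depth?
bool containsToken(const Tokens &expr, const std::string &text)
{
  return std::find(expr.begin(), expr.end(), text) != expr.end();
}

// Position-based approach: the operator is the first token that
// follows the tokens of the left-hand operand.
// Returns an empty string if there is no such token.
std::string operatorAfterLhs(const Tokens &expr, std::size_t numLhsTokens)
{
  return numLhsTokens < expr.size() ? expr[numLhsTokens] : std::string();
}
```

For a[i = 0] + b the token list is a, [, i, =, 0, ], +, b, and the left-hand operand a[i = 0] accounts for the first six tokens. containsToken(expr, "=") is true even though this is not an assignment, while operatorAfterLhs(expr, 6) yields +.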

However, even this fails in some cases where macros are involved, for example:

#define MINUS -
int f(int a, int b)
{
  return a MINUS b;
}

For this code, getBinaryOperator will return MINUS, and I haven't found any solution to that problem other than to do preprocessing first, as a separate step, and then pass the preprocessed output to clang for further analysis.
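
This does not solve the macro problem, but the failure can at least be detected instead of silently misclassified. Below is a minimal sketch that checks the returned spelling against the built-in binary operator spellings. OperatorClass and classifyOperatorSpelling are names I made up for illustration. The list deliberately leaves out compound assignments such as +=, since libclang reports those with a separate cursor kind (CXCursor_CompoundAssignOperator), and it also leaves out alternative tokens such as "and" or "bitor", which would therefore be reported as NotAnOperator.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical classification of the spelling returned by getBinaryOperator.
enum class OperatorClass { Assignment, OtherOperator, NotAnOperator };

OperatorClass classifyOperatorSpelling(const std::string &spelling)
{
  // Spellings of the non-assignment binary operators.
  static const std::unordered_set<std::string> binaryOperators = {
    "+", "-", "*", "/", "%", "<<", ">>",
    "<", ">", "<=", ">=", "==", "!=", "<=>",
    "&", "^", "|", "&&", "||", ",", ".*", "->*"
  };

  if (spelling == "=")
    return OperatorClass::Assignment;
  if (binaryOperators.count(spelling) != 0)
    return OperatorClass::OtherOperator;

  // Anything else (for example a macro name such as MINUS) is not an
  // operator spelling, so the caller knows the lookup did not work.
  return OperatorClass::NotAnOperator;
}
```

With the example above, classifyOperatorSpelling("MINUS") gives OperatorClass::NotAnOperator, so the caller can fall back to preprocessing first or skip the expression.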

Licensed under: CC-BY-SA with attribution