
Clang: What does an AST (abstract syntax tree) look like?

Hi, I am new to compiler development and am wondering what an AST looks like. I have a small piece of code, and I use Clang to generate the AST, but I don't get much information out of it. As far as I can tell, the syntax tree is exactly the same as the source, except for one struct that is added to almost every sample I test with.

Source:

class A {
public:
  int *a, *b, *c;
  int i;
  void sum() {
    a = new int[5];
    b = new int[5];
    c = new int[5];
    for (i = 0; i < 5; i++) {
      a[i] = i;
      b[i] = i;
    }
    for (i = 0; i < 5; i++) {
      c[i] = a[i] + b[i];
    }
    delete[] a;   delete[] b;   delete[] c;
  }
};

class B : public A {
};

int main() {
  B bclass; 
  bclass.sum();
  return 0;
} 

Command to generate AST:

clang++ -cc1 -ast-print ~/sum.cpp

AST output:

struct __va_list_tag {
    unsigned int gp_offset;
    unsigned int fp_offset;
    void *overflow_arg_area;
    void *reg_save_area;
};
typedef struct __va_list_tag __va_list_tag;
class A {
public:
    int *a;
    int *b;
    int *c;
    int i;
    void sum()     {
        this->a = new int [5];
        this->b = new int [5];
        this->c = new int [5];
        for (this->i = 0; this->i < 5; this->i++) {
            this->a[this->i] = this->i;
            this->b[this->i] = this->i;
        }
        for (this->i = 0; this->i < 5; this->i++) {
            this->c[this->i] = this->a[this->i] + this->b[this->i];
        }
        delete [] this->a;
        delete [] this->b;
        delete [] this->c;
    }


};
class B : public A {
};
int main() {
    B bclass;
    bclass.sum();
    return 0;
}

Thanks

Sriram Murali asked Oct 28 '11 21:10




2 Answers

There is some confusion here between the available options:

  • -ast-print will pretty-print the current AST, that is, it will render the code it understood as closely as possible to what it parsed (while making some things explicit, such as the implicit this->)
  • -ast-dump will generate a Lisp-like representation of the current AST

The pretty-printer can be useful for checking that the AST is lossless (i.e., that it preserved the const-ness of a given expression, etc.), but it is not really meant for compiler development.

If you want to hack on the compiler, you need -ast-dump, which generates output that maps directly to the in-memory representation of the code that was parsed.

Matthieu M. answered Sep 23 '22 22:09


The AST is a linked structure in memory ("tree" does not do justice to the complexity of the thing, but it's the name people use). What -ast-print produces is a textual representation of the AST. Since the person who set the option is already familiar with C/C++-like syntax, the AST is printed in a representation that follows that syntax. This is a design choice, not a happy coincidence.

If you want to see what the AST looks like when it's not printed on purpose in a familiar syntax, you could for instance look at GIMPLE, GCC's internal representation.

Pascal Cuoq answered Sep 25 '22 22:09