Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Create an AST from bash in C

I am trying to build a mini Bash interpreter in C without the help of any fancy library (i.e from scratch). I have to manage simple operators like '<', '|', '<<', '>>', '>'.
I am told to build an AST of the input to facilitate the execution process. The thing is that I don't understand how I am supposed to build one.

So far I made a linked-list of my input turned into tokens but can't figure how to make an AST out of it:

typedef struct       s_token
{
    enum e_TokenType type;
    char             *lexeme;
    struct s_token   *prev;
    struct s_token   *next;
}                    t_token;

Could you explain me how to turn this into a functionnal AST ? For example with this input:

cat << EOF > file | wc -c | tr -d " " > file2

I imagine the AST would look like:

enter image description here I have seen other posts describing how to but they were in JS/Python (I am not familiar with these languages) and using libraries for the part that interests me.

like image 850
tblaudez Avatar asked Oct 05 '18 13:10

tblaudez


1 Answers

I would create an AST more like this:

                         __ PIPELINE__
                     ___/              \____
                    /                       \
            COMMAND                    __ PIPELINE _
          /        \                  /             \
    ARGUMENTS   REDIRECTIONS      COMMAND         _ COMMAND __
        |          |     |           |           /            \
       cat        <<     >       ARGUMENTS    ARGUMENTS   REDIRECTIONS
                   |     |         |   |      |   |   |        |
                 "..."  file      wc  -c      tr  -d " "       >
                                                               |
                                                             file2

Notable differences from yours:

  • Commands are composed of a list of arguments and a list of redirections.
  • Redirections have a type (<, >, >>, etc.) and a either a string or a file as source/target.
  • The EOF heredoc is converted to a plain string node ("..." above). A here document is ultimately just syntactical sugar for a string. If I were doing it I'd handle EOF at the lexing stage and turn it into a simple string for the parser to handle. EOF wouldn't show up in the AST.

It's a rough sketch, but the idea is to represent the components in a more logical manner. The way you drew yours, > and << look like binary operators with the other pieces as operands. That's how you'd parse a C program, but not a shell command.

like image 185
John Kugelman Avatar answered Oct 22 '22 18:10

John Kugelman