
Most effective way to parse C-like definition strings?

I have a set of function definitions written in a C-like language with some additional keywords that can precede certain arguments (in the same way as "unsigned" or "register", for example). I need to analyze these lines, along with some function stubs, and generate actual C code from them.

  • Is it correct that Flex/Yacc is the most appropriate way to do this?

  • Will it be slower than writing a shell or Python script using regexps (which I suppose may become a big pain if the number of additional keywords grows and their effects differ considerably), given that I have zero experience with analysers/parsers (though I know how LALR does its job)?

  • Are there any good materials on Lex/Yacc that cover similar problems? All papers I could find use the same primitive example of a "toy" calculator.

Any help will be appreciated.

asked Apr 27 '09 06:04 by vovick


1 Answer

ANTLR is commonly used (as are Lex/Yacc).

ANTLR, ANother Tool for Language Recognition, is a language tool that provides a framework for constructing recognizers, interpreters, compilers, and translators from grammatical descriptions containing actions in a variety of target languages.
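For a sense of what the lexer/parser split looks like on this kind of input, here is a minimal hand-written sketch in Python. It is not ANTLR or Lex/Yacc output, only an illustration of the same two stages. The extra keywords `in` and `out`, the function names, and the supported syntax (plain `type name(args);` prototypes, no function pointers, arrays, or nested parentheses) are all assumptions made for the example, since the question does not name the real keywords.

```python
import re

# Hypothetical extra qualifiers; the question does not name the real ones.
EXTRA_KEYWORDS = {"in", "out"}

# One token is either an identifier/keyword or a single punctuation character.
TOKEN_RE = re.compile(r"\s*(?:([A-Za-z_]\w*)|([*(),;]))")


def tokenize(text):
    """Lexer stage: split a declaration into identifier and punctuation tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected input near offset {pos}")
        tokens.append(m.group(1) or m.group(2))
        pos = m.end()
    return tokens


def parse_declarator(tokens):
    """Parse '[extras] type-words [*...] name' into (extras, type, name)."""
    extras = [t for t in tokens if t in EXTRA_KEYWORDS]
    rest = [t for t in tokens if t not in EXTRA_KEYWORDS]
    if len(rest) < 2:
        raise SyntaxError(f"expected a type and a name, got {tokens!r}")
    return extras, " ".join(rest[:-1]), rest[-1]


def parse_function(text):
    """Parser stage: turn 'type name(args);' into a small dictionary."""
    tokens = tokenize(text)
    if tokens[-2:] != [")", ";"] or "(" not in tokens:
        raise SyntaxError("expected 'type name(args);'")
    open_idx = tokens.index("(")
    _, ret_type, name = parse_declarator(tokens[:open_idx])
    params, current = [], []
    for tok in tokens[open_idx + 1:-2]:
        if tok == ",":
            params.append(parse_declarator(current))
            current = []
        else:
            current.append(tok)
    # A lone 'void' parameter list means "no parameters".
    if current and current != ["void"]:
        params.append(parse_declarator(current))
    return {"name": name, "returns": ret_type, "params": params}


def emit_c(func):
    """Generate a plain C prototype with the extra keywords stripped."""
    def fmt(ctype, name):
        sep = "" if ctype.endswith("*") else " "
        return f"{ctype}{sep}{name}"

    args = ", ".join(fmt(t, n) for _, t, n in func["params"]) or "void"
    return f"{fmt(func['returns'], func['name'])}({args});"


func = parse_function("int copy(in unsigned int n, out char *dst);")
print(emit_c(func))
```

Because the parser keeps each parameter's extra keywords in `func["params"]`, a code generator could branch on them instead of discarding them as `emit_c` does here. Once the grammar grows beyond this shape, a generated parser is likely easier to maintain than extending the hand-written one.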

answered Nov 15 '22 08:11 by Mitch Wheat
