How to create a parser for TeX?

Tags:

c#

parsing

tex

I am looking to develop a TeX parser. The problem is that there is no context-free grammar for TeX, and there cannot be one, because it is not a context-free language. I have heard that it is a kind of macro language that builds on itself.

So I need some direction on what kind of grammar this macro language has and how to build something on top of it in C#.

I will write the tokenizer and parser myself, but I need the rules for how macros work in TeX, and these are quite hard to find: everything I come across documents how to use TeX macros, not how they are processed.

Akash Kava asked Sep 28 '10 16:09



2 Answers

TeX as a programming language is perhaps the most complex (non-esoteric) language ever created, with a huge number of "reserved words". You can remap the meaning of every character as it is read by the processor (TeX calls these assignments category codes), and in general do things you don't normally encounter when parsing a language.
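To make the character remapping concrete, here is a minimal sketch of a tokenizer that looks up each character's category code in a mutable table at the moment the character is read. It is written in Python for brevity and should port to C# almost line for line. The numeric codes match TeX's own (0 escape, 1 begin group, 2 end group, 10 space, 11 letter, 12 other), but the function names `default_catcodes`, `catcode` and `tokenize` are made up for this illustration, only six of the sixteen categories are handled, and real TeX does more (for example, it skips spaces after a control word).

```python
# A subset of TeX's sixteen category codes, using TeX's own numbering.
ESCAPE, BEGIN_GROUP, END_GROUP, SPACE, LETTER, OTHER = 0, 1, 2, 10, 11, 12


def default_catcodes():
    """Return a mutable table with a few of plain TeX's initial assignments."""
    return {"\\": ESCAPE, "{": BEGIN_GROUP, "}": END_GROUP, " ": SPACE}


def catcode(table, ch):
    """Look up a character; unlisted ASCII letters are LETTER, the rest OTHER."""
    if ch in table:
        return table[ch]
    return LETTER if (ch.isascii() and ch.isalpha()) else OTHER


def tokenize(text, table):
    """Lazily yield (kind, value) tokens, consulting `table` for every character.

    Because this is a generator, the caller may change `table` between tokens,
    which is roughly what happens when TeX executes a \\catcode assignment.
    """
    i = 0
    while i < len(text):
        ch = text[i]
        code = catcode(table, ch)
        if code == ESCAPE:
            j = i + 1
            # A control word is the longest run of LETTER characters.
            while j < len(text) and catcode(table, text[j]) == LETTER:
                j += 1
            # Otherwise it is a control symbol: exactly one character.
            if j == i + 1 and j < len(text):
                j += 1
            yield ("control", text[i + 1:j])
            i = j
        else:
            yield (code, ch)
            i += 1


if __name__ == "__main__":
    table = default_catcodes()
    print(list(tokenize("\\foo{a}", table)))
    # Swap the roles of "|" and "\": the same input now tokenizes differently.
    table["|"] = ESCAPE
    table["\\"] = OTHER
    print(list(tokenize("|foo \\foo", table)))
```

The point of the sketch is that the token boundaries depend on run-time state, so a tokenizer generated from a fixed grammar cannot reproduce TeX's behaviour in general.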

If you really want to create your own TeX parser you will have to build on the original TeX. The source code is not only available, but it is written as a literate program using Knuth's ingenious WEB tool.

To complicate matters further, you practically always use a macro package with TeX. The default package is Plain and the best known is LaTeX. The macro package contains a non-trivial amount of code that you will have to incorporate to be able to parse the particular "dialect" of TeX you are interested in.

Martin Liversage answered Nov 18 '22 00:11



It depends on how much of TeX you actually want to implement. LaTeX2HTML is a Perl project that converts LaTeX to HTML. There is also MathJax, which converts TeX math to HTML or MathML. If you want to see how some non-TeX programs parse TeX, look at those.
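As an illustration of the "implement only a subset" end of that spectrum, here is a minimal sketch of a parser that assumes the default meanings of `\`, `{` and `}` never change and that no macros are defined or expanded. Under those assumptions the input really is a simple nested structure, and a few lines of recursive descent are enough. This is not how LaTeX2HTML or MathJax work internally; the `TOKEN` pattern and the `parse` function are hypothetical names for this example, written in Python but easy to port to C#.

```python
import re

# Control word, control symbol, brace, or any other single character.
# Assumes fixed category codes; a lone trailing backslash is silently dropped.
TOKEN = re.compile(r"\\[A-Za-z]+|\\.|[{}]|[^\\{}]", re.DOTALL)


def parse(text):
    """Parse a fixed-syntax TeX subset into nested lists.

    Each brace group becomes a Python list; control sequences and ordinary
    characters become strings. No macro expansion is performed.
    """
    tokens = TOKEN.findall(text)
    pos = 0

    def parse_items(depth):
        nonlocal pos
        items = []
        while pos < len(tokens):
            tok = tokens[pos]
            pos += 1
            if tok == "{":
                # Recurse to collect the contents of the group.
                items.append(parse_items(depth + 1))
            elif tok == "}":
                if depth == 0:
                    raise ValueError("unmatched '}'")
                return items
            else:
                items.append(tok)
        if depth > 0:
            raise ValueError("unclosed '{'")
        return items

    return parse_items(0)


if __name__ == "__main__":
    print(parse(r"\frac{a}{b+1}"))
```

A later pass would still have to know how many arguments each control sequence takes (two groups for `\frac`, for instance), and that per-command knowledge is the part a macro package otherwise supplies.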

Matthew Leingang answered Nov 17 '22 23:11
