
Are there any Haskell techniques for mixed (part structure, part unmodified text) parsing & rewriting?

Example problem: I want to write a Haskell script that will highlight (e.g. with \fbox) the first occurrence of each mathematical symbol in a document. Hopefully, this will help me ensure that I've introduced every symbol before using it.

  • Regexes are inappropriate: they can't tell what is in math mode, they lack the logic to count occurrences, and they can't know that a variable in the next \section is actually a new variable.

  • I also don't want to write a parser for all LaTeX. It seems the probability of mistakes is high, and I really just want to write a script, not a commercial program.

I wrote a mixed parser (one that captures some structure and keeps the rest as plain text) in response to another question here, "How do you use parsec in a greedy fashion?". But my approach was cumbersome. Is there a better, more formal way?
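
To make "mixed" concrete, here is a minimal sketch of the idea using only base (it is not the Parsec code from the linked question). The names `Chunk`, `chunks`, `boxFirst`, and `highlight` are made up for illustration. It assumes math appears only as balanced inline `$...$` spans, treats every bare letter in math as a variable, and does not reset anything at `\section`.

```haskell
import Data.Char (isAlpha)
import Data.List (mapAccumL)

-- | Either text that is passed through untouched, or the body of a $...$ span.
data Chunk = Plain String | Math String
  deriving (Eq, Show)

-- | Split the input on unescaped '$'. Assumption: only balanced inline
-- $...$ math (no $$...$$, \( \), or math environments).
chunks :: String -> [Chunk]
chunks = go False
  where
    go _ "" = []
    go inMath s =
      let (body, rest) = spanUnescaped s
          chunk = if inMath then Math body else Plain body
      in chunk : case rest of
                   ('$' : rest') -> go (not inMath) rest'
                   _             -> []

-- | Take everything up to the next '$' that is not preceded by a backslash.
spanUnescaped :: String -> (String, String)
spanUnescaped ('\\' : c : cs) = let (a, b) = spanUnescaped cs in ('\\' : c : a, b)
spanUnescaped ('$' : cs)      = ("", '$' : cs)
spanUnescaped (c : cs)        = let (a, b) = spanUnescaped cs in (c : a, b)
spanUnescaped ""              = ("", "")

-- | Box each bare letter the first time it is seen. Control sequences such
-- as \alpha or \, are copied through unchanged (a deliberate simplification).
boxFirst :: [Char] -> String -> ([Char], String)
boxFirst seen ('\\' : cs) =
  let (name, rest) = case span isAlpha cs of
        ("", c : r) -> ([c], r)   -- control symbol, e.g. \, or \\
        split       -> split      -- control word, e.g. \alpha
      (seen', out) = boxFirst seen rest
  in (seen', '\\' : name ++ out)
boxFirst seen (c : cs)
  | isAlpha c && c `notElem` seen =
      let (seen', out) = boxFirst (c : seen) cs
      in (seen', "\\fbox{$" ++ [c] ++ "$}" ++ out)
  | otherwise =
      let (seen', out) = boxFirst seen cs
      in (seen', c : out)
boxFirst seen "" = (seen, "")

-- | Thread the set of seen symbols through the chunks and reassemble the text.
highlight :: String -> String
highlight = concat . snd . mapAccumL step [] . chunks
  where
    step seen (Plain t) = (seen, t)
    step seen (Math m)  =
      let (seen', m') = boxFirst seen m in (seen', "$" ++ m' ++ "$")
```

In GHCi, `putStrLn (highlight "Let $x+y$ and $x$ cost \\$5.")` boxes `x` and `y` in the first span only and leaves the plain text, including the escaped dollar sign, exactly as it was. Even this toy version needs hand-written state threading, which is the cumbersome part.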

asked Nov 20 '11 at 01:11 by gatoatigrado


1 Answer

You might want to take a look at the Pandoc library on Hackage for parsing LaTeX. It lets you parse, modify, and pretty-print LaTeX, as well as a number of other formats.
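
As a rough sketch of how that could look (assuming a recent pandoc, 2.8 or later, plus pandoc-types, mtl, containers, and text): read the LaTeX into Pandoc's AST, walk the inline `Math` nodes with a `State` holding the expressions seen so far, and write LaTeX back out. The names `boxFirstMath`, `highlightDoc`, and `highlightLaTeX` are made up for this example, and it keys on the whole text of each inline math span rather than on individual symbols.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import           Control.Monad.State    (State, evalState, gets, modify)
import qualified Data.Set               as Set
import           Data.Text              (Text)
import           Text.Pandoc            (def, handleError, readLaTeX, runIO,
                                         writeLaTeX)
import           Text.Pandoc.Definition
import           Text.Pandoc.Walk       (walkM)

-- | Replace an inline math node with a boxed raw-LaTeX version the first
-- time its exact source text is seen; later occurrences are left alone.
boxFirstMath :: Inline -> State (Set.Set Text) Inline
boxFirstMath inl@(Math InlineMath t) = do
  seen <- gets (Set.member t)
  if seen
    then pure inl
    else do
      modify (Set.insert t)
      pure (RawInline "latex" ("\\fbox{$" <> t <> "$}"))
boxFirstMath inl = pure inl

-- | Pure AST-to-AST pass; the set of seen expressions starts empty.
highlightDoc :: Pandoc -> Pandoc
highlightDoc doc = evalState (walkM boxFirstMath doc) Set.empty

-- | Parse LaTeX, apply the pass, and render LaTeX again with default options.
highlightLaTeX :: Text -> IO Text
highlightLaTeX input =
  handleError =<< runIO (writeLaTeX def . highlightDoc =<< readLaTeX def input)
```

One caveat to check before relying on this: the output is regenerated from Pandoc's AST rather than copied from the input, so formatting is normalised and, as far as I know, the preamble and any commands Pandoc does not understand may not survive the round trip with default options.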

answered Sep 27 '22 at 23:09 by Anupam Jain