I recall reading about a parser that you only have to feed a few sample lines for it to learn how to parse some text.
It simply determines the difference between two lines to work out what the variable parts are. I thought it was written in Python, but I'm not sure. Does anyone know what library that was?
You probably mean TemplateMaker. I haven't tried it yet, but it builds on well-researched longest-common-substring algorithms and should therefore work reasonably well. If you are interested in different (more complex) approaches, you can find a lot of material on Google Scholar by searching for "wrapper induction" or "template induction".
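To give a rough idea of how "diff two sample lines to find the variable parts" can work, here is a minimal sketch using only Python's standard `difflib` and `re` modules. It is not TemplateMaker's API or implementation; the function names `learn_template` and `parse` are made up for this illustration. It also assumes that the samples are whitespace-separated and that every variable part is non-empty.

```python
import difflib
import re


def learn_template(sample_a, sample_b):
    """Diff two sample lines token by token.

    Returns a list in which shared tokens are kept as strings and each
    differing region is replaced by None (a "hole").
    Hypothetical helper, not part of any library.
    """
    tokens_a = sample_a.split()
    tokens_b = sample_b.split()
    matcher = difflib.SequenceMatcher(None, tokens_a, tokens_b, autojunk=False)

    template = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag == "equal":
            # Tokens present in both samples are fixed text.
            template.extend(tokens_a[i1:i2])
        elif not template or template[-1] is not None:
            # Anything that differs becomes a single hole.
            template.append(None)
    return template


def parse(template, line):
    """Extract the hole values from a new line, or return None if it does not fit."""
    parts = [r"(.+?)" if tok is None else re.escape(tok) for tok in template]
    match = re.fullmatch(r"\s+".join(parts), line.strip())
    return match.groups() if match else None


template = learn_template(
    "User alice logged in from 10.0.0.1",
    "User bob logged in from 192.168.1.7",
)
print(template)
print(parse(template, "User carol logged in from 172.16.0.9"))
```

With these two samples, the template keeps `User`, `logged`, `in` and `from` as fixed text and turns the user name and the address into holes, so the third line parses to `('carol', '172.16.0.9')`. Two samples are often not enough in practice: if both happen to share a value (for example the same user name), that value is wrongly treated as fixed text, which is one reason the more elaborate approaches generalise over many samples.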