
What language should I use to write a text parser and display the results in a user friendly manner? [closed]

My company's proprietary software generates a log file that is much easier to use if it is parsed. The log parser we all use was written by another employee as a side project, and it has horrible performance.

These log files can grow to tens of megabytes very quickly, and the parser we currently use has issues if a log file is bigger than 1 megabyte.

So, I want to write a program that can parse this massive amount of text in the shortest amount of time possible. We use Windows exclusively, so running on Windows is a must. Our current implementation runs on a local web server, and I'm convinced that running it as an application would have to be faster.

All suggestions will be helpful. Thanks.

EDIT: My ultimate goal is to parse the text and display it in a much more user-friendly manner, with colors and such. Can you do this with Perl or Python? I know you can do this with Java and C++. So, it will function like Notepad: you open a log file, but the screen shows the user-friendly format instead of the raw file.

EDIT: So, I chose the best answer, and that was to choose a language that can best display what I'm going for, and then write the parser in that. Also, using ANTLR will probably make this process much easier. I changed the original question, since I guess I didn't ask what I was really looking for. Thanks everyone!

HenryAdamsJr asked Mar 25 '10 21:03



2 Answers

Hmmm, "go with what you know" was a good answer. Perl was designed for this sort of thing (imo it is well suited to simple parsing, but I'd personally avoid it for complex projects).

If it gets even a little complex, why not use a proper syntax and grammar set-up?

Lex & Yacc (or Flex & Bison) spring to mind, but personally I would always reach for Antlr.

Define various "words" in terms of patterns (syntax) and rules to combine those words (grammar), and Antlr will spit out a program to parse your input. You can have the program in Java, C, C++ and more; you are worried about parse time, so choose a compiled language, of course.
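To make the "words plus rules" idea concrete, here is a minimal hand-written Python sketch. It is not Antlr and not what Antlr generates; it only mimics the split between token patterns and a rule that combines them. The log layout (date, time, level, message) and every name in it (TOKEN_PATTERNS, ENTRY_RULE, parse_line) are assumptions made up for illustration, since the real log format is proprietary.

```python
import re
from typing import List, NamedTuple

# "Words": token kinds defined by patterns. The format is hypothetical,
# e.g. "2010-03-25 21:03:07 ERROR Disk quota exceeded".
TOKEN_PATTERNS = [
    ("DATE", r"\d{4}-\d{2}-\d{2}"),
    ("TIME", r"\d{2}:\d{2}:\d{2}"),
    ("LEVEL", r"DEBUG|INFO|WARN|ERROR"),
    ("SPACE", r"[ \t]+"),
    ("TEXT", r".+"),  # fallback: the rest of the line
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_PATTERNS))

# "Grammar": one rule saying which token kinds make up a log entry, in order.
ENTRY_RULE = ("DATE", "TIME", "LEVEL", "TEXT")


class Token(NamedTuple):
    kind: str
    text: str


class LogEntry(NamedTuple):
    date: str
    time: str
    level: str
    message: str


def tokenize(line: str) -> List[Token]:
    """Split one line into tokens, dropping whitespace between them."""
    tokens = []
    pos = 0
    while pos < len(line):
        m = MASTER.match(line, pos)
        if m is None:
            raise ValueError(f"unexpected character at column {pos}: {line[pos]!r}")
        if m.lastgroup != "SPACE":
            tokens.append(Token(m.lastgroup, m.group()))
        pos = m.end()
    return tokens


def parse_line(line: str) -> LogEntry:
    """Apply ENTRY_RULE to one line; fail loudly instead of guessing."""
    tokens = tokenize(line.rstrip("\r\n"))
    kinds = tuple(t.kind for t in tokens)
    if kinds != ENTRY_RULE:
        raise ValueError(f"expected {ENTRY_RULE}, got {kinds}")
    return LogEntry(*(t.text for t in tokens))


if __name__ == "__main__":
    print(parse_line("2010-03-25 21:03:07 ERROR Disk quota exceeded\n"))
```

Calling parse_line on each line of an open file object reads the log one line at a time rather than loading it whole, which may help with files in the tens of megabytes. A real grammar tool earns its keep once entries span several lines or nest.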

I personally find it tedious to hand-craft parsers, and even more tedious to debug them, but AntlrWorks is a lovely IDE which really makes it a piece of cake ...

That bit at the bottom is defining a grammar rule.

If you mess up your grammar rules, you will be informed. This is not the case with hand-crafted parsers, where you just scratch your body part and wonder about the "strange results"...

Check it out. Even if you think your project is trivial now, it may well grow. And if you have any interest in parsing, you do owe it to yourself to at least be familiar with lex/yacc, but especially Antlr(Works).

Mawg says reinstate Monica answered Oct 15 '22 16:10


You should use the language that YOU know... Unless you have so much time available to complete the project that you can also spend the time learning a new language.

David answered Oct 15 '22 15:10