 

Parsing really big log files (>1 GB, <5 GB)

I need to parse very large log files (>1 GB, <5 GB). Specifically, I need to split the data into records so I can store them in a DB. The log file is one continuous stream with no line breaks, like:

TIMESTAMP=20090101000000;PARAM1=Value11;PARAM2=Value21;PARAM3=Value31;TIMESTAMP=20090101000100;PARAM1=Value11;PARAM2=Value21;PARAM3=Value31;TIMESTAMP=20090101000152;PARAM1=Value11;PARAM2=Value21;PARAM3=Value31;...

I need to split this into rows of a table:

TIMESTAMP | PARAM1 | PARAM2 | PARAM3

The process needs to be as fast as possible. I'm considering using Perl, but any suggestions using C/C++ would be really welcome. Any ideas?

Best regards,

Arthur

asked Nov 27 '22 by casals


1 Answer

Write a prototype in Perl and compare its performance against how fast you can read the data off the storage medium. My guess is that you'll be I/O bound, which means that rewriting it in C won't offer a performance boost.
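For reference, a minimal prototype sketch might look like the following. Since the file has no line breaks, it sets Perl's input record separator ($/) to ';' so each read returns one KEY=VALUE pair and the file is streamed rather than slurped. The field names come from the sample in the question; the assumption that each TIMESTAMP starts a new record, and the emit_row sub standing in for the DB insert, are both mine.

#!/usr/bin/perl
use strict;
use warnings;

my $file = shift @ARGV or die "usage: $0 logfile\n";
open my $fh, '<', $file or die "cannot open $file: $!\n";

# One KEY=VALUE pair per read: with no newlines in the file,
# the semicolon is the natural input record separator.
local $/ = ';';

my %row;
while (my $pair = <$fh>) {
    chomp $pair;                             # strip the trailing ';'
    my ($key, $value) = split /=/, $pair, 2;
    next unless defined $value;

    # A fresh TIMESTAMP starts the next record, so flush the
    # one accumulated so far.
    if ($key eq 'TIMESTAMP' && %row) {
        emit_row(\%row);
        %row = ();
    }
    $row{$key} = $value;
}
emit_row(\%row) if %row;                     # flush the last record
close $fh;

# Stand-in for the real DB insert; printing tab-separated rows
# keeps the prototype easy to time.
sub emit_row {
    my ($row) = @_;
    print join("\t",
        map { defined $row->{$_} ? $row->{$_} : '' }
            qw(TIMESTAMP PARAM1 PARAM2 PARAM3)), "\n";
}

Timing this against a plain read of the same file (e.g. time cat logfile > /dev/null) gives a quick answer to whether the parsing or the disk is the bottleneck. If the two numbers are close, the job is I/O bound and a C rewrite buys little; the bigger win is usually bulk-loading the output into the DB instead of inserting row by row.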

answered Dec 05 '22 by Dave