
Parsing structured file in Ruby

Tags:

ruby

I want to parse a large log file (about 500 MB). If this isn't the right tool for the job, please let me know.

The log file's contents are structured like this. Each section can have extra key-value pairs:

requestID: saldksadk
time: 92389389
action: foobarr
----------------------
requestID: 2393029
time: 92389389
action: helloworld
source: email
----------------------
requestID: skjflkjasf3
time: 92389389
userAgent: mobile browser
----------------------
requestID: gdfgfdsdf
time: 92389389
action: randoms

I was wondering if there is an easy way to handle each section's data in the log. A section can span multiple lines, so I can't just split the string. For example, is there an easy way to do something like this:

for(section in log){
   // handle section contents
}
thunderousNinja asked Feb 24 '26 12:02

2 Answers

Using icktoofay's idea, and by using a custom record separator, I got this:

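require 'yaml'

File.open("path/to/file") do |f|
  # Passing a separator makes each_line yield one whole section at a time.
  f.each_line("\n----------------------\n") do |section|
    # Collapse the trailing row of hyphens into YAML's "---" document marker.
    puts YAML.load(section.sub(/^-{3,}$/, "---")).inspect
  end
end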

The output:

{"requestID"=>"saldksadk", "time"=>92389389, "action"=>"foobarr"}
{"requestID"=>2393029, "time"=>92389389, "action"=>"helloworld", "source"=>"email"}
{"requestID"=>"skjflkjasf3", "time"=>92389389, "userAgent"=>"mobile browser"}
{"requestID"=>"gdfgfdsdf", "time"=>92389389, "action"=>"randoms"}
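Note that in the output above YAML has turned the second requestID (2393029) into an Integer. If every value should stay a String, the same record-separator trick works without YAML. The following is only a rough sketch: `each_section` and `SEPARATOR` are names made up for illustration, and it assumes the separator is always exactly the row of hyphens shown in the sample and that each line has the form `key: value`.

```ruby
require 'stringio'

# Assumption: the separator is always this exact row of hyphens on its own line.
SEPARATOR = "\n----------------------\n"

# Hypothetical helper (not from the answer): yields one Hash per section,
# with every key and value kept as a String.
def each_section(io)
  io.each_line(SEPARATOR) do |record|
    section = {}
    record.chomp(SEPARATOR).each_line do |line|
      # Split on the first colon only, so values may contain colons.
      key, value = line.chomp.split(/:\s*/, 2)
      section[key] = value if value
    end
    yield section unless section.empty?
  end
end

sample = <<~LOG
  requestID: saldksadk
  time: 92389389
  action: foobarr
  ----------------------
  requestID: 2393029
  time: 92389389
  action: helloworld
  source: email
LOG

sections = []
each_section(StringIO.new(sample)) { |section| sections << section }
p sections[1]["requestID"] # => "2393029"
```

For the real file, `File.open("path/to/file") { |f| each_section(f) { |section| ... } }` should work the same way, still reading one section at a time.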
ian answered Feb 26 '26 08:02

That looks like YAML, although it is not exactly YAML. (YAML separates documents with exactly three dashes, no more.) You might try to mangle your document somehow such that lines consisting of only hyphens are collapsed into three hyphens so it is valid YAML. After that, you can feed it into a YAML parser.
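As a rough illustration of that idea, the sketch below collapses every hyphen-only line into `---` and then hands the whole text to `YAML.load_stream`, which returns one object per YAML document. The sample text is copied from the question; the variable names are just for illustration. It holds the entire input in memory, so for a 500 MB file reading section by section is probably the better fit.

```ruby
require 'yaml'

sample = <<~LOG
  requestID: saldksadk
  time: 92389389
  action: foobarr
  ----------------------
  requestID: 2393029
  time: 92389389
  action: helloworld
  source: email
LOG

# Collapse every line made only of three or more hyphens into "---".
normalized = sample.gsub(/^-{3,}$/, '---')

# load_stream parses each "---"-separated document into its own Hash.
docs = YAML.load_stream(normalized)
docs.each { |doc| p doc }
```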

icktoofay answered Feb 26 '26 08:02