
Ruby data extraction from a text file

Tags:

ruby

I have a relatively big text file with blocks of data layered like this:

ANALYSIS OF X SIGNAL, CASE: 1
TUNE X =  0.2561890123390808

    Line Frequency      Amplitude             Phase             Error         mx  my  ms  p

1 0.2561890123391E+00 0.204316425208E-01 0.164145385871E+03 0.00000000000E+00   1   0   0   0
2 0.2562865535359E+00 0.288712798671E-01 -.161563284233E+03 0.97541196785E-04   1   0   0   0

(each block contains more lines, and the blocks then repeat)

First, I would like to extract the numerical value after TUNE X = from each block and write these values to a text file. Then I would like to extract the Line Frequency and Amplitude values from each row as a pair and write those pairs to a file.

My question is the following: although I could get something more or less working with a simple regexp, I'm not convinced that it's the right way to do it, and I would like some advice or example code showing how to do this efficiently in Ruby.

Cedric H. asked Apr 01 '11 07:04


3 Answers

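Generally (not tested):

# Use false/true for the flag: in Ruby, 0 is truthy, so "if toggle" would always pass.
toggle = false
File.foreach("file") do |line|
    if line[/TUNE/]
        # Everything after the first "=" is the tune value.
        puts line.split("=", 2)[-1].strip
    end
    if line[/Line Frequency/]
        toggle = true
        next
    end
    # A new block starts here, so stop treating lines as table rows.
    toggle = false if line[/ANALYSIS/]
    if toggle
        a = line.split
        # Skip blank lines; fields 2 and 3 are frequency and amplitude.
        puts "#{a[1]} #{a[2]}" if a.size >= 3
    end
end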

This goes through the file line by line. For each line matching /TUNE/, it splits on "=" and takes the last item. When a line matches /Line Frequency/, it sets the toggle flag, which signifies that the following lines contain the data you want. Since the frequency and amplitude are fields 2 and 3, it splits each of those lines on whitespace and takes the respective positions. That is the general idea. The toggle flag should be reset at the start of the next block, using a pattern (e.g. SIGNAL, CASE or ANALYSIS).
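The same toggle idea can also group the rows per block instead of printing them as they go by. This is only a minimal sketch, assuming every block follows the layout shown in the question; parse_blocks and the sample variable are hypothetical names made up for illustration, not part of any library:

```ruby
require 'stringio'

# Hypothetical helper: returns one hash per block, of the form
# { tune: "<value>", rows: [["<frequency>", "<amplitude>"], ...] }.
def parse_blocks(io)
  blocks = []
  in_table = false
  io.each_line do |line|
    case line
    when /^ANALYSIS/
      # A new block begins: start a fresh record and leave table mode.
      blocks << { tune: nil, rows: [] }
      in_table = false
    when /^TUNE.*?=\s*(\S+)/
      blocks << { tune: nil, rows: [] } if blocks.empty?
      blocks.last[:tune] = $1
    when /Line\s+Frequency/
      blocks << { tune: nil, rows: [] } if blocks.empty?
      in_table = true
    else
      fields = line.split
      # Only rows inside the table with enough columns are kept.
      blocks.last[:rows] << fields[1, 2] if in_table && fields.size >= 3
    end
  end
  blocks
end

# Sample data copied from the question (one block only).
sample = <<~TEXT
  ANALYSIS OF X SIGNAL, CASE: 1
  TUNE X =  0.2561890123390808

      Line Frequency      Amplitude             Phase             Error         mx  my  ms  p

  1 0.2561890123391E+00 0.204316425208E-01 0.164145385871E+03 0.00000000000E+00   1   0   0   0
  2 0.2562865535359E+00 0.288712798671E-01 -.161563284233E+03 0.97541196785E-04   1   0   0   0
TEXT

parse_blocks(StringIO.new(sample)).each do |block|
  puts "tune #{block[:tune]}: #{block[:rows].length} rows"
end
```

Keeping the values as strings avoids any rounding; call to_f on them only if you need to do arithmetic.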

kurumi answered Nov 15 '22 05:11


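# Three separate arrays; chaining "= []" would make all three names share one array.
@tune_x, @frequency, @amplitude = [], [], []
File.foreach("data.dat") do |line|
  # TUNE line: capture the value after "TUNE X =".
  if (m = line.match(/TUNE X\s*=\s*(\d*\.\d+)/))
    @tune_x << m[1]
    next
  end
  # Data row: scan returns every number in exponent notation, in column order.
  data_scan = line.scan(/-?\d*\.\d+E[-+]\d+/)
  next if data_scan.size < 2   # header and blank lines yield no matches
  @frequency << data_scan[0]
  @amplitude << data_scan[1]
end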
fl00r answered Nov 15 '22 04:11


There are lots of ways to do it. This is a simple first pass at it:

text = 'ANALYSIS OF X SIGNAL, CASE: 1
TUNE X =  0.2561890123390808

    Line Frequency      Amplitude             Phase             Error         mx  my  ms  p

1 0.2561890123391E+00 0.204316425208E-01 0.164145385871E+03 0.00000000000E+00   1   0   0   0
2 0.2562865535359E+00 0.288712798671E-01 -.161563284233E+03 0.97541196785E-04   1   0   0   0

ANALYSIS OF X SIGNAL, CASE: 1
TUNE X =  1.2561890123390808

    Line Frequency      Amplitude             Phase             Error         mx  my  ms  p

1 1.2561890123391E+00 0.204316425208E-01 0.164145385871E+03 0.00000000000E+00   1   0   0   0
2 1.2562865535359E+00 0.288712798671E-01 -.161563284233E+03 0.97541196785E-04   1   0   0   0

ANALYSIS OF X SIGNAL, CASE: 1
TUNE X =  2.2561890123390808

    Line Frequency      Amplitude             Phase             Error         mx  my  ms  p

1 2.2561890123391E+00 0.204316425208E-01 0.164145385871E+03 0.00000000000E+00   1   0   0   0
2 2.2562865535359E+00 0.288712798671E-01 -.161563284233E+03 0.97541196785E-04   1   0   0   0
'

require 'stringio'
pretend_file = StringIO.new(text, 'r')

That gives us a StringIO object we can pretend is a file. We can read from it by lines.

I changed the numbers a bit just to make it easier to see that they are being captured in the output.

pretend_file.each_line do |li|
  case

  when li =~ /^TUNE.+?=\s+(.+)/
    print $1.strip, "\n"

  when li =~ /^\d+\s+(\S+)\s+(\S+)/
    print $1, ' ', $2, "\n"

  end
end

For real use you'd want to change the print statements to a file handle: fileh.print
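As a rough sketch of that last step, the same two patterns can write to two separate outputs, one for the tune values and one for the frequency/amplitude pairs, as the question asks. The helper name extract and the file names in the comment (data.dat, tunes.txt, pairs.txt) are assumptions for illustration only:

```ruby
require 'stringio'

# Hypothetical helper: reads lines from `input`, writes tune values to
# `tune_out` and "frequency amplitude" pairs to `pair_out`. The three
# arguments only need to behave like IO objects.
def extract(input, tune_out, pair_out)
  input.each_line do |li|
    case li
    when /^TUNE.+?=\s+(\S+)/
      tune_out.puts $1
    when /^\d+\s+(\S+)\s+(\S+)/
      pair_out.puts "#{$1} #{$2}"
    end
  end
end

# With real files this could look like (file names are assumptions):
#   File.open('data.dat') do |input|
#     File.open('tunes.txt', 'w') do |tunes|
#       File.open('pairs.txt', 'w') { |pairs| extract(input, tunes, pairs) }
#     end
#   end

# Demo with in-memory streams so the sketch runs without touching disk.
input = StringIO.new(
  "TUNE X =  0.2561890123390808\n\n" \
  "1 0.2561890123391E+00 0.204316425208E-01 0.164145385871E+03\n"
)
tunes = StringIO.new
pairs = StringIO.new
extract(input, tunes, pairs)
print tunes.string
print pairs.string
```

The block form of File.open closes each file automatically, even if an exception is raised part-way through.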

The output looks like:

# >> 0.2561890123390808
# >> 0.2561890123391E+00 0.204316425208E-01
# >> 0.2562865535359E+00 0.288712798671E-01
# >> 1.2561890123390808
# >> 1.2561890123391E+00 0.204316425208E-01
# >> 1.2562865535359E+00 0.288712798671E-01
# >> 2.2561890123390808
# >> 2.2561890123391E+00 0.204316425208E-01
# >> 2.2562865535359E+00 0.288712798671E-01
the Tin Man answered Nov 15 '22 04:11