How can I parse just part of a file with Perl?

Tags:

perl

I'm a total newbie to Perl, but I've heard that it's great for parsing files, so I've thought of giving it a spin.

I have a text file that has the following sample info:

High school is used in some
parts of the world, particularly in
Scotland, North America and Oceania to
describe an institution that provides
all or part of secondary education.
The term "high school" originated in
Scotland with the world's oldest being
the Royal High School (Edinburgh) in
1505.

The Royal High School was used as a
model for the first public high school
in the United States, the English High
School founded in Boston,
Massachusetts, in 1821. The precise
stage of schooling provided by a high
school differs from country to
country, and may vary within the same
jurisdiction. In all of New Zealand
and Malaysia along with parts of
Australia and Canada, high school is
synonymous with secondary school, and
encompasses the entire secondary stage
of education.

======================================
Grade1 87.43%
Grade2 84.30%
Grade3 83.00%
=====================================

I want to parse the file and only get the numerical information. I looked into regex, and I think I'd use something like

if (m/^%/) {
    do something
}
else {
    skip the line
}

But what I really want to do is keep track of the name on the left and store the numerical value under that name. So, after parsing the file, I'd like the following variables to hold the corresponding % values, because I want to create a pie chart or bar graph of the different grades.

Grade1 = 87.43
Grade2 = 84.30
...

Could you suggest methods I should be looking at?

c0d3rs asked Dec 29 '22 06:12


2 Answers

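You'll need a regular expression. Something like the following should work as a filter:

my %op;
while (<>) {
  # Store the score only when the line actually matches; otherwise
  # $1 and $2 would still hold values from an earlier match
  if (/(Grade[0-9]+)\s*([0-9]+\.[0-9]+)/) {
    $op{$1} = $2;
  }
}

The %op hash will store the grade names and scores. Using a hash is preferable to automatically creating a variable named after each grade.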
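As a quick check of this approach, here is a minimal, self-contained sketch that runs the same pattern over a few sample lines held in a string rather than read from a file. The helper name parse_grades and the inline sample text are assumptions made for illustration; they are not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper: collect "GradeN value" pairs from a block of text.
sub parse_grades {
    my ($text) = @_;
    my %op;
    for my $line ( split /\n/, $text ) {
        # Only store a score when the line actually matches
        if ( $line =~ /(Grade[0-9]+)\s*([0-9]+\.[0-9]+)/ ) {
            $op{$1} = $2;
        }
    }
    return %op;
}

# Sample input modelled on the question's file (an assumption for the demo)
my $sample = join "\n",
    'of education.',
    '======================================',
    'Grade1 87.43%',
    'Grade2 84.30%',
    'Grade3 83.00%',
    '=====================================';

my %op = parse_grades($sample);

# Print the grades in name order, one "name = value" pair per line
printf "%s = %s\n", $_, $op{$_} for sort keys %op;
```

Because the captured values are kept as strings, a score such as 84.30 keeps its trailing zero when printed; it can still be used as a number when drawing a chart.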

Noufal Ibrahim answered Jan 15 '23 00:01


If you can guarantee that your points of interest sit between two lines of = characters (and there isn't an odd number of these delimiter lines in a given file), the flip-flop operator is a handy thing here:

use strict;    # These two pragmas go a long, ...
use warnings;  # ... long way in helping you code better

my %scores;    # Create a hash of scores

while (<>) {   # The diamond operator processes all files ...
               # ... supplied at command-line, line-by-line

    next unless /^=+$/ ... /^=+$/; # The three-dot flip-flop keeps only the ...
                                   # ... 'grades' block; two dots would test the
                                   # ... end pattern on the opening '=' line too
    next if /^=+$/;                # Skip the delimiter lines themselves

    my ( $name, $grade ) = split;  # This usage of split breaks the current ...
                                   # ... line into whitespace-separated fields
    $grade =~ tr/%//d;             # Drop the percent sign from the value

    $scores{$name} = $grade;       # Associate grade with name
}
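One detail worth checking when the opening and closing patterns are identical: according to perlop, the two-dot form tests the right operand on the same line that made the left operand true, while the three-dot form waits until the next line. The following sketch compares the two on sample lines modelled on the question's file. The helper names two_dot and three_dot, and the inline sample data, are hypothetical and exist only for this demonstration.

```perl
use strict;
use warnings;

# Sample lines modelled on the question's file (an assumption for the demo)
my @lines = (
    'of education.',
    '======================================',
    'Grade1 87.43%',
    'Grade2 84.30%',
    'Grade3 83.00%',
    '=====================================',
);

# Hypothetical helper: lines selected by the two-dot flip-flop
sub two_dot {
    my @kept;
    for (@_) {
        push @kept, $_ if /^=+$/ .. /^=+$/;
    }
    return @kept;
}

# Hypothetical helper: lines selected by the three-dot flip-flop
sub three_dot {
    my @kept;
    for (@_) {
        push @kept, $_ if /^=+$/ ... /^=+$/;
    }
    return @kept;
}

my @two   = two_dot(@lines);
my @three = three_dot(@lines);

# Build the scores hash from the three-dot selection
my %scores;
for (@three) {
    next if /^=+$/;                 # skip the delimiter lines themselves
    my ( $name, $grade ) = split;   # split $_ on whitespace
    $grade =~ tr/%//d;              # drop the percent sign
    $scores{$name} = $grade;
}

printf "two-dot kept %d line(s), three-dot kept %d line(s)\n",
    scalar @two, scalar @three;
printf "%s = %s\n", $_, $scores{$_} for sort keys %scores;
```

On this sample the two-dot form selects only the two delimiter lines, because each one opens and closes a range at once, whereas the three-dot form selects the delimiters plus the three grade lines between them.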

Zaid answered Jan 14 '23 23:01