Programming is so new to me that I apologize for not knowing how to phrase the question.
I have a Perl script that gets a variable from an internal tool. The contents vary, but they always follow this pattern:
darren.local 1987 A Sentence1
darren.local 1996 C Sentence2
darren.local 1991 E Sentence3
darren.local 1954 G Sentence4
darren.local 1998 H Sentence5
With Perl, what's the easiest way to get each of these lines into a variable by itself? Depending on what the internal tool spits out, each line will be different, and there can be more than five lines. The capitalized letter in each line is what the lines will end up being sorted by (all As, all Cs, all Es, etc.). Should I be looking at regular expressions?
I like using unpack for this sort of thing. It's fast, flexible, and reversible.
You just need to know the positions for each column, and unpack can automatically trim the extra whitespace from each column.
If you change something in one of the columns, it's easy to go back to the original format by repacking with the same format:
my $format = 'A23 A8 A7 A*';
my @grades;
while( <DATA> ) {
    chomp( my $line = $_ );
    my( $machine, $year, $letter, $sentence ) =
        unpack( $format, $line );
    # save the original line too, which might be useful later
    push @grades, [ $machine, $year, $letter, $sentence, $_ ];
}
my @sorted = sort { $a->[2] cmp $b->[2] } @grades;
# print the original lines (each one still ends with its newline)
foreach my $tuple ( @sorted ) {
    print $tuple->[-1];
}
# go the other way, especially if you changed things
foreach my $tuple ( @sorted ) {
    print pack( $format, @$tuple[0..3] ), "\n";
}
__END__
darren.local           1987    A      Sentence1
darren.local           1996    C      Sentence2
darren.local           1991    E      Sentence3
darren.local           1954    G      Sentence4
darren.local           1998    H      Sentence5
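To see the trimming and the round trip in isolation, here is a minimal sketch. It assumes the columns really are 23, 8, and 7 characters wide (the widths the format string implies); the sample line is built with sprintf purely for illustration.

```perl
use strict;
use warnings;

my $format = 'A23 A8 A7 A*';

# Build one fixed-width line: three padded columns plus the sentence
my $line = sprintf '%-23s%-8s%-7s%s',
    'darren.local', 1987, 'A', 'Sentence1';

# "A" fields drop their trailing padding when unpacked
my @fields = unpack $format, $line;
print join( '|', @fields ), "\n";

# Change one column, then repack with the same format;
# pack pads each "A" field back out to its width with spaces
$fields[1] = 1988;
my $repacked = pack $format, @fields;
print "$repacked\n";
```

The repacked line has the same column positions as the original, which is what makes the format reusable in both directions.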
Now, there's an additional consideration. It sounds like you might have this big chunk of multi-line text in a single variable. Handle this as you would a file by opening a filehandle on a reference to the scalar. The filehandle stuff takes care of the rest:
my $lines = '...multiline string...';
open my($fh), '<', \ $lines;
while( <$fh> ) {
... same as before ...
}
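Put together, the in-memory filehandle approach might look like the sketch below. The contents of $lines are made-up sample data, and the loop body uses split instead of unpack so that it does not depend on exact column widths; swap in the unpack call if your columns are fixed-width.

```perl
use strict;
use warnings;

# Hypothetical sample standing in for the tool's output
my $lines = join "\n",
    'darren.local 1996 C Sentence2',
    'darren.local 1987 A Sentence1',
    'darren.local 1991 E Sentence3';

# Open a read-only filehandle on a reference to the scalar
open my $fh, '<', \$lines
    or die "Cannot open in-memory filehandle: $!";

my @rows;
while ( my $line = <$fh> ) {
    chomp $line;
    # at most 4 fields, so the sentence itself may contain spaces
    push @rows, [ split /\s+/, $line, 4 ];
}
close $fh;

# sort by the letter in the third column (index 2)
my @sorted = sort { $a->[2] cmp $b->[2] } @rows;
print join( ' ', @$_ ), "\n" for @sorted;
```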
use strict;
use warnings;
# this puts each line in the array @lines and removes the trailing newlines
chomp(my @lines = <DATA>); # <DATA> is a special filehandle that treats
                           # everything after __END__ as if it was a file
                           # It's handy for testing things
# Iterate over the array of lines and for each iteration
# put that line into the variable $line
foreach my $line (@lines) {
    # Use split to 'split' each $line with the regular expression /\s+/
    # /\s+/ means match one or more whitespace characters.
    # The 4 is a limit: the line is split into at most 4 fields, so any
    # whitespace after the third separator is kept as part of $col4
    my ($col1, $col2, $col3, $col4) = split(/\s+/, $line, 4);
    # here you can do whatever you need to with the data
    # in the columns. I just print them out
    print "$col1, $col2, $col3, $col4\n";
}
__END__
darren.local 1987 A Sentence1
darren.local 1996 C Sentence2
darren.local 1991 E Sentence3
darren.local 1954 G Sentence4
darren.local 1998 H Sentence5
Assuming that the text is in a single variable $info, you can split it into separate lines using the built-in Perl split function:
my @lines = split("\n", $info);
where @lines is an array of your lines. The "\n" is the regex for a newline. You can loop through each line as follows:
foreach (@lines) {
    my $line = $_;
    # do something with $line....
}
You can then split each line on whitespace (regex \s+, where the \s is one whitespace character, and the + means 1 or more times). Write the pattern between slashes: inside double quotes, "\s+" loses its backslash and would split on the letter s instead:
my @fields = split(/\s+/, $line);
and you can then access each field directly via its array index: $fields[0], $fields[1] etc.
or, you can do:
my ($var1, $var2, $var3, $var4) = split(/\s+/, $line);
which will put the fields in each line into separate named variables.
Now - if you want to sort your lines by the character in the third column, you could do this:
my @lines = split("\n", $info);
my @arr = (); # declare new array
foreach (@lines) {
    my @fields = split(/\s+/, $_);
    push(@arr, \@fields); # add @fields REFERENCE to @arr
}
Now you have an "array of arrays". This can easily be sorted as follows:
my @sorted = sort { $a->[2] cmp $b->[2] } @arr; # cmp compares strings; <=> is for numbers
which will sort @arr by the 3rd element (index 2) of @fields.
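Because the third column holds letters, the comparison needs the string operator cmp; the numeric <=> would treat every letter as 0 and leave the order unchanged. The sketch below uses made-up rows, and the tie-break on the year column is an addition for illustration, not something the question asked for.

```perl
use strict;
use warnings;

# Hypothetical rows, already split into fields
my @arr = (
    [ 'darren.local', 1996, 'C', 'Sentence2' ],
    [ 'darren.local', 1987, 'A', 'Sentence1' ],
    [ 'darren.local', 1991, 'C', 'Sentence3' ],
);

# Compare the letters as strings; when they tie,
# fall back to comparing the years as numbers
my @sorted = sort {
    $a->[2] cmp $b->[2]
        or
    $a->[1] <=> $b->[1]
} @arr;

print "@$_\n" for @sorted;
```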
Edit 2: To put lines with the same third column into their own variables, do this:
my %hash = (); # declare new hash
foreach my $line (@arr) { # loop through lines
    my @fields = @$line; # dereference the field array
    my $el = $fields[2]; # get our key - the character in the third column
    my $text = join(" ", @fields); # rebuild the line as a string
    my $val;
    if (exists $hash{$el}) { # check if key already in hash
        $val = $hash{$el} . "\n" . $text; # append new line to the current value
    } else {
        $val = $text;
    }
    $hash{$el} = $val; # put the new value (back) into the hash
}
Now you have a hash keyed with the third column characters, with the value for each key being the lines that contain that key. You can then loop through the hash and print out or otherwise use the hash values.
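Looping through the grouped lines could look like the sketch below. It is a variation rather than the exact code above: it stores an array reference per letter instead of one newline-joined string, which tends to be easier to work with later. The contents of $info and the name %by_letter are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical sample standing in for the tool's output
my $info = "darren.local 1987 A Sentence1\n"
         . "darren.local 1996 C Sentence2\n"
         . "darren.local 1991 A Sentence3\n";

my %by_letter;
foreach my $line ( split /\n/, $info ) {
    # the third whitespace-separated field is the letter
    my $letter = ( split /\s+/, $line )[2];
    # push creates the array for a new letter automatically
    push @{ $by_letter{$letter} }, $line;
}

# print each group, letters in alphabetical order
foreach my $letter ( sort keys %by_letter ) {
    print "$letter:\n";
    print "  $_\n" for @{ $by_letter{$letter} };
}
```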