
How to ignore any empty values in a perl grep?

Tags: arrays, grep, perl

I am using the following to count the number of occurrences of a pattern in a file:

my @lines = grep /$text/, <$fp>;
print ($#lines + 1);

But sometimes it prints one more than the actual value. I checked and it is because the last element of @lines is null, and that is also counted.

How can the last element of the grep result be empty sometimes? Also, how can this issue be resolved?

Lazer asked Jul 08 '11 22:07


3 Answers

It depends a lot on your pattern, but one thing you could do is combine two matches, the first one disqualifying any line that contains only whitespace (or nothing). This example rejects any line that is empty, newline only, or made up entirely of whitespace.

my @lines = grep { not /^\s*$/ and /$text/ } <$fp>;

Keep in mind that if the contents of $text include regexp metacharacters, they either need to be intended as metacharacters or escaped with quotemeta().
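To make the metacharacter point concrete, here is a minimal sketch. The search text 'a.b' and the sample lines are made up for illustration; only quotemeta() and the equivalent \Q...\E escape come from Perl itself.

```perl
use strict;
use warnings;

# Hypothetical search text containing a regexp metacharacter ('.').
my $text  = 'a.b';
my @input = ("a.b\n", "axb\n", "\n");

# Unescaped: '.' matches any character, so "axb" is counted as well.
my @loose = grep { /$text/ } @input;

# Escaped with \Q...\E: only the literal string "a.b" matches.
my @literal = grep { /\Q$text\E/ } @input;

# quotemeta() builds the same escaped pattern explicitly.
my $quoted = quotemeta($text);
my @same   = grep { /$quoted/ } @input;

print scalar(@loose), " ", scalar(@literal), " ", scalar(@same), "\n";
```

If $text is meant to be a plain string rather than a pattern, the escaped form is usually the safer default.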

My theories are that you either have a line consisting of just \n that is somehow matching your $text regexp, or your $text regexp contains metacharacters that affect the match without you being aware of it. Either way, the snippet I provided will at least force rejection of "blank lines", where blank could mean completely empty (unlikely), newline terminated but otherwise empty (probable), or whitespace-only lines that appear blank when printed (possible).
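The following sketch shows the blank-line filter changing a count. It assumes a pattern that can match zero characters ('x*', chosen only for illustration) and reads made-up data from an in-memory filehandle instead of a real file.

```perl
use strict;
use warnings;

# Made-up file contents: two real lines, one empty line, one whitespace-only line.
my $data = "xx\n\n   \nabc\n";
open my $fp, '<', \$data or die "Cannot open in-memory file: $!";
my @all = <$fp>;
close $fp;

# Hypothetical pattern that can match the empty string.
my $text = 'x*';

# Without the filter, every line matches, including the blank ones.
my @unfiltered = grep { /$text/ } @all;

# With the filter, blank and whitespace-only lines are rejected first.
my @filtered = grep { !/^\s*$/ && /$text/ } @all;

# An array in scalar context gives the element count directly.
print scalar(@unfiltered), " ", scalar(@filtered), "\n";
```

Using scalar(@lines) for the count is equivalent to $#lines + 1 and avoids the parenthesised print form.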

DavidO answered Oct 18 '22 22:10


A regular expression that matches the empty string will also match undef. Perl warns about this, but it converts undef to '' before trying to match against it, at which point grep will quite happily include the undef in its results. If you don't want to pick up the empty string (or anything that will be matched as though it were the empty string), you need to rewrite your regular expression so that it does not match it.
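A small sketch of that behaviour, assuming a list that already contains an undef element (the list and the pattern a* are invented for the demonstration; how an undef would get into the original @lines is not established here):

```perl
use strict;
use warnings;

# Hypothetical input with an undef in the middle.
my @items = ("aaa\n", undef, "bbb\n");

# a* can match zero characters, so it matches any string, including ''.
my $pattern = qr/a*/;

# Collect warnings instead of printing them, to show that Perl does complain.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

# The undef element is treated as '' for the match and passes the filter.
my @hits = grep { $_ =~ $pattern } @items;

print "matched: ", scalar(@hits), "\n";
print "warnings: ", scalar(@warnings), "\n";
```

Adding a defined() check, or a pattern that requires at least one character, keeps such elements out of the result.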

darch answered Oct 18 '22 23:10


To see exactly what is in @lines, do:

use Data::Dumper;
$Data::Dumper::Useqq = 1;
print Dumper \@lines;
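
As a rough illustration of why Useqq helps, the sketch below dumps a made-up @lines containing a newline-only and a whitespace-only element. With Useqq set, Data::Dumper prints double-quoted strings with escapes such as \n and \t, so otherwise invisible content shows up.

```perl
use strict;
use warnings;
use Data::Dumper;

# Print strings double-quoted, with control characters shown as escapes.
$Data::Dumper::Useqq = 1;

# Made-up contents: one normal line, one newline-only, one whitespace-only.
my @lines = ("foo bar\n", "\n", "  \t\n");

my $dump = Dumper(\@lines);
print $dump;
```

A trailing element that looks empty when printed will typically appear here as "\n" or as a string of whitespace escapes.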

ysth answered Oct 18 '22 22:10