We have some really old Perl code, last updated in 1997. I'm trying to upgrade to a newer Perl version where $* is deprecated.
I've been trying to learn how to rewrite this, but the only help the perlvar documentation gives is "You should use the /s and /m regexp modifiers instead."
    my ($file, $regexp, $flags) = @_;
    my (@found_lines, @tmp_list, $comp_buf);
    local ($*);
    if ($flags =~ tr/c//d)
    {
        $* = 1;
        (substr ($regexp, 0, 1) ne "^") && ($regexp = "^.*$regexp");
        ($regexp !~ /([^\\]|^)(\\\\)*\$$/) && ($regexp .= ".*\$");
        &read_comp ($file, \$comp_buf);
        @found_lines = grep ($_ .= "\n", ($comp_buf =~ /$regexp/g));
    }
    else
    {
        @tmp_list = &read_list ($file, 0);
        @found_lines = grep (/$regexp/, @tmp_list);
    }
    if ($flags eq "q")
    {
        $#found_lines >= 0;
    }
    elsif ($flags eq "a")
    {
        $#found_lines+1;
    }
    else
    {
        @found_lines;
    }
It's hard for me to know how to replace $* here. From what I can understand from the comments, we use $* to enable multi-line matching for the regexp search that follows, so I'm guessing I have to add those flags to the regular expressions somehow.
How do I rewrite this code to replace the existing $* instances?
Unfortunately $* is a global variable, so setting it has an effect on all called functions (e.g. read_comp) if they use regexes.
Also, that code is written in a slightly bizarre way:
I assume the intention was to enable "multiline" matching for the $comp_buf =~ /$regexp/g part, but $* is set early, so it also affects $regexp !~ /([^\\]|^)(\\\\)*\$$/ and the read_comp call.
The checks for whether $regexp already starts/ends with ^/$ respectively are broken. For example, (?:^foo$) is an anchored regex, but the code would not detect that.
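To make that concrete, here is a small sketch that applies the same two checks to a few patterns. The helper names looks_anchored_at_start and looks_anchored_at_end are made up for this illustration; only the two expressions inside them come from the code in the question.

```perl
use strict;
use warnings;

# Hypothetical helpers wrapping the two checks used in the question's code.
sub looks_anchored_at_start {
    my ($re) = @_;
    return substr($re, 0, 1) eq "^" ? 1 : 0;
}

sub looks_anchored_at_end {
    my ($re) = @_;
    # An unescaped "$" at the very end: an even number of backslashes before it.
    return $re =~ /([^\\]|^)(\\\\)*\$$/ ? 1 : 0;
}

for my $re ('^foo$', '(?:^foo$)', 'foo\$') {
    printf "%-12s start=%d end=%d\n", $re,
        looks_anchored_at_start($re),
        looks_anchored_at_end($re);
}
```

The plain ^foo$ is recognised at both ends, and the escaped \$ is correctly treated as a literal dollar sign, but (?:^foo$) reports 0 for both checks even though it is fully anchored, so the code would wrap it in ^.* and .*$ anyway.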
grep ($_ .= "\n", ...) is a baffling abuse of grep to emulate map. What the code is trying to do is to get the list of lines matched by the regex. However, the way the regex is built it does not match the terminating newline character "\n" on each line, so the code manually adds "\n" to every returned string.
The sane way of doing that would be:
    @found_lines = map $_ . "\n", ...; # or map "$_\n", ...
Instead of map we could use an imperative loop, taking advantage of the fact that for aliases the loop variable to the current list element:
    @temp = ...;
    for (@temp) {
        $_ .= "\n";
    }
    @found_lines = @temp;
Instead of a for loop we could use grep for its side effect of iterating over a list:
    @temp = ...;
    grep $_ .= "\n", @temp;
    @found_lines = @temp;
grep also aliases $_ to the current element, so the "filter expression" can modify the list we're iterating over.
Finally, because .= returns the resulting string (and strings containing "\n" cannot be false), we can take advantage of the fact that our "filter expression" always returns a true value and effectively get a copy of the input list as the return value from grep:
    @found_lines = grep $_ .= "\n", ... # blergh
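If you want to convince yourself that the three spellings really are equivalent, a quick sketch like the following should do. The @matches list is just a made-up stand-in for whatever the regex match returns.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the list returned by ($comp_buf =~ /$regexp/g).
my @matches = ("alpha", "beta");

# map: builds new strings and leaves the input untouched.
my @via_map = map { "$_\n" } @matches;

# for: $_ is an alias, so this modifies the copy in place.
my @via_for = @matches;
$_ .= "\n" for @via_for;

# grep: also aliases $_, so it modifies @copy and, because the
# expression is always true, returns every (modified) element.
my @copy     = @matches;
my @via_grep = grep { $_ .= "\n" } @copy;

print "all three agree\n"
    if "@via_map" eq "@via_for" && "@via_for" eq "@via_grep";
```

Note that after the grep version runs, @copy itself has been changed as well, which is exactly the kind of hidden side effect that makes the original line hard to read.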
As for the effect of $*: It is a boolean flag (initially false). If set to true, all regexes behave as if /m is in effect, i.e. ^ and $ match at embedded newlines as well as the beginning/end of the string.
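A minimal sketch of what /m changes, using a made-up three-line buffer in place of $comp_buf:

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the contents of $comp_buf.
my $buf = "foo\nbar\nbaz\n";

# Without /m, ^ matches only at the very start of the string,
# so a pattern anchored to a later line finds nothing.
my @without_m = ($buf =~ /^ba.$/g);

# With /m, ^ and $ also match after and before each embedded newline.
my @with_m = ($buf =~ /^ba.$/mg);

print scalar(@without_m), " match(es) without /m\n";
print scalar(@with_m),    " match(es) with /m: @with_m\n";
```

Here the version without /m finds nothing, while the /m version returns "bar" and "baz", each without its trailing newline. That missing newline is why the original code appends "\n" afterwards.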
Assuming my interpretation of the code is correct, you should be able to change it as follows:
local ($*); can be removed.
$* = 1; also needs to go.
$comp_buf =~ /$regexp/g should be changed to $comp_buf =~ /$regexp/mg. This is the only place I see where multiline mode makes sense.
I'd also really like to rewrite that last line (the grep). Either
    @found_lines = map "$_\n", ($comp_buf =~ /$regexp/mg);
(functional style), or, if you prefer a more imperative style:
    @found_lines = ($comp_buf =~ /$regexp/mg);
    $_ .= "\n" for @found_lines;
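Putting those changes together, the whole routine might end up looking roughly like this. Treat it as a sketch: the sub name find_lines is made up, and read_comp and read_list here are hypothetical stand-ins (the real ones are not shown in the question), written only so the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the question's helpers; they ignore $file
# and serve fixed sample data instead of reading anything from disk.
my $DATA = "alpha one\nbeta two\nalpha three\n";
sub read_comp { my ($file, $buf_ref) = @_; $$buf_ref = $DATA; return }
sub read_list { my ($file, $mode) = @_; return split /^/, $DATA }

sub find_lines {    # hypothetical name for the sub from the question
    my ($file, $regexp, $flags) = @_;
    my (@found_lines, $comp_buf);

    if ($flags =~ tr/c//d) {
        # Same anchor handling as before, no longer affected by a global flag.
        $regexp = "^.*$regexp" if substr($regexp, 0, 1) ne "^";
        $regexp .= ".*\$"      if $regexp !~ /([^\\]|^)(\\\\)*\$$/;

        read_comp($file, \$comp_buf);
        # /m replaces $* = 1; map replaces the grep trick.
        @found_lines = map { "$_\n" } ($comp_buf =~ /$regexp/mg);
    }
    else {
        @found_lines = grep { /$regexp/ } read_list($file, 0);
    }

    return @found_lines > 0    if $flags eq "q";   # any match at all?
    return scalar @found_lines if $flags eq "a";   # number of matches
    return @found_lines;                           # the matching lines
}

print find_lines("ignored", "alpha", "c");
```

With the sample data above, the final call prints the two lines containing "alpha". The explicit return statements are a stylistic choice on my part; the original relies on the value of the last evaluated statement, which behaves the same way.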