
perl using constant in regex

Tags:

regex

perl

I'm wondering about using constants in Perl regexes. I want to do something similar to:

use constant FOO => "foo";
use constant BAR => "bar";

$somvar =~ s/prefix1_FOO/prefix2_BAR/g;

Of course, as written, FOO is matched as the three literal letters F, O, O instead of expanding to the constant.

I looked online, and someone suggested using either ${\FOO} or @{[FOO]}. Someone else mentioned (?{FOO}). Could anyone shed some light on which of these is correct, and whether any of them has an advantage? Alternatively, is it better to just use a non-constant variable? (Performance is a factor in my case.)

user2766918 asked Feb 09 '17 14:02



1 Answer

There's little reason to prefer a constant over a variable here. It doesn't make a great deal of difference: perl compiles the regex either way.

For example:

#!/usr/bin/perl

use warnings;
use strict;
use Benchmark qw(:all);

use constant FOO => "foo";
use constant BAR => "bar";

my $FOO_VAR = 'foo';
my $BAR_VAR = 'bar';

sub pattern_replace_const {
   my $somvar = "prefix1_foo test";
   $somvar =~ s/prefix1_${\FOO}/prefix2_${\BAR}/g;
}

sub pattern_replace_var {
   my $somvar = "prefix1_foo test";
   $somvar =~ s/prefix1_$FOO_VAR/prefix2_$BAR_VAR/g;
}

cmpthese(
   1_000_000,
   {  'const' => \&pattern_replace_const,
      'var'   => \&pattern_replace_var
   }
);

Gives:

          Rate const   var
const 917095/s    --   -1%
var   923702/s    1%    --

Really not enough in it to worry about.
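As for the interpolation forms mentioned in the question, here is a minimal sketch showing that ${\FOO} and @{[FOO]} both insert the constant's value into the pattern and the replacement. The variable names $via_scalar_ref and $via_array_ref are made up for illustration. (?{FOO}) is left out because it is a different feature: it runs code while matching rather than inserting text into the pattern, so it doesn't fit this use.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use constant FOO => "foo";
use constant BAR => "bar";

# ${\FOO} takes a reference to the constant's value and dereferences it,
# so the value is interpolated like an ordinary scalar.
my $via_scalar_ref = "prefix1_foo test";
$via_scalar_ref =~ s/prefix1_${\FOO}/prefix2_${\BAR}/g;

# @{[FOO]} builds an anonymous array holding the value and interpolates it.
my $via_array_ref = "prefix1_foo test";
$via_array_ref =~ s/prefix1_@{[FOO]}/prefix2_@{[BAR]}/g;

print "$via_scalar_ref\n";
print "$via_array_ref\n";
```

Both lines print the same rewritten string, so the choice between the two forms is mostly a matter of taste.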

However, it may be worth noting that you can precompile a regex with qr// and use that instead. Provided the regex is static, this might improve performance (but it might not, because perl can detect static regexes and does that itself).

             Rate   var const compiled
var      910498/s    --   -2%      -9%
const    933097/s    2%    --      -7%
compiled 998502/s   10%    7%       --

With code like:

my $compiled_regex = qr/prefix1_$FOO_VAR/;
sub compiled_regex { 
    my $somvar = "prefix1_foo test";
    $somvar =~ s/$compiled_regex/prefix2_$BAR_VAR/g;
}
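The same approach should work with the constants themselves. This is a rough sketch, not benchmarked here, and the names $prefixed_foo and replace_prefixed are hypothetical:

```perl
#!/usr/bin/perl
use strict;
use warnings;

use constant FOO => "foo";
use constant BAR => "bar";

# Build the pattern once, outside the sub, so the constant is
# interpolated a single time rather than on every call.
my $prefixed_foo = qr/prefix1_${\FOO}/;

sub replace_prefixed {
    my ($text) = @_;
    # The replacement side interpolates like a double-quoted string.
    $text =~ s/$prefixed_foo/prefix2_${\BAR}/g;
    return $text;
}

print replace_prefixed("prefix1_foo test"), "\n";
```

Whether this is measurably faster than the inline form is something to check with Benchmark on your own data.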

Honestly though, this is a micro-optimisation. The regex engine is fast compared to your code, so don't worry about it. If performance is critical to your code, then the correct way of dealing with it is to first write the code, and then profile it to look for hotspots to optimise.

Sobrique answered Sep 28 '22 01:09