Disabling backreferences in perl

I have been told that disabling backreferences in perl improves performance (provided you're not using them), and that if you don't use any backreferences perl will do this by itself.

Now I have a Perl script with a large number of regexes in it, only one of which uses a backreference, and I would like to know the following:

  • Given that I have a very large number of regexes (let's assume most of my processing time is spent in regex matching), does disabling backreferences give a significant performance improvement? Or are there criteria I can use to tell whether it does?
  • Is there a way to disable backreferences once at the beginning and re-enable them only when I need them? (I know about (?:, but I don't want to have to add it to every grouping.)
  • Would scoping allow Perl to optimize this backreferencing behavior for me (i.e., does a sub or an eval change whether Perl turns off backreferencing for things outside of it)?
tzenes asked Sep 28 '10 17:09



1 Answer

Using capturing parentheses only penalizes regular expressions that use them, so use them where you need to capture, but use non-capturing parens (?:...) when all you need is grouping.
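To illustrate the difference, here is a minimal sketch. The sample string and the variable names are made up for the example. Both patterns group the same alternation, but only the first asks Perl to remember what the group matched.

```perl
use strict;
use warnings;

# Hypothetical input line, used only for illustration.
my $line = "error: disk full";

# Capturing group: the text matched by (error|warning) is saved,
# so it is returned by the match in list context (and set in $1).
my ($level) = $line =~ /^(error|warning): /;
print "captured: $level\n";    # captured: error

# Non-capturing group: (?:...) groups the alternation without saving it,
# so the first (and only) capture is now the (.*) part.
my ($message) = $line =~ /^(?:error|warning): (.*)/;
print "message: $message\n";   # message: disk full
```

In the second pattern the alternation costs nothing in capture bookkeeping, and the group numbering starts at the part you actually want.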

Using any of the global match variables

$` $& $'

imposes a performance penalty on all regular expressions, so avoid using them if at all possible. (But once you do, go nuts! You've already paid the price.) There's no way to turn this on and off. Once Perl detects that they're used anywhere (even in third-party modules you may use) the feature is turned on.
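If you only need the matched text or the text around it, one way to avoid these variables is to rebuild the pieces from the match offsets in @- and @+. The following is a sketch under that assumption; the sample string and variable names are invented for the example. (On Perl 5.20 and later the penalty is, as far as I know, no longer an issue because of copy-on-write, so this mainly matters on older versions.)

```perl
use strict;
use warnings;

# Hypothetical sample text, used only for illustration.
my $text = "The quick brown fox";

my ($pre, $match, $post);
if ($text =~ /quick/) {
    # $-[0] is the offset where the whole match starts,
    # $+[0] is the offset just past where it ends.
    $pre   = substr($text, 0, $-[0]);
    $match = substr($text, $-[0], $+[0] - $-[0]);
    $post  = substr($text, $+[0]);
    print "pre=<$pre> match=<$match> post=<$post>\n";
    # pre=<The > match=<quick> post=< brown fox>
}
```

This yields the same three strings the global match variables would, without mentioning those variables anywhere in the program.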

As of Perl 5.10.0, there are alternatives for the global match variables that only penalize regular expressions that use them. If you add the /p modifier to a particular regular expression you can then use

${^PREMATCH} ${^MATCH} ${^POSTMATCH}

instead.
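A minimal sketch of the /p form, assuming Perl 5.10 or later (the sample string and variable names are invented for the example):

```perl
use strict;
use warnings;

# Hypothetical sample text, used only for illustration.
my $text = "The quick brown fox";

my ($before, $matched, $after);
if ($text =~ /quick/p) {
    # With /p, these variables are populated for this match only.
    $before  = ${^PREMATCH};
    $matched = ${^MATCH};
    $after   = ${^POSTMATCH};
    print "before=<$before> matched=<$matched> after=<$after>\n";
    # before=<The > matched=<quick> after=< brown fox>
}
```

Only the regex carrying /p pays for saving the extra text; other regexes in the program are unaffected.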

Michael Carman answered Sep 20 '22 05:09