
How can I escape a literal string I want to interpolate into a regular expression?

Tags: regex, perl

Is there a built-in way to escape a string that will be used within/as a regular expression? E.g.

www.abc.com

The escaped version would be:

www\.abc\.com

I was going to use:

$string =~ s/[.*+?|()\[\]{}\\]/\\$&/g; # Escapes special regex chars

But I just wanted to make sure that there's not a cleaner built-in operation that I'm missing?

James asked Dec 22 '09 22:12


2 Answers

Use quotemeta or \Q...\E.

Consider the following test program that matches against $str as-is, with quotemeta, and with \Q...\E:

#! /usr/bin/perl

use warnings;
use strict;

my $str = "www.abc.com";

my @test = (
  "www.abc.com",
  "www/abc!com",
);

sub ismatch($) { $_[0] ? "MATCH" : "NO MATCH" }

my @match = (
  [ as_is => sub { ismatch /$str/ } ],
  [ qmeta => sub { my $qm = quotemeta $str; ismatch /$qm/ } ],
  [ qe    => sub { ismatch /\Q$str\E/ } ],
);

for (@test) {
  print "\$_ = '$_':\n";

  foreach my $method (@match) {
    my($name,$match) = @$method;

    print "  - $name: ", $match->(), "\n";
  }
}

Notice in the output that using the string as-is can produce spurious matches, because each unescaped . in the pattern matches any character:

$ ./try
$_ = 'www.abc.com':
  - as_is: MATCH
  - qmeta: MATCH
  - qe: MATCH
$_ = 'www/abc!com':
  - as_is: MATCH
  - qmeta: NO MATCH
  - qe: NO MATCH

For programs that accept untrustworthy inputs, be extremely careful about using such potentially nasty bits as regular expressions: doing so could create unexpected runtime errors, denial-of-service vulnerabilities, and security holes.
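As a rough sketch of that caution, the two cases can be kept separate: untrusted text that should match literally goes through \Q...\E, while text that is meant to be a pattern is compiled inside eval so a syntax error does not kill the program. The helper names contains_literal and compile_pattern are hypothetical, made up for this illustration. The eval only catches invalid syntax; it does nothing about valid but pathologically slow patterns.

```perl
use strict;
use warnings;

# Hypothetical helper: treat untrusted text as a literal substring.
# \Q...\E escapes every regex metacharacter in $untrusted.
sub contains_literal {
    my ($haystack, $untrusted) = @_;
    return $haystack =~ /\Q$untrusted\E/ ? 1 : 0;
}

# Hypothetical helper: compile a user-supplied pattern.
# Returns a compiled regex, or undef if the pattern is not valid syntax.
sub compile_pattern {
    my ($pattern) = @_;
    my $re = eval { qr/$pattern/ };
    return $re;
}

my $found = contains_literal("price: 3.50 (USD)", "(USD)");
print $found ? "found\n" : "not found\n";

my $re = compile_pattern("(unclosed");
print defined($re) ? "compiled\n" : "rejected\n";
```

If the input is only ever meant to be matched literally, the first helper is the safer default, since nothing the user types is interpreted as regex syntax.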

Greg Bacon answered Sep 20 '22 07:09



The best way to do this is to use \Q to begin a quoted string and \E to end it.

my $foo = 'www.abc.com';
my $bar = 'blahwww.abc.comblah';   # sample input, assumed for illustration
print "matched\n" if $bar =~ /blah\Q$foo\Eblah/;

You can also use quotemeta on the variable first. E.g.

my $quoted_foo = quotemeta($foo);

The \Q trick is documented in perlre under "Escape Sequences."
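To make the two forms concrete, here is a small sketch that prints what quotemeta returns for the example string and then uses \Q...\E inside a larger pattern. The sample inputs and the use of qr// to precompile the pattern are choices made for this illustration, not part of the answer above.

```perl
use strict;
use warnings;

my $foo = 'www.abc.com';

# quotemeta backslash-escapes the ASCII non-word characters in its argument.
my $quoted_foo = quotemeta($foo);
print "$quoted_foo\n";    # prints www\.abc\.com

# \Q...\E quotes only the interpolated part; "blah" on either side
# is still ordinary pattern text.
my $re = qr/blah\Q$foo\Eblah/;

for my $bar ('blahwww.abc.comblah', 'blahwwwXabcXcomblah') {
    my $result = $bar =~ $re ? 'MATCH' : 'NO MATCH';
    print "$bar: $result\n";
}
```

The first sample input matches and the second does not, since the dots are now literal.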

friedo answered Sep 20 '22 07:09
