How can I convert ereg expressions to preg in PHP?

The biggest change in the syntax is the addition of delimiters.

ereg('^hello', $str);
preg_match('/^hello/', $str);

Delimiters can be pretty much anything that is not alpha-numeric, a backslash or a whitespace character. The most used are generally ~, / and #.

You can also use matching brackets:

preg_match('[^hello]', $str);
preg_match('(^hello)', $str);
preg_match('{^hello}', $str);
// etc
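Bracket delimiters can look confusing, so here is a small sketch showing that '[^hello]' is the pattern ^hello wrapped in [ ] delimiters, not a character class. The sample strings are made up for illustration:

```php
<?php
// All four entries contain the same pattern (^hello); only the delimiters differ.
$delimited = ['/^hello/', '~^hello~', '#^hello#', '[^hello]'];

foreach ($delimited as $pattern) {
    // "hello world" starts with "hello", so every variant matches.
    var_dump(preg_match($pattern, 'hello world') === 1); // bool(true)
}

// If '[^hello]' were a character class, it would match the "x" here.
// As a delimited pattern it means ^hello, which does not match.
var_dump(preg_match('[^hello]', 'xhello')); // int(0)
```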

If your delimiter is found in the regular expression, you have to escape it:

ereg('^/hello', $str);
preg_match('/^\/hello/', $str);
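If the pattern contains a lot of slashes, another option is to pick a delimiter that does not occur in it, so nothing needs escaping. A short sketch, with $str set to a made-up sample value:

```php
<?php
$str = '/hello/world'; // sample input, chosen for illustration

// The same pattern written with two different delimiters.
$withSlash = preg_match('/^\/hello/', $str); // "/" must be escaped
$withHash  = preg_match('#^/hello#', $str);  // "/" is an ordinary character here

var_dump($withSlash === $withHash); // bool(true)
```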

You can easily escape all delimiters and reserved characters in a string by using preg_quote:

$expr = preg_quote('/hello', '/');
preg_match('/^'.$expr.'/', $str);
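To make the effect of preg_quote visible, here is a minimal sketch. The value of $needle is a hypothetical piece of text that should be matched literally:

```php
<?php
// Hypothetical text that should be matched literally.
$needle = '1.5/2';

// preg_quote() escapes regex metacharacters; the second argument
// additionally escapes the chosen delimiter.
$expr = preg_quote($needle, '/');
echo $expr, "\n"; // prints 1\.5\/2

$pattern = '/^' . $expr . '$/';
var_dump(preg_match($pattern, '1.5/2')); // int(1)
var_dump(preg_match($pattern, '1x5/2')); // int(0): "." is no longer a wildcard
```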

Also, PCRE supports modifiers for various things. One of the most used is the case-insensitive modifier i, the alternative to eregi:

eregi('^hello', 'HELLO');
preg_match('/^hello/i', 'HELLO');

You can find the complete reference to PCRE syntax in PHP in the manual, as well as a list of differences between POSIX regex and PCRE to help converting the expression.

However, in your simple example you would not use a regular expression:

stripos($str, 'hello world') === 0
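The strict === 0 comparison matters here, because stripos returns false when nothing is found, and false == 0 under loose comparison. A small sketch (the function name is made up for this illustration):

```php
<?php
// Hypothetical helper name, used only for this illustration.
function starts_with_hello_world(string $str): bool
{
    // stripos() returns the match offset, or false if there is no match.
    // The strict === 0 check keeps "not found" (false) from passing as offset 0.
    return stripos($str, 'hello world') === 0;
}

var_dump(starts_with_hello_world('Hello World!'));    // bool(true)
var_dump(starts_with_hello_world('Say hello world')); // bool(false): found, but at offset 4
var_dump(starts_with_hello_world('goodbye'));         // bool(false): not found at all

// The same check expressed with preg_match, for comparison.
var_dump(preg_match('/^hello world/i', 'Hello World!') === 1); // bool(true)
```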

Replacing ereg with preg (ereg has been deprecated since PHP 5.3.0) was the right move and works in our favor.

preg_match, which uses a Perl-compatible regular expression syntax, is often a faster alternative to ereg.

There are four main things to know when porting ereg patterns to preg:

  1. Add delimiters (/): 'pattern' => '/pattern/'

  2. Escape the delimiter if it is part of the pattern: 'patt/ern' => '/patt\/ern/'
    You can do this programmatically as follows:
    $old_pattern = '<div>.+</div>';
    $new_pattern = '/' . addcslashes($old_pattern, '/') . '/';

  3. eregi (case-insensitive matching): 'pattern' => '/pattern/i'. If you are using eregi for case-insensitive matching, just add 'i' after the closing delimiter of the new pattern ('/pattern/i').

  4. ASCII values: in ereg, a number in the pattern is assumed to refer to the ASCII code of a character, but preg does not treat numbers that way. If your ereg pattern contains ASCII values (for example for a newline or a tab), convert them to hexadecimal and prefix them with \x.
    Example: 9 (tab) becomes \x9 (or \x09); alternatively, use \t.
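The first three points can be sketched as one small helper. ereg_to_preg is a hypothetical name, not a PHP built-in, and it assumes the old pattern does not already contain an escaped "\/". It only wraps the pattern and does not translate any other syntax differences:

```php
<?php
// Hypothetical helper, not a PHP built-in.
function ereg_to_preg(string $pattern, bool $caseInsensitive = false): string
{
    // Points 1 and 2: wrap in "/" delimiters and escape any "/" in the pattern.
    // Assumption: the pattern does not already contain an escaped "\/".
    $converted = '/' . addcslashes($pattern, '/') . '/';

    // Point 3: eregi() becomes the "i" modifier.
    return $caseInsensitive ? $converted . 'i' : $converted;
}

$divPattern = ereg_to_preg('<div>.+</div>');
echo $divPattern, "\n"; // prints /<div>.+<\/div>/
var_dump(preg_match($divPattern, '<div>text</div>')); // int(1)

$helloPattern = ereg_to_preg('^hello', true);
echo $helloPattern, "\n"; // prints /^hello/i
var_dump(preg_match($helloPattern, 'HELLO')); // int(1)

// Point 4: both spellings match a tab character.
var_dump(preg_match('/\x09/', "a\tb") === preg_match('/\t/', "a\tb")); // bool(true)
```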


As of PHP 5.3, ereg is deprecated; it was removed entirely in PHP 7.0.

Moving from ereg to preg_match usually requires only a small change to the pattern.

First, you have to add delimiters to your pattern, e.g.:

ereg('[A-Z0-9a-z]', 'string');

to

preg_match('/[A-Z0-9a-z]/', 'string');

For eregi case-insensitive matching, put i after the last delimiter, e.g.:

eregi('pattern', 'string');

to

preg_match('/pattern/i', 'string');

There are more differences between ereg() and preg_match() than just the syntax:

  • Return value:

    • On error: both return FALSE
    • On no match: ereg() returns FALSE, preg_match() returns 0
    • On match: ereg() returns string length or 1, preg_match() always returns 1
  • Resulting array of matched substrings: if an optional group does not take part in the match (for example (b) in ...a(b)?), the corresponding item in the ereg() result will be FALSE, while in preg_match() a trailing unmatched group will not be set at all.
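The preg_match() side of these differences can be checked with a short sketch. The patterns and subjects are made up for illustration, and the ereg() side is not shown because the function no longer exists in PHP 7:

```php
<?php
// No match: preg_match() returns int 0, not false.
var_dump(preg_match('/a(b)?/', 'xyz')); // int(0)

// Match: always int 1. The trailing optional group did not take part,
// so index 1 is not set in $matches.
$result = preg_match('/a(b)?/', 'a', $matches);
var_dump($result);            // int(1)
var_dump(isset($matches[1])); // bool(false)

// An unmatched group that is followed by a matched group is reported
// as an empty string instead.
preg_match('/(a)?(b)/', 'b', $middle);
var_dump($middle === ['b', '', 'b']); // bool(true)

// Error (invalid pattern): returns false. "@" hides the warning for this demo.
$error = @preg_match('/(/', 'a');
var_dump($error === false); // bool(true)
```

Since 0 and false are both falsy, compare the return value with === when you need to tell "no match" apart from "error".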

If you would rather not convert your ereg() calls to preg_match(), you can use mb_ereg() from the mbstring extension, which is still available in PHP 7.