
PHP preg_match length 3276 limit

Tags: php, preg-match

PHP's preg_match appears to cap the upper bound of a repeated group at 3276 in some cases.

For example, ^(.|\s){0,3276}$ works, but ^(.|\s){0,3277}$ does not.

The limit doesn't always apply: /^(.){0,3277}$/ works.

I can't find this mentioned anywhere in PHP's documentation or the bug tracker. The number 3276 seems an odd boundary. The only thing I can think of is that it's approximately one tenth of 32767, the maximum value of a signed 16-bit integer.

preg_last_error() returns 0.

I've reproduced the issue on http://www.phpliveregex.com/, on my local system, and on the web server.

EDIT: The code is actually emitting "Warning: preg_match(): Compilation failed: regular expression is too large at offset 16", so this appears to be the same issue as the question "PHP preg_match_all limit".
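A minimal sketch of one way to surface that warning programmatically instead of relying on preg_last_error() alone. `probe_pattern` is a hypothetical helper name, not a PHP function; it uses set_error_handler() to capture any warning that preg_match() raises. The outcome is expected to vary with the PHP and PCRE versions in use, so no output is shown.

```php
<?php
// Hypothetical helper (not part of PHP): runs preg_match() and reports its
// return value, any warning text it raised, and preg_last_error().
function probe_pattern(string $pattern, string $subject = 'abc'): array
{
    $warning = null;
    // Temporarily capture warnings instead of letting them print.
    set_error_handler(function (int $errno, string $errstr) use (&$warning): bool {
        $warning = $errstr;
        return true; // mark the warning as handled
    }, E_WARNING);
    try {
        $result = preg_match($pattern, $subject);
    } finally {
        restore_error_handler();
    }
    return [
        'result'  => $result,           // 1 (match), 0 (no match), or false (error)
        'warning' => $warning,          // null when no warning was raised
        'error'   => preg_last_error(), // PREG_NO_ERROR is 0
    ];
}

// Try the four patterns from the question.
foreach ([3276, 3277] as $max) {
    foreach (['(.|\s)', '(.)'] as $group) {
        $pattern = '/^' . $group . '{0,' . $max . '}$/';
        $r = probe_pattern($pattern);
        printf(
            "%-24s result=%s error=%d warning=%s\n",
            $pattern,
            var_export($r['result'], true),
            $r['error'],
            $r['warning'] ?? 'none'
        );
    }
}
```

Comparing the result with `=== false` matters here: a loose check cannot tell a failed call (false) from a pattern that simply did not match (0).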

However, the regex itself isn't very large. Does PHP do some kind of expansion of repeated groups that makes the compiled pattern too large?

Stu asked Jul 29 '13 14:07


1 Answer

To handle Perl-compatible regular expressions, PHP just bundles a third-party library (PCRE) that takes care of the job. The behaviour you describe is actually documented, at least in Perl's own regular-expression documentation:

The "*" quantifier is equivalent to {0,}, the "+" quantifier to {1,}, and the "?" quantifier to {0,1}. n and m are limited to non-negative integral values less than a preset limit defined when perl is built. This is usually 32766 on the most common platforms.

So there's always a hard limit. Why do your tests suggest that PHP's limit is 10 times smaller than the typical one? No idea about that :)
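One way to investigate the discrepancy empirically is to measure the largest upper bound that is accepted for different group bodies. The following is only a sketch: `pattern_compiles` and `largest_accepted_bound` are hypothetical helper names, the binary search assumes that if {0,N} is accepted then every smaller bound is accepted too, and the default ceiling of 65535 is an assumption about the largest quantifier value the library allows.

```php
<?php
// Hypothetical helper: true when preg_match() accepts the pattern.
function pattern_compiles(string $pattern): bool
{
    // @ suppresses any warning; a return value of false signals the failure.
    return @preg_match($pattern, '') !== false;
}

// Hypothetical helper: largest N in [0, $ceiling] for which /^GROUP{0,N}$/ is
// accepted. Assumes acceptance is monotonic in N and that {0,0} is accepted.
function largest_accepted_bound(string $group, int $ceiling = 65535): int
{
    $lo = 0;
    $hi = $ceiling;
    while ($lo < $hi) {
        // Round up so the loop always makes progress when $hi === $lo + 1.
        $mid = intdiv($lo + $hi + 1, 2);
        if (pattern_compiles('/^' . $group . '{0,' . $mid . '}$/')) {
            $lo = $mid;      // accepted: the answer is at least $mid
        } else {
            $hi = $mid - 1;  // rejected: the answer is below $mid
        }
    }
    return $lo;
}

// Compare the two group bodies from the question.
foreach (['(.)', '(.|\s)'] as $group) {
    printf("%-8s largest accepted bound: %d\n", $group, largest_accepted_bound($group));
}
```

If the bound found differs between group bodies, that would be consistent with the expansion suspected in the question, though it does not prove it. The numbers are likely to depend on the PHP and PCRE versions installed.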

Álvaro González answered Oct 27 '22 00:10