preg_match fails to find simple regex

Due to some NDAs, the amount of information I can really disclose here is small. Unfortunately, nobody where I am has an answer for me, so I'm turning to Stack Overflow. The basics are this: in PHP, I am downloading a large-ish file (73000 characters) from an SVN repository using HTTP (either with cURL or file_get_contents), and searching for rules. All the rules are annotated with @rule, so the regex to find them ought to be

/(?<=@RULE).+?$/im

I've tested it, and it works. The problem is that, even though the file downloads properly and is converted to a string (var_dumps have confirmed this),

preg_match('/RU/',$file, $rules);

leaves $rules completely empty, despite the fact that I can SEE the appropriate matches in the var_dumped strings. I'm at my wit's end trying to figure out what's going on. No errors are being thrown (it returns 0), it doesn't seem to be running out of memory, it just tells me "Nope, nothing in there, George." Interestingly, it will find

/R/

just fine. Any ideas out there?

FrankieTheKneeMan asked Dec 14 '25 16:12


1 Answer

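Since you're only matching ASCII, the only thing I can think of is that the text is encoded as UTF-16, which stores each ASCII character as two bytes: the character itself plus a '\0' (after it in little-endian, before it in big-endian). That would explain why /R/ matches but /RU/ doesn't: the R and the U are no longer adjacent bytes.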
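To check whether this is what's happening, a small experiment along these lines may help. It is only a sketch: the sample string is a hypothetical stand-in for the downloaded file, and it assumes the mbstring extension is available.

```php
<?php
// Hypothetical sample standing in for the downloaded file.
$utf8 = "@RULE first rule\n";

// Re-encode as UTF-16LE: every ASCII character is now followed by a NUL byte.
$utf16 = mb_convert_encoding($utf8, 'UTF-16LE', 'UTF-8');

// A hex dump of the first six bytes makes the NULs visible ("@", "R", "U").
$hexPrefix = bin2hex(substr($utf16, 0, 6));

// A single letter is still one matching byte in the subject...
$singleLetter = preg_match('/R/', $utf16);

// ...but "R" and "U" are separated by "\0", so the two-letter pattern fails.
$twoLetters = preg_match('/RU/', $utf16);

var_dump($hexPrefix, $singleLetter, $twoLetters);
```

Running bin2hex() on the first few bytes of the real $file should show the same pattern of 00 bytes if the file is UTF-16. NUL bytes are usually not rendered in a browser or terminal, which would explain why the var_dump output looks fine.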

If that's the case, run this before calling preg_match():

$file = mb_convert_encoding($file, 'UTF-8', 'UTF-16');
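If you'd rather not convert blindly, one option is to look for a byte order mark first. The sketch below is an assumption-laden illustration, not part of the fix above: toUtf8IfUtf16() is a hypothetical helper name, the sample data is made up, and it only handles files that actually start with a BOM. It also uses preg_match_all() so that every rule is collected, not just the first.

```php
<?php
// Hypothetical helper: convert to UTF-8 only when a UTF-16 BOM is present,
// otherwise return the data unchanged.
function toUtf8IfUtf16(string $data): string
{
    $bom = substr($data, 0, 2);
    if ($bom === "\xFF\xFE") {
        // Little-endian BOM: strip it and convert the rest.
        return mb_convert_encoding(substr($data, 2), 'UTF-8', 'UTF-16LE');
    }
    if ($bom === "\xFE\xFF") {
        // Big-endian BOM: strip it and convert the rest.
        return mb_convert_encoding(substr($data, 2), 'UTF-8', 'UTF-16BE');
    }
    // No BOM found: assume the data needs no conversion.
    return $data;
}

// Made-up sample: a UTF-16LE file with a BOM and two annotated rules.
$sample = "\xFF\xFE" . mb_convert_encoding(
    "@RULE first rule\n@RULE second rule\n",
    'UTF-16LE',
    'UTF-8'
);

$text = toUtf8IfUtf16($sample);

// The regex from the question, applied to the converted text.
$count = preg_match_all('/(?<=@RULE).+?$/im', $text, $rules);

var_dump($count, $rules[0]);
```

A UTF-16 file without a BOM would pass through this helper untouched, so the unconditional mb_convert_encoding() call is still the simpler choice when you already know the encoding.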
Ja͢ck answered Dec 16 '25 09:12