
RegEx standards across languages

Tags:

regex

I am asking this question because I notice there are some slight differences in the syntax of RegEx between different languages.

Is there a RegEx standard that is maintained somewhere, and if so, where can I find that document? Also, if I write a regular expression in .NET, is the same expression guaranteed to be 100% compatible with other languages, such as Perl, JavaScript, or Java?

Finally, are there any "best practices" for using RegEx that help make expressions more maintainable across other platforms and languages?

Icemanind asked Oct 05 '12 04:10




1 Answer

One of the oldest sets of standardized regular expressions is the pair defined by POSIX: BRE (basic regular expressions) and ERE (extended regular expressions), documented in the Regular Expressions chapter of the POSIX specification.

Other languages may define their own standards. For example, C++11 specifies a regular expression library in clause 28 of the standard (about 46 pages). Perl defines its own regular expressions. Other languages borrow from these sources and others. Lex and Flex use their own set of regular expressions. Sed uses its own variant. And Java, JavaScript, and ... define their own versions, sometimes using PCRE (Perl-Compatible Regular Expressions) as the basis for their design. Some of the details are affected by the facilities provided by the language in which the regular expressions are being used. Consequently, an expression written for one engine is not guaranteed to work unchanged in another.
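To make the dialect differences concrete, here is a minimal sketch in Python, assuming its built-in `re` module. It is only an illustration, not part of any standard: the helper names `compiles_in_python` and `extract_year` and the two pattern constants are made up for this example. A named group written the way .NET accepts it, `(?<year>...)`, is rejected by `re`, which expects `(?P<year>...)` instead.

```python
import re


def compiles_in_python(pattern: str) -> bool:
    """Return True if Python's built-in `re` module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# The same idea (a named group capturing a four-digit year) in two dialects.
DOTNET_STYLE = r"(?<year>\d{4})"   # named-group syntax accepted by .NET
PYTHON_STYLE = r"(?P<year>\d{4})"  # named-group syntax Python's `re` expects


def extract_year(text: str):
    """Return the first four-digit run captured by the named group, or None."""
    match = re.search(PYTHON_STYLE, text)
    return match.group("year") if match else None


if __name__ == "__main__":
    for label, pattern in [(".NET style", DOTNET_STYLE), ("Python style", PYTHON_STYLE)]:
        print(label, "compiles in Python:", compiles_in_python(pattern))
    print(extract_year("asked Oct 05 2012"))
```

A check like `compiles_in_python` only tells you whether one engine accepts the syntax. Two engines can both accept a pattern and still match differently, so testing against each target engine remains the safer approach.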

Jeff Friedl's book Mastering Regular Expressions covers a lot of different sets of regular expressions, identifying what's common and what's different.

Jonathan Leffler answered Sep 19 '22 05:09