
Alternatives to Regular Expressions

I have a set of strings with numbers embedded in them. They look something like /cal/long/3/4/145:999 or /pa/metrics/CosmicRay/24:4:bgp:EnergyKurtosis. I'd like to have an expression parser that:

  • Is easy to use. Given a few examples, someone should be able to form a new expression. I want end users to be able to form new expressions to query this set of strings. Some of the potential users are software engineers, others are testers, and some are scientists.
  • Allows for constraints on numbers. Something like '/cal/long/3/4/143:#>100&<1110' would specify that a string prefixed with '/cal/long/3/4/143:' and followed by a number in the open interval (100, 1110) is expected.
  • Supports '|' and '()'. So the expression '/cal/(long|short)/3/4/' would match '/cal/long/3/4/1:2' as well as '/cal/short/3/4/1:2'.
  • Has a Java implementation available or would be easy to implement in Java.

Interesting alternative ideas would be useful. I'm also entertaining the idea of just implementing the subset of regular expressions that I need plus the numerical constraints.
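One way the "regex subset plus numerical constraints" idea might look in Java is sketched below. It assumes the constraint always appears at the end of the expression in the form '#>LOW&<HIGH', treats everything before it as an ordinary java.util.regex pattern, and checks the captured number in plain Java. The class name BoundedNumberPattern and its methods are hypothetical, made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: a regex prefix followed by one number with exclusive bounds. */
final class BoundedNumberPattern {
    // Recognizes a trailing constraint such as "#>100&<1110" (assumed syntax).
    private static final Pattern CONSTRAINT = Pattern.compile("#>(\\d+)&<(\\d+)$");

    private final Pattern regex;
    private final long lower; // exclusive lower bound
    private final long upper; // exclusive upper bound

    private BoundedNumberPattern(Pattern regex, long lower, long upper) {
        this.regex = regex;
        this.lower = lower;
        this.upper = upper;
    }

    static BoundedNumberPattern compile(String expression) {
        Matcher m = CONSTRAINT.matcher(expression);
        if (!m.find()) {
            throw new IllegalArgumentException(
                "expected a trailing #>LOW&<HIGH constraint: " + expression);
        }
        // Everything before '#' is passed to the regex engine unchanged,
        // so '|' and '()' work as usual; the number becomes the last capture group.
        String prefix = expression.substring(0, m.start());
        Pattern regex = Pattern.compile(prefix + "(\\d+)");
        return new BoundedNumberPattern(
            regex, Long.parseLong(m.group(1)), Long.parseLong(m.group(2)));
    }

    boolean matches(String input) {
        Matcher m = regex.matcher(input);
        if (!m.matches()) {
            return false;
        }
        try {
            // The prefix may contain its own groups, so read the last one.
            long value = Long.parseLong(m.group(m.groupCount()));
            return value > lower && value < upper;
        } catch (NumberFormatException e) {
            return false; // digit run too long to fit in a long
        }
    }

    public static void main(String[] args) {
        BoundedNumberPattern p = compile("/cal/long/3/4/143:#>100&<1110");
        System.out.println(p.matches("/cal/long/3/4/143:999"));
        System.out.println(p.matches("/cal/long/3/4/143:1110"));
    }
}
```

Keeping the numeric comparison outside the regex avoids encoding a range as digit patterns, which tends to be hard for end users to read or write.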

Thanks!

Sean McCauliff asked Feb 05 '09 02:02


1 Answer

There's no reason to reinvent the wheel! The core of a regular expression engine is built on a strong foundation of mathematics and computer science; we continue to use regular expressions today because they are sound in principle and won't be improved upon in the foreseeable future.

If you do find or create an alternative parsing language that covers only a subset of what regex can express, a user will quickly ask for something that regex can express but your flavor simply leaves out. Spend your time solving problems that haven't been solved instead!
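As a small illustration, the alternation example from the question already works with the standard java.util.regex package. The sketch below uses Matcher.lookingAt() so that the pattern only has to match the start of the string; the class and method names are made up for this example.

```java
import java.util.regex.Pattern;

final class PrefixAlternationDemo {
    // Alternation and grouping are standard regex syntax; no custom parser is needed.
    private static final Pattern CAL_PREFIX = Pattern.compile("/cal/(long|short)/3/4/");

    /** True when the input starts with the pattern; the rest of the string is ignored. */
    static boolean hasCalPrefix(String input) {
        return CAL_PREFIX.matcher(input).lookingAt();
    }

    public static void main(String[] args) {
        System.out.println(hasCalPrefix("/cal/long/3/4/1:2"));   // true
        System.out.println(hasCalPrefix("/cal/short/3/4/1:2"));  // true
        System.out.println(hasCalPrefix("/cal/medium/3/4/1:2")); // false
    }
}
```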

Rex M answered Oct 12 '22 08:10
