
Why can't Regular Expressions use keywords instead of characters?

Okay, I barely understand RegEx basics, but why couldn't they design it to use keywords (like SQL) instead of some cryptic wildcard characters and symbols?

Is it for performance, since a RegEx is interpreted/parsed at runtime rather than compiled?

Or maybe for speed of writing? Once you have learned a few "simple" character combinations, it is quicker to type one character than a whole keyword.

Robin Rodricks asked Mar 10 '09 10:03


2 Answers

You really want this?

Pattern findGamesPattern = Pattern.With.Literal(@"<div")
    .WhiteSpace.Repeat.ZeroOrMore
    .Literal(@"class=""game""").WhiteSpace.Repeat.ZeroOrMore.Literal(@"id=""")
    .NamedGroup("gameId", Pattern.With.Digit.Repeat.OneOrMore)
    .Literal(@"-game""")
    .NamedGroup("content", Pattern.With.Anything.Repeat.Lazy.ZeroOrMore)
    .Literal(@"<!--gameStatus")
    .WhiteSpace.Repeat.ZeroOrMore.Literal("=").WhiteSpace.Repeat.ZeroOrMore
    .NamedGroup("gameState", Pattern.With.Digit.Repeat.OneOrMore)
    .Literal("-->");

Ok, but it's your funeral, man.

Download the library that does this here:
http://flimflan.com/blog/ReadableRegularExpressions.aspx
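For comparison, here is a rough sketch of the same pattern in conventional regex syntax, written in Python. The translation is an assumption: it supposes that `WhiteSpace` maps to `\s`, `Digit` to `\d`, and `Anything` to `.` (with `re.DOTALL` so the content may span lines), which the library's documentation would have to confirm. The sample HTML string is invented for illustration.

```python
import re

# Assumed hand translation of the fluent builder into a classic regex.
# The three named groups mirror the NamedGroup(...) calls.
GAME_PATTERN = re.compile(
    r'<div\s*class="game"\s*id="(?P<gameId>\d+)-game"'
    r'(?P<content>.*?)'                      # lazy: stop at the first status comment
    r'<!--gameStatus\s*=\s*(?P<gameState>\d+)-->',
    re.DOTALL,                               # assumption: "Anything" includes newlines
)

# Hypothetical input, not taken from the original answer.
sample = '<div class="game" id="42-game">Chess<!--gameStatus = 3--></div>'

match = GAME_PATTERN.search(sample)
if match:
    print(match.group("gameId"), match.group("gameState"))
```

Three short lines of symbols replace ten lines of keywords, which is the trade-off the question is about.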

Jeff Atwood answered Oct 01 '22 00:10


Regular expressions have a mathematical (actually, language theory) background and are written somewhat like a mathematical formula. You can define them by a set of rules, for example:

  • every character is a regular expression, representing itself
  • if a and b are regular expressions, then a?, a|b and ab are regular expressions, too
  • ...
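The rules above can be sketched as a tiny matcher in Python. This is only an illustration of the inductive definition, not how real regex engines are built. Every name here (`lit`, `opt`, `alt`, `cat`, `seq`, `fullmatch`) is made up for the example. Each expression is modelled as a function that takes a string and a start position and returns the set of positions where a match could end.

```python
from functools import reduce

def lit(ch):
    """Rule 1: a single character is an expression that matches itself."""
    return lambda s, i: {i + 1} if s[i:i + 1] == ch else set()

def opt(a):
    """a?: either skip a (stay at i) or match a."""
    return lambda s, i: {i} | a(s, i)

def alt(a, b):
    """a|b: every end position reachable through a or through b."""
    return lambda s, i: a(s, i) | b(s, i)

def cat(a, b):
    """ab: match a, then match b starting wherever a ended."""
    return lambda s, i: {k for j in a(s, i) for k in b(s, j)}

def seq(*parts):
    """Convenience helper: concatenate several expressions with cat."""
    return reduce(cat, parts)

def fullmatch(rx, s):
    """True if rx can consume the whole string s."""
    return len(s) in rx(s, 0)

# The classic pattern colou?r, built only from the rules.
colour = seq(lit("c"), lit("o"), lit("l"), lit("o"), opt(lit("u")), lit("r"))

print(fullmatch(colour, "color"), fullmatch(colour, "colour"), fullmatch(colour, "colr"))
```

Written this way, `colou?r` takes a full line of nested calls, which hints at why the compact symbolic notation caught on.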

Using a keyword-based language would be a great burden for simple regular expressions. Most of the time, you will just use a plain text string as a search pattern:

grep -R 'main' *.c

Or maybe very simple patterns:

grep -c ':-[)(]' seidl.txt
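To make the second pattern concrete, here is a small Python sketch that mimics what `grep -c` counts: the number of lines containing `:-` followed by either `)` or `(`. The sample lines are invented, since the contents of seidl.txt are not shown.

```python
import re

# ':-' followed by one character from the class [)(], i.e. :-) or :-(
SMILEY = re.compile(r":-[)(]")

# Hypothetical stand-in for the lines of seidl.txt.
lines = [
    "Nice to see you :-)",
    "No emoticon here",
    "Missed the train :-( but made it :-)",
]

# Like grep -c, count matching lines, not individual matches.
count = sum(1 for line in lines if SMILEY.search(line))
print(count)
```

Inside a character class the parentheses need no escaping, which is why the pattern stays this short.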

Once you get used to regular expressions, this syntax is very clear and precise. In more complicated situations you will probably use something else, since a large regular expression is hard to read.

Ferdinand Beyer answered Oct 01 '22 01:10