
Why is this regular expression faster?

Tags: regex, ansi
I'm writing a Telnet client of sorts in C# and part of what I have to parse are ANSI/VT100 escape sequences, specifically, just those used for colour and formatting (detailed here).

One of my methods finds all the codes and removes them, so I can render the text without any formatting if needed:

    
public static string StripStringFormating(string formattedString)
{
    if (rTest.IsMatch(formattedString))
        return rTest.Replace(formattedString, string.Empty);
    else
        return formattedString;
}

I'm new to regular expressions, and it was suggested that I use this:

static Regex rTest = new Regex(@"\e\[[\d;]+m", RegexOptions.Compiled);

However, this failed if the escape code was incomplete due to an error on the server. So then this was suggested, but my friend warned it might be slower (this one also matches another condition (z) that I might come across later):

static Regex rTest = 
              new Regex(@"(\e(\[([\d;]*[mz]?))?)?", RegexOptions.Compiled);
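To make the difference between the two patterns concrete, here is a minimal sketch that runs both against a complete and a truncated colour sequence. The class name `AnsiStripDemo`, the helper names, and the sample strings are made up for illustration; only the two patterns come from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnsiStripDemo
{
    // The two patterns quoted in the question.
    static readonly Regex Strict =
        new Regex(@"\e\[[\d;]+m", RegexOptions.Compiled);
    static readonly Regex Lenient =
        new Regex(@"(\e(\[([\d;]*[mz]?))?)?", RegexOptions.Compiled);

    // Hypothetical helpers: strip with each pattern in a single pass.
    public static string StripStrict(string s) => Strict.Replace(s, string.Empty);
    public static string StripLenient(string s) => Lenient.Replace(s, string.Empty);

    public static void Main()
    {
        // Illustrative inputs, not real server output.
        string complete = "\u001b[1;31mred\u001b[0m";
        string truncated = "\u001b[1;31red"; // terminating 'm' is missing

        // Both patterns remove well-formed sequences.
        Console.WriteLine(StripStrict(complete));   // red
        Console.WriteLine(StripLenient(complete));  // red

        // The strict pattern requires the final 'm', so the broken code stays.
        Console.WriteLine(StripStrict(truncated) == truncated); // True

        // The lenient pattern makes every part optional and removes the fragment.
        Console.WriteLine(StripLenient(truncated)); // red
    }
}
```

The sketch only shows what each pattern matches; it does not measure speed, so any timing comparison would still need to be run on real input.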

This not only worked, but was in fact faster and reduced the impact on my text rendering. Can someone explain to a regexp newbie why? :)

Nidonocu asked Aug 07 '08 15:08



1 Answer

Do you really want to run the regexp twice? Without having checked (bad me), I would have thought that this would work well:

public static string StripStringFormating(string formattedString)
{    
    return rTest.Replace(formattedString, string.Empty);
}

If it does, you should see it run ~twice as fast...
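As a rough check of that suggestion, the sketch below compares the guarded version from the question with the single `Replace` call. The class name `SinglePassDemo`, the method names, and the sample strings are assumptions for illustration. One thing it shows: because every group in the second pattern is optional, the pattern can match the empty string, so `IsMatch` returns true for any input and the `else` branch of the guard is never taken.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SinglePassDemo
{
    // The second pattern from the question.
    static readonly Regex rTest =
        new Regex(@"(\e(\[([\d;]*[mz]?))?)?", RegexOptions.Compiled);

    // Shape used in the question: call IsMatch, then Replace.
    public static string StripGuarded(string s) =>
        rTest.IsMatch(s) ? rTest.Replace(s, string.Empty) : s;

    // Shape suggested in the answer: one Replace call.
    public static string StripDirect(string s) =>
        rTest.Replace(s, string.Empty);

    public static void Main()
    {
        // Illustrative inputs, not real server output.
        string plain = "no escape codes here";
        string coloured = "\u001b[32mok\u001b[0m";

        // True even without escape codes: the pattern matches an empty string.
        Console.WriteLine(rTest.IsMatch(plain));

        // Both versions return the same text for both inputs.
        Console.WriteLine(StripGuarded(plain) == StripDirect(plain));       // True
        Console.WriteLine(StripGuarded(coloured) == StripDirect(coloured)); // True
        Console.WriteLine(StripDirect(coloured));                           // ok
    }
}
```

Whether dropping the guard actually doubles the speed depends on the pattern and the input, so it is worth timing both versions on representative text before relying on that figure.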

Oskar answered Sep 22 '22 11:09