What would be faster?
public String Roll() { Random rnd = new Random(); int roll = rnd.Next(1, 100000); if (Regex.IsMatch(roll.ToString(), @"(.)\1{1,}$")) { return "doubles"; } return "none"; }
Or
public String Roll() { Random rnd = new Random(); int roll = rnd.Next(1, 100000); if (roll.ToString().EndsWith("11") || roll.ToString().EndsWith("22") || roll.ToString().EndsWith("33") || roll.ToString().EndsWith("44") || roll.ToString().EndsWith("55") || roll.ToString().EndsWith("66") || roll.ToString().EndsWith("77") || roll.ToString().EndsWith("88") || roll.ToString().EndsWith("99") || roll.ToString().EndsWith("00")) { return "doubles"; } return "none"; }
I'm currently using a really long if-statement list full with regexes to check if an int ends with doubles, triples, quads, quints, etc... so I would like to know which one is faster before I change all of it.
The bad regular expression took on average 10,100 milliseconds to process all 1,000,000 lines, while the good regular expression took just 240 milliseconds.
String operations will always be faster than regular expression operations. Unless, of course, you write the string operations in an inefficient way. Regular expressions have to be parsed, and code generated to perform the operation using string operations.
Regex is faster for large string than an if (perhaps in a for loops) to check if anything matches your requirement.
In this case it's the 'OR's in the EndsWith that make it slower than the Regex. On the other hand, a Regex is usually something you use multiple times, which means that you only have to compile it once, and simply look it up for each call.
There's overhead in determining the NFA/DFA from the expression, compiling the program and for each call looking up the program and evaluating it. For very simple cases, an EndsWith beats the Regex. In this case it's the 'OR's in the EndsWith that make it slower than the Regex.
Less obvious are distant but overlapping repetitions ( /.*-some text-.*;/ ). More symptoms in can be found this well-written post. Benchmarks may assure that regex has good performance. However, it’s not enough to test it on a single matching string. We need to try to move matching part inside the test string.
Being more specific with your regular expressions, even if they become much longer, can make a world of difference in performance. The fewer characters you scan to determine the match, the faster your regexes will be.
In your particular case, Regex
is actually faster... but it is likely because you use EndsWith
with many OR
and redundant ToString()
. If you simplify the logic, simple string
operation will likely be faster.
Here is the performance summary for text processing - from the fastest to the slowest (10 millions loop [Prefer/Non-Prefer 32-bit] - rank is ordered based on the fastest of the two):
UInt
(not counted for bounty): 219/273 ms - Mine, improved from Evk's Large Lookup Parameterless Random: 248/287 ms - Thomas Ayoub
There are few notes I want to make on this solution (based on the comments below it):
Random
(from Random.Next(int)
to Random.Next()
), which is used for the testing itself. While the testing cannot be performed with the exact same number sequence for this method as for the rests (as mentioned by Phil1970), I have two points to make:
Random
in the real case itself will (most of the time) not assume the number sequence to be the same (or predictable) - It is only for testing our method we want the Random
sequence to be identical (for fair unit testing purpose). The expected faster result for this method cannot be fully derived from the testing result alone, but by also looking at the Next() vs Next(int)
implementation.Large Look-up: 320/284 ms - Evk
Here are my simple test cases:
Random rnd = new Random(10000); FastRandom fastRnd = new FastRandom(10000); //OP's Solution 2 public String RollRegex() { int roll = rnd.Next(1, 100000); if (Regex.IsMatch(roll.ToString(), @"(.)\1{1,}$")) { return "doubles"; } else { return "none"; } } //Radin Gospodinov's Solution Regex optionRegex = new Regex(@"(.)\1{1,}$", RegexOptions.Compiled); public String RollOptionRegex() { int roll = rnd.Next(1, 100000); string rollString = roll.ToString(); if (optionRegex.IsMatch(roll.ToString())) { return "doubles"; } else { return "none"; } } //OP's Solution 1 public String RollEndsWith() { int roll = rnd.Next(1, 100000); if (roll.ToString().EndsWith("11") || roll.ToString().EndsWith("22") || roll.ToString().EndsWith("33") || roll.ToString().EndsWith("44") || roll.ToString().EndsWith("55") || roll.ToString().EndsWith("66") || roll.ToString().EndsWith("77") || roll.ToString().EndsWith("88") || roll.ToString().EndsWith("99") || roll.ToString().EndsWith("00")) { return "doubles"; } else { return "none"; } } //My Solution public String RollSimple() { int roll = rnd.Next(1, 100000); string rollString = roll.ToString(); return roll > 10 && rollString[rollString.Length - 1] == rollString[rollString.Length - 2] ? "doubles" : "none"; } //My Other Solution List<string> doubles = new List<string>() { "00", "11", "22", "33", "44", "55", "66", "77", "88", "99" }; public String RollAnotherSimple() { int roll = rnd.Next(1, 100000); string rollString = roll.ToString(); return rollString.Length > 1 && doubles.Contains(rollString.Substring(rollString.Length - 2)) ? "doubles" : "none"; } //Dandré's Solution HashSet<string> doublesHashset = new HashSet<string>() { "00", "11", "22", "33", "44", "55", "66", "77", "88", "99" }; public String RollSimpleHashSet() { int roll = rnd.Next(1, 100000); string rollString = roll.ToString(); return rollString.Length > 1 && doublesHashset.Contains(rollString.Substring(rollString.Length - 2)) ? "doubles" : "none"; } //Corak's Solution - hinted by Alexei Levenkov too public string RollModded() { int roll = rnd.Next(1, 100000); return roll % 100 % 11 == 0 ? "doubles" : "none"; } //Stian Standahl optimizes modded solution public string RollOptimizedModded() { return rnd.Next(1, 100000) % 100 % 11 == 0 ? "doubles" : "none"; } //Gjermund Grøneng's method with constant addition private const string CONST_DOUBLES = "doubles"; private const string CONST_NONE = "none"; public string RollOptimizedModdedConst() { return rnd.Next(1, 100000) % 100 % 11 == 0 ? CONST_DOUBLES : CONST_NONE; } //Gjermund Grøneng's method after optimizing the Random (The fastest!) public string FastestOptimizedModded() { return (rnd.Next(99999) + 1) % 100 % 11 == 0 ? CONST_DOUBLES : CONST_NONE; } //Corak's Solution, added on Gjermund Grøneng's private readonly string[] Lookup = { "doubles", "none", "none", "none", "none", "none", "none", "none", "none", "none", "none" }; public string RollLookupOptimizedModded() { return Lookup[(rnd.Next(99999) + 1) % 100 % 11]; } //Evk's Solution, large Lookup private string[] LargeLookup; private void InitLargeLookup() { LargeLookup = new string[100000]; for (int i = 0; i < 100000; i++) { LargeLookup[i] = i % 100 % 11 == 0 ? "doubles" : "none"; } } public string RollLargeLookup() { return LargeLookup[rnd.Next(99999) + 1]; } //Thomas Ayoub's Solution public string RollLargeLookupParameterlessRandom() { return LargeLookup[rnd.Next() % 100000]; } //Alois Kraus's Solution public string RollNumbers() { int roll = rnd.Next(1, 100000); int lastDigit = roll % 10; int secondLastDigit = (roll / 10) % 10; if (lastDigit == secondLastDigit) { return "doubles"; } else { return "none"; } } //Ivan Stoev's Solution public string FastestOptimizedRandomModded() { return ((int)(rnd.Next() * (99999.0 / int.MaxValue)) + 1) % 100 % 11 == 0 ? CONST_DOUBLES : CONST_NONE; } //Ivan Stoev's Alternate Solution public string RollLargeLookupOptimizedRandom() { return LargeLookup[(int)(rnd.Next() * (99999.0 / int.MaxValue))]; } //Evk's Solution using FastRandom public string RollLargeLookupFastRandom() { return LargeLookup[fastRnd.Next(99999) + 1]; } //My Own Test, using FastRandom + NextUInt public string RollLargeLookupFastRandomUInt() { return LargeLookup[fastRnd.NextUInt() % 99999 + 1]; }
The additional FastRandom
class:
//FastRandom's part used for the testing public class FastRandom { // The +1 ensures NextDouble doesn't generate 1.0 const double REAL_UNIT_INT = 1.0 / ((double)int.MaxValue + 1.0); const double REAL_UNIT_UINT = 1.0 / ((double)uint.MaxValue + 1.0); const uint Y = 842502087, Z = 3579807591, W = 273326509; uint x, y, z, w; #region Constructors /// <summary> /// Initialises a new instance using time dependent seed. /// </summary> public FastRandom() { // Initialise using the system tick count. Reinitialise((int)Environment.TickCount); } /// <summary> /// Initialises a new instance using an int value as seed. /// This constructor signature is provided to maintain compatibility with /// System.Random /// </summary> public FastRandom(int seed) { Reinitialise(seed); } #endregion #region Public Methods [Reinitialisation] /// <summary> /// Reinitialises using an int value as a seed. /// </summary> /// <param name="seed"></param> public void Reinitialise(int seed) { // The only stipulation stated for the xorshift RNG is that at least one of // the seeds x,y,z,w is non-zero. We fulfill that requirement by only allowing // resetting of the x seed x = (uint)seed; y = Y; z = Z; w = W; } #endregion #region Public Methods [System.Random functionally equivalent methods] /// <summary> /// Generates a random int over the range 0 to int.MaxValue-1. /// MaxValue is not generated in order to remain functionally equivalent to System.Random.Next(). /// This does slightly eat into some of the performance gain over System.Random, but not much. /// For better performance see: /// /// Call NextInt() for an int over the range 0 to int.MaxValue. /// /// Call NextUInt() and cast the result to an int to generate an int over the full Int32 value range /// including negative values. /// </summary> /// <returns></returns> public int Next() { uint t = (x ^ (x << 11)); x = y; y = z; z = w; w = (w ^ (w >> 19)) ^ (t ^ (t >> 8)); // Handle the special case where the value int.MaxValue is generated. This is outside of // the range of permitted values, so we therefore call Next() to try again. uint rtn = w & 0x7FFFFFFF; if (rtn == 0x7FFFFFFF) return Next(); return (int)rtn; } /// <summary> /// Generates a random int over the range 0 to upperBound-1, and not including upperBound. /// </summary> /// <param name="upperBound"></param> /// <returns></returns> public int Next(int upperBound) { if (upperBound < 0) throw new ArgumentOutOfRangeException("upperBound", upperBound, "upperBound must be >=0"); uint t = (x ^ (x << 11)); x = y; y = z; z = w; // The explicit int cast before the first multiplication gives better performance. // See comments in NextDouble. return (int)((REAL_UNIT_INT * (int)(0x7FFFFFFF & (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8))))) * upperBound); } /// <summary> /// Generates a random int over the range lowerBound to upperBound-1, and not including upperBound. /// upperBound must be >= lowerBound. lowerBound may be negative. /// </summary> /// <param name="lowerBound"></param> /// <param name="upperBound"></param> /// <returns></returns> public int Next(int lowerBound, int upperBound) { if (lowerBound > upperBound) throw new ArgumentOutOfRangeException("upperBound", upperBound, "upperBound must be >=lowerBound"); uint t = (x ^ (x << 11)); x = y; y = z; z = w; // The explicit int cast before the first multiplication gives better performance. // See comments in NextDouble. int range = upperBound - lowerBound; if (range < 0) { // If range is <0 then an overflow has occured and must resort to using long integer arithmetic instead (slower). // We also must use all 32 bits of precision, instead of the normal 31, which again is slower. return lowerBound + (int)((REAL_UNIT_UINT * (double)(w = (w ^ (w >> 19)) ^ (t ^ (t >> 8)))) * (double)((long)upperBound - (long)lowerBound)); } // 31 bits of precision will suffice if range<=int.MaxValue. This allows us to cast to an int and gain // a little more performance. return lowerBound + (int)((REAL_UNIT_INT * (double)(int)(0x7FFFFFFF & (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8))))) * (double)range); } /// <summary> /// Generates a random double. Values returned are from 0.0 up to but not including 1.0. /// </summary> /// <returns></returns> public double NextDouble() { uint t = (x ^ (x << 11)); x = y; y = z; z = w; // Here we can gain a 2x speed improvement by generating a value that can be cast to // an int instead of the more easily available uint. If we then explicitly cast to an // int the compiler will then cast the int to a double to perform the multiplication, // this final cast is a lot faster than casting from a uint to a double. The extra cast // to an int is very fast (the allocated bits remain the same) and so the overall effect // of the extra cast is a significant performance improvement. // // Also note that the loss of one bit of precision is equivalent to what occurs within // System.Random. return (REAL_UNIT_INT * (int)(0x7FFFFFFF & (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8))))); } /// <summary> /// Fills the provided byte array with random bytes. /// This method is functionally equivalent to System.Random.NextBytes(). /// </summary> /// <param name="buffer"></param> public void NextBytes(byte[] buffer) { // Fill up the bulk of the buffer in chunks of 4 bytes at a time. uint x = this.x, y = this.y, z = this.z, w = this.w; int i = 0; uint t; for (int bound = buffer.Length - 3; i < bound; ) { // Generate 4 bytes. // Increased performance is achieved by generating 4 random bytes per loop. // Also note that no mask needs to be applied to zero out the higher order bytes before // casting because the cast ignores thos bytes. Thanks to Stefan Troschьtz for pointing this out. t = (x ^ (x << 11)); x = y; y = z; z = w; w = (w ^ (w >> 19)) ^ (t ^ (t >> 8)); buffer[i++] = (byte)w; buffer[i++] = (byte)(w >> 8); buffer[i++] = (byte)(w >> 16); buffer[i++] = (byte)(w >> 24); } // Fill up any remaining bytes in the buffer. if (i < buffer.Length) { // Generate 4 bytes. t = (x ^ (x << 11)); x = y; y = z; z = w; w = (w ^ (w >> 19)) ^ (t ^ (t >> 8)); buffer[i++] = (byte)w; if (i < buffer.Length) { buffer[i++] = (byte)(w >> 8); if (i < buffer.Length) { buffer[i++] = (byte)(w >> 16); if (i < buffer.Length) { buffer[i] = (byte)(w >> 24); } } } } this.x = x; this.y = y; this.z = z; this.w = w; } // /// <summary> // /// A version of NextBytes that uses a pointer to set 4 bytes of the byte buffer in one operation // /// thus providing a nice speedup. The loop is also partially unrolled to allow out-of-order-execution, // /// this results in about a x2 speedup on an AMD Athlon. Thus performance may vary wildly on different CPUs // /// depending on the number of execution units available. // /// // /// Another significant speedup is obtained by setting the 4 bytes by indexing pDWord (e.g. pDWord[i++]=w) // /// instead of adjusting it dereferencing it (e.g. *pDWord++=w). // /// // /// Note that this routine requires the unsafe compilation flag to be specified and so is commented out by default. // /// </summary> // /// <param name="buffer"></param> // public unsafe void NextBytesUnsafe(byte[] buffer) // { // if(buffer.Length % 8 != 0) // throw new ArgumentException("Buffer length must be divisible by 8", "buffer"); // // uint x=this.x, y=this.y, z=this.z, w=this.w; // // fixed(byte* pByte0 = buffer) // { // uint* pDWord = (uint*)pByte0; // for(int i=0, len=buffer.Length>>2; i < len; i+=2) // { // uint t=(x^(x<<11)); // x=y; y=z; z=w; // pDWord[i] = w = (w^(w>>19))^(t^(t>>8)); // // t=(x^(x<<11)); // x=y; y=z; z=w; // pDWord[i+1] = w = (w^(w>>19))^(t^(t>>8)); // } // } // // this.x=x; this.y=y; this.z=z; this.w=w; // } #endregion #region Public Methods [Methods not present on System.Random] /// <summary> /// Generates a uint. Values returned are over the full range of a uint, /// uint.MinValue to uint.MaxValue, inclusive. /// /// This is the fastest method for generating a single random number because the underlying /// random number generator algorithm generates 32 random bits that can be cast directly to /// a uint. /// </summary> /// <returns></returns> public uint NextUInt() { uint t = (x ^ (x << 11)); x = y; y = z; z = w; return (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8))); } /// <summary> /// Generates a random int over the range 0 to int.MaxValue, inclusive. /// This method differs from Next() only in that the range is 0 to int.MaxValue /// and not 0 to int.MaxValue-1. /// /// The slight difference in range means this method is slightly faster than Next() /// but is not functionally equivalent to System.Random.Next(). /// </summary> /// <returns></returns> public int NextInt() { uint t = (x ^ (x << 11)); x = y; y = z; z = w; return (int)(0x7FFFFFFF & (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8)))); } // Buffer 32 bits in bitBuffer, return 1 at a time, keep track of how many have been returned // with bitBufferIdx. uint bitBuffer; uint bitMask = 1; /// <summary> /// Generates a single random bit. /// This method's performance is improved by generating 32 bits in one operation and storing them /// ready for future calls. /// </summary> /// <returns></returns> public bool NextBool() { if (bitMask == 1) { // Generate 32 more bits. uint t = (x ^ (x << 11)); x = y; y = z; z = w; bitBuffer = w = (w ^ (w >> 19)) ^ (t ^ (t >> 8)); // Reset the bitMask that tells us which bit to read next. bitMask = 0x80000000; return (bitBuffer & bitMask) == 0; } return (bitBuffer & (bitMask >>= 1)) == 0; } #endregion }
The test scenario:
public delegate string RollDelegate(); private void Test() { List<string> rollMethodNames = new List<string>(){ "Large Lookup Fast Random UInt", "Large Lookup Fast Random", "Large Lookup Optimized Random", "Fastest Optimized Random Modded", "Numbers", "Large Lookup Parameterless Random", "Large Lookup", "Lookup Optimized Modded", "Fastest Optimized Modded", "Optimized Modded Const", "Optimized Modded", "Modded", "Simple", "Another simple with HashSet", "Another Simple", "Option (Compiled) Regex", "Regex", "EndsWith", }; List<RollDelegate> rollMethods = new List<RollDelegate>{ RollLargeLookupFastRandomUInt, RollLargeLookupFastRandom, RollLargeLookupOptimizedRandom, FastestOptimizedRandomModded, RollNumbers, RollLargeLookupParameterlessRandom, RollLargeLookup, RollLookupOptimizedModded, FastestOptimizedModded, RollOptimizedModdedConst, RollOptimizedModded, RollModded, RollSimple, RollSimpleHashSet, RollAnotherSimple, RollOptionRegex, RollRegex, RollEndsWith }; int trial = 10000000; InitLargeLookup(); for (int k = 0; k < rollMethods.Count; ++k) { rnd = new Random(10000); fastRnd = new FastRandom(10000); logBox.GetTimeLapse(); for (int i = 0; i < trial; ++i) rollMethods[k](); logBox.WriteTimedLogLine(rollMethodNames[k] + ": " + logBox.GetTimeLapse()); } }
The result (Prefer 32-Bit):
[2016-05-30 08:20:54.056 UTC] Large Lookup Fast Random UInt: 219 ms [2016-05-30 08:20:54.296 UTC] Large Lookup Fast Random: 238 ms [2016-05-30 08:20:54.524 UTC] Large Lookup Optimized Random: 228 ms [2016-05-30 08:20:54.810 UTC] Fastest Optimized Random Modded: 286 ms [2016-05-30 08:20:55.347 UTC] Numbers: 537 ms [2016-05-30 08:20:55.596 UTC] Large Lookup Parameterless Random: 248 ms [2016-05-30 08:20:55.916 UTC] Large Lookup: 320 ms [2016-05-30 08:20:56.231 UTC] Lookup Optimized Modded: 315 ms [2016-05-30 08:20:56.577 UTC] Fastest Optimized Modded: 345 ms [2016-05-30 08:20:57.049 UTC] Optimized Modded Const: 472 ms [2016-05-30 08:20:57.521 UTC] Optimized Modded: 471 ms [2016-05-30 08:20:58.017 UTC] Modded: 496 ms [2016-05-30 08:20:59.685 UTC] Simple: 1668 ms [2016-05-30 08:21:01.824 UTC] Another simple with HashSet: 2138 ms [2016-05-30 08:21:04.837 UTC] Another Simple: 3013 ms [2016-05-30 08:21:13.794 UTC] Option (Compiled) Regex: 8956 ms [2016-05-30 08:21:28.827 UTC] Regex: 15032 ms [2016-05-30 08:21:53.589 UTC] EndsWith: 24763 ms
The result (Non Prefer 32-Bit):
[2016-05-30 08:16:00.934 UTC] Large Lookup Fast Random UInt: 273 ms [2016-05-30 08:16:01.230 UTC] Large Lookup Fast Random: 294 ms [2016-05-30 08:16:01.503 UTC] Large Lookup Optimized Random: 273 ms [2016-05-30 08:16:01.837 UTC] Fastest Optimized Random Modded: 333 ms [2016-05-30 08:16:02.245 UTC] Numbers: 408 ms [2016-05-30 08:16:02.532 UTC] Large Lookup Parameterless Random: 287 ms [2016-05-30 08:16:02.816 UTC] Large Lookup: 284 ms [2016-05-30 08:16:03.145 UTC] Lookup Optimized Modded: 329 ms [2016-05-30 08:16:03.486 UTC] Fastest Optimized Modded: 340 ms [2016-05-30 08:16:03.824 UTC] Optimized Modded Const: 337 ms [2016-05-30 08:16:04.154 UTC] Optimized Modded: 330 ms [2016-05-30 08:16:04.524 UTC] Modded: 370 ms [2016-05-30 08:16:05.700 UTC] Simple: 1176 ms [2016-05-30 08:16:07.309 UTC] Another simple with HashSet: 1609 ms [2016-05-30 08:16:09.774 UTC] Another Simple: 2465 ms [2016-05-30 08:16:17.450 UTC] Option (Compiled) Regex: 7675 ms [2016-05-30 08:16:34.090 UTC] Regex: 16640 ms [2016-05-30 08:16:54.793 UTC] EndsWith: 20702 ms
And the picture:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With