I'm trying to create a program that parses data from a game's chat log. So far I have managed to get the program to work and parse the data that I want, but my problem is that the program is getting slower.
Currently it takes 5 seconds to parse a 10 MB text file, and I noticed that it drops to 3 seconds if I add RegexOptions.Compiled to my regexes.
I believe I have pinpointed the problem to my regex matches. Each line is currently scanned 5 times, once per regex, so the program will get even slower when I add more regexes later.
What should I do so that my program does not slow down with multiple regexes? All suggestions to make the code better are appreciated!
    if (sender.Equals(ButtonParse))
    {
        var totalShots = 0f;
        var totalHits = 0f;
        var misses = 0;
        var crits = 0;
        var regDmg = new Regex(@"(?<=\bSystem\b.* You inflicted )\d+.\d", RegexOptions.Compiled);
        var regMiss = new Regex(@"(?<=\bSystem\b.* Target evaded attack)", RegexOptions.Compiled);
        var regCrit = new Regex(@"(?<=\bSystem\b.* Critical hit - additional damage)", RegexOptions.Compiled);
        var regHeal = new Regex(@"(?<=\bSystem\b.* You healed yourself )\d+.\d", RegexOptions.Compiled);
        var regDmgrec = new Regex(@"(?<=\bSystem\b.* You take )\d+.\d", RegexOptions.Compiled);
        var dmgList = new List<float>(); //New list for damage values
        var healList = new List<float>(); //New list for heal values
        var dmgRecList = new List<float>(); //New list for damage received values
        using (var sr = new StreamReader(TextBox1.Text))
        {
            while (!sr.EndOfStream)
            {
                var line = sr.ReadLine();
                var match = regDmg.Match(line);
                var match2 = regMiss.Match(line);
                var match3 = regCrit.Match(line);
                var match4 = regHeal.Match(line);
                var match5 = regDmgrec.Match(line);
                if (match.Success)
                {
                    dmgList.Add(float.Parse(match.Value, CultureInfo.InvariantCulture));
                    totalShots++;
                    totalHits++;
                }
                if (match2.Success)
                {
                    misses++;
                    totalShots++;
                }
                if (match3.Success)
                {
                    crits++;
                }
                if (match4.Success)
                {
                    healList.Add(float.Parse(match4.Value, CultureInfo.InvariantCulture));
                }
                if (match5.Success)
                {
                    dmgRecList.Add(float.Parse(match5.Value, CultureInfo.InvariantCulture));
                }
            }
            TextBlockTotalShots.Text = totalShots.ToString(); //Show total shots
            TextBlockTotalDmg.Text = dmgList.Sum().ToString("0.##"); //Show total damage inflicted
            TextBlockTotalHits.Text = totalHits.ToString(); //Show total hits
            var hitChance = totalHits / totalShots; //Calculate hit chance
            TextBlockHitChance.Text = hitChance.ToString("P"); //Show hit chance
            TextBlockTotalMiss.Text = misses.ToString(); //Show total misses
            var missChance = misses / totalShots; //Calculate miss chance
            TextBlockMissChance.Text = missChance.ToString("P"); //Show miss chance
            TextBlockTotalCrits.Text = crits.ToString(); //Show total crits
            var critChance = crits / totalShots; //Calculate crit chance
            TextBlockCritChance.Text = critChance.ToString("P"); //Show crit chance
            TextBlockDmgHealed.Text = healList.Sum().ToString("F1"); //Show damage healed
            TextBlockDmgReceived.Text = dmgRecList.Sum().ToString("F1"); //Show damage received
            var pedSpent = dmgList.Sum() / (float.Parse(TextBoxEco.Text, CultureInfo.InvariantCulture) * 100); //Calculate ped spent
            TextBlockPedSpent.Text = pedSpent.ToString("0.##") + " PED"; //Estimated ped spent
        }
    }
And here's a sample text:
2014-09-02 23:07:22 [System] [] You inflicted 45.2 points of damage.
2014-09-02 23:07:23 [System] [] You inflicted 45.4 points of damage.
2014-09-02 23:07:24 [System] [] Target evaded attack.
2014-09-02 23:07:25 [System] [] You inflicted 48.4 points of damage.
2014-09-02 23:07:26 [System] [] You inflicted 48.6 points of damage.
2014-10-15 12:39:55 [System] [] Target evaded attack.
2014-10-15 12:39:58 [System] [] You inflicted 56.0 points of damage.
2014-10-15 12:39:59 [System] [] You inflicted 74.6 points of damage.
2014-10-15 12:40:02 [System] [] You inflicted 78.6 points of damage.
2014-10-15 12:40:04 [System] [] Target evaded attack.
2014-10-15 12:40:06 [System] [] You inflicted 66.9 points of damage.
2014-10-15 12:40:08 [System] [] You inflicted 76.2 points of damage.
2014-10-15 12:40:12 [System] [] You take 18.4 points of damage.
2014-10-15 12:40:14 [System] [] You inflicted 76.1 points of damage.
2014-10-15 12:40:17 [System] [] You inflicted 88.5 points of damage.
2014-10-15 12:40:19 [System] [] You inflicted 69.0 points of damage.
2014-10-19 05:56:30 [System] [] Critical hit - additional damage! You inflict 275.4 points of damage.
2014-10-19 05:59:29 [System] [] You inflicted 92.8 points of damage.
2014-10-19 05:59:31 [System] [] Critical hit - additional damage! You inflict 251.5 points of damage.
2014-10-19 05:59:35 [System] [] You take 59.4 points of damage.
2014-10-19 05:59:39 [System] [] You healed yourself 84.0 points.
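The reason the regex is so slow is that the "*" quantifier is greedy by default, so the ".*" first tries to match as much of the line as it can and then backtracks character by character until the rest of the pattern fits. With patterns like the ones above this is not catastrophic on its own, but the scan is repeated for every regex on every line, so the cost grows with both the line length and the number of patterns.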
Being more specific with your regular expressions, even if they become much longer, can make a world of difference in performance. The fewer characters you scan to determine the match, the faster your regexes will be.
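As a rough illustration of being more specific, the sketch below anchors the pattern at the start of the line and spells out the timestamp and channel prefix, so a line that cannot match is rejected after a few characters. It also captures the number in a group instead of using a lookbehind. The prefix layout is an assumption inferred from the sample log in the question, and the names SpecificPattern and TryParseDamage are made up for this example.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class SpecificPattern
{
    // Assumption: every relevant line starts with "yyyy-MM-dd HH:mm:ss [System] [] ",
    // as in the sample log. The anchor and the literal prefix let the engine
    // give up early on lines that cannot match.
    private static readonly Regex Damage = new Regex(
        @"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} \[System\] \[\] You inflicted (\d+\.\d+) points",
        RegexOptions.Compiled | RegexOptions.CultureInvariant);

    // Returns true and sets 'damage' when the line is a "You inflicted" message.
    public static bool TryParseDamage(string line, out float damage)
    {
        damage = 0f;
        Match m = Damage.Match(line);
        return m.Success &&
               float.TryParse(m.Groups[1].Value, NumberStyles.Float,
                              CultureInfo.InvariantCulture, out damage);
    }
}
```

If the log can contain other prefixes (for example text between the second pair of brackets), the literal part of the pattern would have to be loosened accordingly.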
My experience shows that most of the time developers focus on the correctness of a regex and leave its performance aside. Yet matching a string with a regex can be surprisingly slow. So slow that it can even freeze a JS app or take 100% of a server's CPU time, causing a denial of service (DoS).
Regular expression simplification removes unnecessary elements from a pattern, by analyzing what the pattern actually matches, in order to shorten it or make it more readable.
Regex has an interpreted mode and a compiled mode. The compiled mode takes longer to start, but is generally faster.
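Because compiled mode pays its cost when the Regex object is constructed, one common arrangement is to build each Regex once and reuse it, rather than constructing it inside the click handler on every run. A minimal sketch follows; the names LogPatterns and CountMisses are hypothetical, and the pattern is a simplified stand-in for the question's regMiss.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LogPatterns
{
    // Built once, when the class is first used, then reused by every call.
    public static readonly Regex Miss = new Regex(
        @"\[System\].* Target evaded attack", RegexOptions.Compiled);

    public static int CountMisses(IEnumerable<string> lines)
    {
        int misses = 0;
        foreach (string line in lines)
        {
            // IsMatch is enough here because no value needs to be extracted.
            if (Miss.IsMatch(line)) misses++;
        }
        return misses;
    }
}
```

Whether compiled mode pays off depends on how many lines are matched per construction, so it is worth measuring both variants on a real log file.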
Here are the issues as I see them.
The following is a single-pattern solution that works on a line-by-line basis. Its first task is to verify that [System] is contained on the line. If it is not, no matching is done on that line. If the line does contain [System], the pattern looks for specific keywords and optional values and places them into named capture groups as key/value pairs.
Once that is done, LINQ sums up the values found. Note that I have commented the pattern; RegexOptions.IgnorePatternWhitespace makes the regex parser skip the comments and the extra whitespace.
    // Requires: using System.Globalization; using System.Linq; using System.Text.RegularExpressions;
    string pattern = @"^           # Beginning of line to anchor it.
    (?=.+\[System\])               # Within the line a literal '[System]' has to occur.
    (?=.+                          # Somewhere within that line search for these keywords:
      (?<Action>                   # Named capture group 'Action' will hold a keyword.
         inflicte?d?               # If the line has inflict or inflicted put it into 'Action'
         |                         # or
         evaded                    # evaded
         | take                    # or take
         | yourself                # or yourself (heal)
      )
      (\s(?<Value>[\d.]+))?)       # If a value of points exists place it into 'Value'.
    .+                             # Match one or more characters to complete the line.
    $                              # End of line to stop on.";
    // IgnorePatternWhitespace lets us lay out and comment the pattern; it does not change what the pattern matches.
    var tokens =
        Regex.Matches(data, pattern, RegexOptions.IgnorePatternWhitespace | RegexOptions.Multiline)
             .OfType<Match>()
             .Select(mt => new
             {
                 Action = mt.Groups["Action"].Value,
                 // Parse with the invariant culture so '.' is always the decimal separator.
                 Value = mt.Groups["Value"].Success
                             ? double.Parse(mt.Groups["Value"].Value, CultureInfo.InvariantCulture)
                             : 0,
                 Count = 1,
             })
             .GroupBy(itm => itm.Action,   // Group the items by action name.
                      itm => itm,          // The items whose values are summed within each group.
                      (action, values) => new
                      {
                          Action = action,
                          Count = values.Sum(itm => itm.Count),
                          Total = values.Sum(itm => itm.Value)
                      });
The LINQ result contains one entity per action, which holds the sum of all values for that action and the number of times the action occurred. The data variable used above holds the sample log:
string data=@"2014-09-02 23:07:22 [System] [] You inflicted 45.2 points of damage.
2014-09-02 23:07:23 [System] [] You inflicted 45.4 points of damage.
2014-09-02 23:07:24 [System] [] Target evaded attack.
2014-09-02 23:07:25 [System] [] You inflicted 48.4 points of damage.
2014-09-02 23:07:26 [System] [] You inflicted 48.6 points of damage.
2014-10-15 12:39:55 [System] [] Target evaded attack.
2014-10-15 12:39:58 [System] [] You inflicted 56.0 points of damage.
2014-10-15 12:39:59 [System] [] You inflicted 74.6 points of damage.
2014-10-15 12:40:02 [System] [] You inflicted 78.6 points of damage.
2014-10-15 12:40:04 [System] [] Target evaded attack.
2014-10-15 12:40:06 [System] [] You inflicted 66.9 points of damage.
2014-10-15 12:40:08 [System] [] You inflicted 76.2 points of damage.
2014-10-15 12:40:12 [System] [] You take 18.4 points of damage.
2014-10-15 12:40:14 [System] [] You inflicted 76.1 points of damage.
2014-10-15 12:40:17 [System] [] You inflicted 88.5 points of damage.
2014-10-15 12:40:19 [System] [] You inflicted 69.0 points of damage.
2014-10-19 05:56:30 [System] [] Critical hit - additional damage! You inflict 275.4 points of damage.
2014-10-19 05:59:29 [System] [] You inflicted 92.8 points of damage.
2014-10-19 05:59:31 [System] [] Critical hit - additional damage! You inflict 251.5 points of damage.
2014-10-19 05:59:35 [System] [] You take 59.4 points of damage.
2014-10-19 05:59:39 [System] [] You healed yourself 84.0 points.";
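For a 10 MB file it may also help to stream the lines instead of holding the whole log in one string. The sketch below is not the method shown above; it is an alternative that runs one combined pattern per line and branches on the captured keyword. LogStats, ChatLogParser, Parse and ParseFile are hypothetical names, and the message wording is taken from the sample data. One assumption differs from the question's code: a critical hit line is counted as a hit and its damage is added to the total.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Text.RegularExpressions;

public sealed class LogStats
{
    public int Hits, Misses, Crits;
    public double Damage, Healed, Received;
}

public static class ChatLogParser
{
    // One pattern, one scan per line. Group "kind" tells which message it is,
    // group "value" holds the number when the message carries one.
    private static readonly Regex Line = new Regex(
        @"\[System\] \[\] (?<crit>Critical hit - additional damage! )?" +
        @"(?:(?<kind>You inflict(?:ed)?|You take|You healed yourself) (?<value>\d+(?:\.\d+)?)" +
        @"|(?<kind>Target evaded attack))",
        RegexOptions.Compiled);

    public static LogStats Parse(IEnumerable<string> lines)
    {
        var stats = new LogStats();
        foreach (string line in lines)
        {
            Match m = Line.Match(line);
            if (!m.Success) continue;

            if (m.Groups["crit"].Success) stats.Crits++;

            string kind = m.Groups["kind"].Value;
            if (kind == "Target evaded attack") { stats.Misses++; continue; }

            double value = double.Parse(m.Groups["value"].Value, CultureInfo.InvariantCulture);
            if (kind == "You take") stats.Received += value;
            else if (kind == "You healed yourself") stats.Healed += value;
            else { stats.Hits++; stats.Damage += value; }   // "You inflict" or "You inflicted"
        }
        return stats;
    }

    // File.ReadLines enumerates the file lazily, one line at a time.
    public static LogStats ParseFile(string path)
    {
        return Parse(File.ReadLines(path));
    }
}
```

Total shots would then be Hits + Misses, and the chances shown in the question's UI can be derived from the LogStats fields after the single pass.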