I reviewed a colleagues code and told him to reorder the boolean comparisons in the following Linq Any
predicate for performance reasons. So given
public class JobResult
{
public JobResult();
public string Id{ get; set; }
public StatusEnum Status{ get; set; }
public string JobType{ get; set; }
}
and
IList<JobResult> jobsList = _jobRepository.FetchJobs()
I suggested changing the following:
//Exit if there is already a job of type "PurgeData" running
if (jobsList.Any(job => job.Status == JobStatus.Running //1
&& job.Id != currentJobId //2
&& job.JobType == "PurgeData")) //3
return false;
to become
//Exit if there is already a job of type "PurgeData" running
if (jobsList.Any(job => job.JobType == "PurgeData" //3
&& job.Status == JobStatus.Running //1
&& job.Id != currentJobId)) //2
return false;
My reasoning was that most of the jobs in jobsList
fail the test for JobType
, only a few will fail the test for Running
and only one will fail the test for Id
. If a match fails there is no point evaluating the others and because of sequence points this will not occur.
My three part question is : Is this true, is it provably true and is there a better explanation I can give to my colleague for why reordering is a good idea?
Order of evaluation refers to the operator precedence and associativity rules according to which mathematical expressions are evaluated.
Expression evaluation in C is used to determine the order of the operators to calculate the accurate output. Arithmetic, Relational, Logical, and Conditional are expression evaluations in C.
There is no concept of left-to-right or right-to-left evaluation in C, which is not to be confused with left-to-right and right-to-left associativity of operators: the expression f1() + f2() + f3() is parsed as (f1() + f2()) + f3() due to left-to-right associativity of operator+, but the function call to f3 may be ...
My reasoning was that most of the jobs in jobsList fail the test for JobType, only a few will fail the test for Running and only one will fail the test for Id. If a match fails there is no point evaluating the others and because of sequence points this will not occur. Is this true?
Is true that the second and third predicates will not be evaluated when the first is false? Yes. Your reasoning is correct there.
Is it true that avoiding evaluation of the second and third predicates when the first is false is a clear performance win? Not necessarily, for two reasons.
First, not all comparisons are equally expensive. Comparing the strings "PurgeData" and "PurgeDatz" requires comparing eight characters before bailing out; comparing the integers is cheaper. It may be in the average case cheaper to avoid the string comparison even though it is more likely to be false.
Second, remember, avoiding running code eliminates the cost of that code, but you had to write code in order to test whether the other code should be avoided. A test is cheap but it is not free! There are situations in which avoiding code is actually more expensive than simply running it.
See my recent article on the subject:
http://ericlippert.com/2015/11/02/when-would-you-use-on-a-bool/
is there a better explanation I can give to my colleague for why reordering is a good idea?
Yes. You can set a performance metric and a realistic, important customer-focused performance goal, you can demonstrate empirically that the code one way fails to meet your goal when measured according to the metric, and you can show empirically that the code meets your goal when you write it another way.
If you don't do that then what you are describing is a difference you can't measure and no one cares about; your colleague would be quite right to tell you to stop wasting time changing working code in order to make a difference you can't measure that no one cares about.
Personally I wouldn't mind this too much. The overhead of using LINQ is much more than the comparison itself. And if the first predicate evaluates to false, all others are ignored, so testing the least likely first is a good idea.
Going on the expensiveness of evaluation, enums are integers, so comparing an enum is cheaper than evaluating a string as a whole. The second evaluation is an integer too, so the less than comparing a string. I thought string
compared length first, so that would be an integer comparison too, comparing each character costs more. So if the string is equally long as where it is compared to, the evaluation of string is heavier than integers. Else it doesn't matter too much.
My reasoning was that most of the jobs in jobsList fail the test for JobType, only a few will fail the test for Running and only one will fail the test for Id. If a match fails there is no point evaluating the others and because of sequence points this will not occur.
Yes, the ||
and &&
operators short-circut in C#. Meaning, they will not keep evaluating other expressions if the condition has already been met. Given that JobType
evaluates to false
, the other predicates will not be calculated. Note that your code may have additional overhead which outweigh the re-ordering of those predicates, such as comparison of an int
vs a string
(where the latter will most likely to be more expensive).
This does fall into the category of micro-optimizations, and other than explaining what short-circuiting is to your colleague, I suggest you actually test this code to see if it provides any performance benefit at all.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With