
Is it possible to use BenchmarkDotNet to "fail" a CI build if performance has regressed too much?

I have unit tests. If one of them fails, my build fails.

I would like to apply the same principle to performance. I have a series of microbenchmarks for several hot paths through a library. Empirically, slowdowns in these areas have a disproportionate effect on the library's overall performance.

It would be nice to have some concept of a "performance build" that fails when performance regresses too much.

I had considered hard-coding thresholds that must not be exceeded. Something like:

Assert.IsTrue(hotPathTestResult.TotalTime <= threshold);

but pegging that to an absolute value is hardware- and environment-dependent, and therefore brittle.

Has anyone implemented something like this? What does Microsoft do for Kestrel?

rianjs asked May 29 '18 14:05


1 Answer

I would not do this via unit tests; that's the wrong place. Do it in a build/test script instead. You gain more flexibility and can do a lot more that may be necessary.

A rough outline would be:

  1. build
  2. run unit tests
  3. run integration tests
  4. run benchmarks
  5. upload benchmark results to a results store (e.g. a commercial product such as "PowerBI")
  6. compare current results against previous results
  7. upload artefacts / deploy packages

In step 6, if there is a regression, you can fail the build by returning a non-zero exit code.
BenchmarkDotNet can export results as JSON (among other formats), so you can take advantage of that.
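
As a rough sketch of step 6, the script below compares two exported JSON reports and yields a non-zero exit code when a benchmark has slowed down. It is written in Python only because the check lives in a build script; any scripting language would do. The field names ("Benchmarks", "FullName", "Statistics", "Mean") are my assumption about the layout of BenchmarkDotNet's full JSON export, so verify them against a real report. The 10 % tolerance and the function names are hypothetical.

```python
import json


def load_means(report_path):
    """Map each benchmark's full name to its mean time.

    Assumes a top-level "Benchmarks" array whose items carry "FullName"
    and "Statistics"/"Mean" (assumed layout; check your own report).
    """
    with open(report_path, encoding="utf-8") as f:
        report = json.load(f)
    return {
        b["FullName"]: b["Statistics"]["Mean"]
        for b in report["Benchmarks"]
        if b.get("Statistics")  # skip entries that have no statistics
    }


def find_regressions(current, baseline, tolerance=0.10):
    """Return {name: (baseline_mean, current_mean)} for slowed-down benchmarks.

    A benchmark counts as regressed when its current mean exceeds the
    baseline mean by more than `tolerance` (0.10 = 10 %, a hypothetical default).
    Benchmarks missing from `current` are ignored.
    """
    regressions = {}
    for name, base_mean in baseline.items():
        cur_mean = current.get(name)
        if cur_mean is not None and cur_mean > base_mean * (1 + tolerance):
            regressions[name] = (base_mean, cur_mean)
    return regressions


def main(argv):
    """Expects [current_report, baseline_report]; returns the exit code."""
    current_path, baseline_path = argv
    regressions = find_regressions(load_means(current_path),
                                   load_means(baseline_path))
    for name, (base, cur) in sorted(regressions.items()):
        print(f"REGRESSION {name}: {base:.1f} -> {cur:.1f}")
    return 1 if regressions else 0
```

To wire this into CI, end the script with `sys.exit(main(sys.argv[1:]))` (after `import sys`) and pass the current and baseline report paths as arguments. Comparing relative to a baseline, rather than against an absolute threshold, avoids hard-coding numbers that only hold on one machine.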

The hard part is determining whether a regression has occurred. Especially on CI builds (with containers and the like), different benchmark runs may execute on different hardware, so the results are not 1:1 comparable, and you have to take this into account.
Personally, I don't let the script fail on a possible regression; instead it sends a notification, so I can manually check whether it's a true regression or just caused by different hardware.

A regression is flagged simply when the current results are worse than the median of the last 5 results. Of course this is a rough method, but it is effective, and you can tune it to your needs.
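
A minimal sketch of that median check, in Python, assuming "worse" means a larger timing and that the earlier results for one benchmark are kept as a list, oldest first. The function name and the optional `tolerance` slack are hypothetical additions for illustration, not part of the method as described.

```python
from statistics import median


def is_regression(current, history, window=5, tolerance=0.0):
    """Flag `current` if it is slower than the median of the last `window` runs.

    `history` holds earlier timings for the same benchmark, oldest first.
    `tolerance` is an optional relative slack (0.05 = 5 %) to absorb noise;
    it is an addition of this sketch.
    """
    recent = history[-window:]
    if not recent:
        return False  # nothing to compare against yet
    return current > median(recent) * (1 + tolerance)


# One noisy run (250) barely moves the median of the last five results.
history = [100, 102, 98, 250, 101, 99]
print(is_regression(105, history))                  # compared against median 101
print(is_regression(105, history, tolerance=0.05))  # same, with 5 % slack
```

Using the median rather than the mean means a single outlier run, for example one on a slower build agent, has little effect on the reference value.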

gfoidl answered Nov 15 '22 13:11