 

How do I Unit Test for relative performance?

Given that I don't know at deployment time what kind of system my code will be running on, how do I write a performance benchmark that uses the potential of the system as its yardstick?

What I mean is that if a system is capable of running the piece of code 1000 times per second, I'd like the test to ensure that it comes as close to 1000 as possible. If it can only do 500, then that's the rate I'd like to compare it against.

If it helps in making the answer more specific, I'm using JUnit4.

Thank you.

Asked Feb 20 '26 by Allain Lalonde


2 Answers

I would not use unit testing for performance tests for a couple of reasons.

First, unit tests should not have dependencies on the surrounding system/code. Performance tests depend heavily on the hardware and OS, so it is hard to get uniform measurements that will be usable on developer workstations, build servers, etc.

Second, unit tests should execute really fast. When you do performance tests, you usually want fairly large data sets and a number of repeated runs, in order to average the numbers and eliminate overhead. Both of these work against the idea of fast tests.
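To illustrate that second point, here is a minimal sketch of what a throughput measurement tends to look like once you add warmup and repeated runs. The class and method names, the placeholder workload, and the run counts are all illustrative, not from the answer:

```java
// Sketch: measuring ops/second with warmup and averaging.
// The repetition that makes the number trustworthy is exactly
// what makes this too slow for a normal unit test suite.
public class ThroughputProbe {

    // Placeholder for the code under test.
    static long workload(long n) {
        long acc = 0;
        for (long i = 0; i < n; i++) acc += i * 31;
        return acc;
    }

    // Operations per second, averaged over measuredRuns,
    // after warmupRuns to let the JIT settle.
    static double opsPerSecond(int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) workload(100_000);
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) workload(100_000);
        long elapsedNanos = System.nanoTime() - start;
        return measuredRuns / (elapsedNanos / 1e9);
    }

    public static void main(String[] args) {
        System.out.println("ops/sec ~ " + opsPerSecond(50, 200));
    }
}
```

Even this toy version runs the workload hundreds of times before reporting a single number, which is why such measurements sit uneasily next to millisecond-scale unit tests.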

Answered Feb 22 '26 by Brian Rasmussen


A test means you have a pass/fail threshold. For a performance test, this means too slow and you fail, fast enough and you pass. If you fail, you start doing rework.

If you can't fail, then you're benchmarking, not actually testing.
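One way to give such a test a pass/fail threshold without hard-coding an absolute number is to calibrate against the same machine first. This is only a sketch of that idea under stated assumptions: the class name, the 50% threshold, and the reference loop are invented for illustration, and in JUnit4 the final check would live in a `@Test` method using `assertTrue`:

```java
// Sketch: pass/fail relative to what this machine can do.
// Calibrate a baseline, then fail if the measured code falls
// too far below it. All names and constants are illustrative.
public class RelativePerformanceCheck {

    // Reference workload used to calibrate the machine's capability.
    static double calibrateOpsPerSec() {
        int runs = 200;
        long acc = 0;
        long start = System.nanoTime();
        for (int r = 0; r < runs; r++)
            for (int i = 0; i < 100_000; i++) acc += i;
        long elapsedNanos = System.nanoTime() - start;
        if (acc == 42) System.out.print(""); // keep acc from being optimized away
        return runs / (elapsedNanos / 1e9);
    }

    // Stand-in for measuring the real code under test; here it is
    // the same loop, so it should land near the baseline.
    static double measureOpsPerSec() {
        return calibrateOpsPerSec();
    }

    public static void main(String[] args) {
        double baseline = calibrateOpsPerSec();
        double actual = measureOpsPerSec();
        // Pass/fail threshold: fail below 50% of this machine's baseline.
        if (actual < 0.5 * baseline)
            throw new AssertionError("too slow relative to this machine");
        System.out.println("pass: within 50% of calibrated baseline");
    }
}
```

The threshold (50% here) is the part you have to choose and defend; without it you are back to benchmarking rather than testing.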

When you talk about "system is capable of running" you have to define "capable". You could use any of a large number of hardware performance benchmarks. Whetstone, Dhrystone, etc., are popular. Or, perhaps you have a database-intensive application, then you might want to look at the TPC benchmark. Or, perhaps you have a network-intensive application and want to use netperf. Or a GUI-intensive application and want to use some kind of graphics benchmark.

Any of these give you some kind of "capability" measurement. Pick one or more. They're all good. Equally debatable. Equally biased toward your competitor and away from you.

Once you've run the benchmark, you can then run your software and see what the system actually does.

You could -- if you gather enough data -- establish some correlation between some benchmark numbers and your performance numbers. You'll see all kinds of variation based on workload, hardware configuration, OS version, virtual machine, DB server, etc.

With enough data from enough boxes with enough different configurations, you will eventually be able to develop a performance model that says "given this hardware, software, tuning parameters and configuration, I expect my software to do [X] transactions per second." That's a solid definition of "capable".
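Once fitted, such a model can be as simple as a linear relationship between a hardware benchmark score and expected throughput. The coefficients below are made-up placeholders standing in for values you would fit from your own collected data:

```java
// Sketch: a hypothetical linear capability model.
// expected ops/sec = slope * benchmarkScore + intercept,
// with slope/intercept fitted from past measurements (invented here).
public class CapabilityModel {

    static double expectedOpsPerSec(double benchmarkScore) {
        double slope = 12.5;    // illustrative fitted coefficient
        double intercept = 40.0; // illustrative fitted coefficient
        return slope * benchmarkScore + intercept;
    }

    public static void main(String[] args) {
        double expected = expectedOpsPerSec(80.0); // e.g. a Dhrystone-style score
        double measured = 950.0;                   // throughput observed on this box
        double ratio = measured / expected;
        System.out.printf("capability ratio = %.2f%n", ratio);
    }
}
```

A ratio near 1.0 means the software is doing what the model says this box is capable of; a ratio well below 1.0 points at the software rather than the hardware.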

Once you have that model, you can then compare your software against the capability number. Until you have a very complete model, you don't really know which systems are even capable of running the piece of code 1000 times per second.

Answered Feb 21 '26 by S.Lott