
Ideas and tips for temporal unit-testing?

Tags:

unit-testing

Has anyone done temporal unit-testing?

I'm not even sure if such lingo has been coined or not, but the point is to test that operations perform within temporal limits. I have a few algorithms and I want to test that their execution time increases as expected, and I guess similar testing could be used for IO and whatnot, kind of like a test_timeout or something.

However, because the hardware affects the speed of execution, this doesn't seem trivial. So I was wondering if anyone has tried this sort of thing before, and if so, whether they could share their experience.
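To make it concrete, here's a rough sketch of the kind of check I'm imagining (the algorithm is just a placeholder std::sort, and the input sizes and the 2.5x slack factor are numbers I made up): rather than asserting an absolute time limit, it times the algorithm at two input sizes and checks that the growth ratio is roughly what the complexity predicts, since a ratio should be a bit less sensitive to the hardware than a raw time.

    #include <algorithm>
    #include <cassert>
    #include <chrono>
    #include <cstddef>
    #include <cstdio>
    #include <vector>

    // Placeholder for one of my algorithms; here it is just a std::sort,
    // purely so the sketch compiles and runs on its own.
    static void algorithm_under_test(std::vector<int>& data)
    {
        std::sort(data.begin(), data.end());
    }

    // Time one run of the algorithm on an input of size n, in seconds.
    static double time_once(std::size_t n)
    {
        std::vector<int> data(n);
        for (std::size_t i = 0; i < n; ++i)
            data[i] = static_cast<int>(n - i);   // reverse-sorted input

        const auto start = std::chrono::steady_clock::now();
        algorithm_under_test(data);
        const auto stop = std::chrono::steady_clock::now();
        return std::chrono::duration<double>(stop - start).count();
    }

    int main()
    {
        // Compare growth between two sizes instead of asserting an
        // absolute time, so the check depends less on the machine.
        const double t_small = time_once(100000);
        const double t_large = time_once(200000);

        std::printf("t(100k)=%.6fs  t(200k)=%.6fs  ratio=%.2f\n",
                    t_small, t_large, t_large / t_small);

        // For an O(n log n) algorithm, doubling n should roughly double
        // the time; 2.5 is an arbitrary slack factor I'd have to tune.
        assert(t_large / t_small < 2.5);
        return 0;
    }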

Thanks

Edit: Trying to compile a list of stuff that needs to be taken care of in this kind of situation

asked Jan 27 '09 by Robert Gould


1 Answer

Just some notes from my experience... We care about the performance of many of our components and have a very unittest-like framework to exercise and time them (with hindsight, we should have just used CppUnit or boost::test like we do for unittests). We call these "component benchmarks" rather than unittests.

  • We don't specify an upper limit on time and then pass/fail... we just log the times (this is partly due to customer reluctance to actually give hard performance requirements, despite performance being something they care about a lot!). We have tried pass/fail in the past and had a bad experience, especially on developer machines: too many false alarms because an email arrived or something was indexing in the background.
  • Developers working on optimisation can just work on getting the relevant benchmark times down without having to build a whole system (much the same as unittests let you focus on one bit of the codebase).
  • Most benchmarks test several iterations of something. Lazy creation of resources can mean the first use of a component has considerably more "setup time" associated with it. We log "1st", "average subsequent" and "average all" times (see the sketch after this list). Make sure you understand the cause of any significant differences between these. In some cases we benchmark setup times explicitly as an individual case.
  • Ought to be obvious, but: just time the code you actually care about, not the test environment setup time!
  • For benchmarks you end up testing "real" cases a lot more than you do in unittests, so test setup and test runtime tend to be a lot longer.
  • We have an autotest machine run all the benchmarks nightly and post a log of all the results. In theory we could graph it or have it flag components which have fallen below target performance. In practice we haven't got around to setting anything like that up.
  • You do want such an autotest machine to be completely free of other duties (e.g if it's also your SVN server, someone doing a big checkout will make it look like you've had a huge performance regression).
  • Think about other scalar quantities you might want to benchmark besides time and plan to support them from the start. For example, "compression ratio achieved", "Skynet AI IQ"...
  • Don't let people do any analysis of benchmark data from sub-minimum-spec hardware. I've seen time wasted because of a design decision made as a result of a benchmark run on someone's junk laptop, when a run on the target platform (a high-end server) would have indicated something completely different!
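Our actual framework isn't something I can post, but here's a rough C++ sketch of the shape of harness the points above imply (all the names here, BenchmarkResult, run_benchmark and log_result, are made up for illustration): it times N iterations of a callable, logs "1st" / "average subsequent" / "average all" instead of asserting a pass/fail limit, and leaves room for extra scalar quantities like compression ratio.

    #include <chrono>
    #include <cstddef>
    #include <cstdio>
    #include <functional>
    #include <map>
    #include <string>
    #include <vector>

    // Result of one benchmark: raw per-iteration times plus any extra
    // scalar quantities (compression ratio, etc.) the component reports.
    struct BenchmarkResult
    {
        std::vector<double> times;              // seconds per iteration
        std::map<std::string, double> scalars;  // e.g. "compression ratio"
    };

    // Run 'iterations' timed calls of 'body'. Any expensive environment
    // setup belongs in the caller, before this function, so it never
    // gets timed (per the point about timing only the code you care about).
    BenchmarkResult run_benchmark(const std::function<void()>& body,
                                  int iterations)
    {
        BenchmarkResult result;
        for (int i = 0; i < iterations; ++i)
        {
            const auto start = std::chrono::steady_clock::now();
            body();
            const auto stop = std::chrono::steady_clock::now();
            result.times.push_back(
                std::chrono::duration<double>(stop - start).count());
        }
        return result;
    }

    // Log "1st", "average subsequent" and "average all" rather than
    // asserting an upper limit; a nightly job can collect these lines.
    void log_result(const std::string& name, const BenchmarkResult& r)
    {
        double all = 0.0, subsequent = 0.0;
        for (double t : r.times) all += t;
        for (std::size_t i = 1; i < r.times.size(); ++i) subsequent += r.times[i];

        std::printf("%s: 1st=%.6fs avg-subsequent=%.6fs avg-all=%.6fs\n",
                    name.c_str(),
                    r.times.front(),
                    r.times.size() > 1 ? subsequent / (r.times.size() - 1) : 0.0,
                    all / r.times.size());

        for (const auto& kv : r.scalars)
            std::printf("%s: %s=%.3f\n", name.c_str(), kv.first.c_str(), kv.second);
    }

    int main()
    {
        // Hypothetical component: just burn some CPU so the sketch runs.
        auto result = run_benchmark([] {
            volatile double x = 0.0;
            for (int i = 0; i < 1000000; ++i) x += i * 0.5;
        }, 10);
        result.scalars["compression ratio"] = 2.37;  // made-up extra scalar
        log_result("my_component", result);
        return 0;
    }

In practice you'd dump these log lines somewhere the nightly autotest machine can archive (and, ideally, graph).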
answered Nov 16 '22 by timday