difference between baseline and benchmark in performance of an application

2 Answers

In scientific research, a benchmark is a kind of test and a baseline is a kind of result.

Let's look at an example of a benchmark test: we might take a collection of 5,000 sentences in English and use the lab's four-core Dell machine to translate them into Spanish using various algorithms. Because we've kept the data and the machine constant, we can meaningfully compare the time taken by the different algorithms to complete the task, as well as their relative accuracy (measured against gold-standard human translations).
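As a rough sketch of what such a benchmark harness could look like in Python: every algorithm is run on the same fixed inputs, and both the time taken and the mean score are recorded. The names `run_benchmark` and `exact_match`, the toy sentences, and the two stand-in "algorithms" are all made up for illustration; a real translation benchmark would use a proper metric rather than exact match.

```python
import time


def exact_match(output, reference):
    """Toy scoring function (an assumption): 1.0 if identical, else 0.0."""
    return 1.0 if output == reference else 0.0


def run_benchmark(algorithms, sentences, references, score):
    """Run each algorithm on the same data; record wall time and mean score."""
    results = {}
    for name, translate in algorithms.items():
        start = time.perf_counter()
        outputs = [translate(s) for s in sentences]
        elapsed = time.perf_counter() - start
        total = sum(score(o, r) for o, r in zip(outputs, references))
        results[name] = {"seconds": elapsed, "accuracy": total / len(sentences)}
    return results


# Hypothetical stand-ins for the 5,000 sentences and their gold translations.
sentences = ["the cat", "the dog"]
references = ["el gato", "el perro"]

# Hypothetical stand-ins for the algorithms being compared.
algorithms = {
    "identity": lambda s: s,
    "lookup": lambda s: {"the cat": "el gato"}.get(s, s),
}

results = run_benchmark(algorithms, sentences, references, exact_match)
for name, r in results.items():
    print(name, r["accuracy"], r["seconds"])
```

Because the data, the scoring function, and the machine stay fixed, any difference in the numbers can be attributed to the algorithms themselves.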

To find a baseline for this benchmark test, we might write a very naive translation algorithm that just picks the most common translation for each individual word, with no regard for context. Measuring the accuracy of this algorithm against our human translations gives us an idea of the minimum score - the baseline - that the others must beat, and gives us a feel for what level of accuracy counts as "good".
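A minimal sketch of such a naive baseline, assuming we already have a list of word-level (English, Spanish) pairs to count from. The function names, the toy word pairs, and the position-by-position `word_accuracy` metric are illustrative assumptions, not a standard translation metric.

```python
from collections import Counter, defaultdict


def train_word_baseline(aligned_pairs):
    """Map each English word to its most frequently seen Spanish translation."""
    counts = defaultdict(Counter)
    for english, spanish in aligned_pairs:
        counts[english][spanish] += 1
    return {en: c.most_common(1)[0][0] for en, c in counts.items()}


def baseline_translate(table, sentence):
    """Translate word by word, ignoring context; unknown words pass through."""
    return " ".join(table.get(w, w) for w in sentence.lower().split())


def word_accuracy(output, reference):
    """Fraction of positions where the words agree (a toy metric)."""
    out, ref = output.split(), reference.split()
    if not out or not ref:
        return 0.0
    matches = sum(o == r for o, r in zip(out, ref))
    return matches / max(len(out), len(ref))


# Hypothetical training pairs: "the" is seen as "el" twice and "la" once.
pairs = [("the", "el"), ("the", "el"), ("the", "la"),
         ("cat", "gato"), ("house", "casa")]
table = train_word_baseline(pairs)

test_sentences = ["The cat", "The house"]
gold = ["el gato", "la casa"]

scores = [word_accuracy(baseline_translate(table, s), g)
          for s, g in zip(test_sentences, gold)]
baseline_score = sum(scores) / len(scores)
print(baseline_score)
```

The second sentence comes out as "el casa" because the baseline always picks "el" for "the". That is exactly the kind of context error a better algorithm should avoid, and why this score works as a floor.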

At the other end of the scale from a baseline, an upper bound is a useful yardstick too. In the translation example, we might find the upper bound by measuring the accuracy of one of our human translations with respect to the others. This gives us an idea of how high a score on our "accuracy" measure can get before it hits the ceiling of human disagreement. We expect our machine translation algorithms to perform at a level between the baseline and the upper bound.
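One way to estimate that upper bound, sketched in Python under a few assumptions: we have several human translations of the same sentences, we hold one translator out, and we score that translator against each of the others with the same metric used for the machines. The `word_accuracy` metric and the toy translations here are invented for illustration.

```python
def word_accuracy(output, reference):
    """Fraction of positions where the words agree (a toy metric)."""
    out, ref = output.split(), reference.split()
    if not out or not ref:
        return 0.0
    matches = sum(o == r for o, r in zip(out, ref))
    return matches / max(len(out), len(ref))


def human_upper_bound(human_translations, held_out, score):
    """Score one translator against all the others, averaged over sentences."""
    candidate = human_translations[held_out]
    others = [t for i, t in enumerate(human_translations) if i != held_out]
    per_sentence = []
    for idx, cand in enumerate(candidate):
        agreement = sum(score(cand, other[idx]) for other in others)
        per_sentence.append(agreement / len(others))
    return sum(per_sentence) / len(per_sentence)


# Hypothetical data: three translators, two sentences each, same order.
humans = [
    ["el gato duerme", "la casa es grande"],
    ["el gato duerme", "la casa es enorme"],
    ["el gato se duerme", "la casa es grande"],
]

upper = human_upper_bound(humans, held_out=0, score=word_accuracy)
print(upper)
```

Even though all three toy translations are acceptable, the held-out translator does not score 1.0 against the others, so a machine score close to this value is about as good as the measure can show.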

185

answered Oct 23 '22 20:10

Tommy Herbert

Interesting definitions from SPR (Software Productivity Research):

Baseline and benchmark are similar but distinct activities.

Figuratively, a baseline is a "line in the sand" for an organization whereby it measures important performance characteristics for future reference.

This is not necessarily a "good" state, just a reference.

A benchmark is best understood by way of the original derivation of the word itself:

Tradesmen engaged in repetitive tasks, such as sawing lumber to consistent lengths, often placed notches on their workbenches to indicate placement of boards prior to cutting. Literally, a benchmark became a standard for comparison and an indicator of past success.

Basically:

A baseline is about identifying a significant state: a set of numbers that has met an approval status and is publicly recognized.
A benchmark is about assessing the relative performance of an application.

answered Oct 23 '22 18:10

VonC


Tags:

performance

definition

gagneet
