Using Review Resources Wisely: Avoiding "Benchmark Bogosity"

When you shop for components and systems, you can take advantage of numerous resources that exist to help you assess hardware, manufacturers and vendors. I discuss these in some detail elsewhere in the Guide. Many of the research resources that evaluate hardware components and systems include benchmarks as part of their analysis. A benchmark is a measurement intended to indicate the performance level of a component, subsystem or system in a standardized way, so that similar hardware models can be compared and you can decide what to buy.

Benchmarks are a useful indicator of performance--when used properly. When used improperly they are worse than useless, because they become deceptive and can lead to bad decision-making. When you see benchmarks stated in an online review, or mentioned in a manufacturer's literature, be sure to take them in the proper context. Always remember that many benchmark numbers--and the conclusions some reviewers and manufacturers draw from them--are highly suspect.

Here are some specific tips to keep in mind whenever you encounter hardware benchmarks of any sort in a review or in product literature:

System Benchmarks: Benchmarks are rarely run on entire systems, because the results vary based on the exact components in the system, and because the numbers aren't readily comparable between different systems. Be wary of any company that tries to boil the performance of an entire PC down to a single number. Unless you have that exact model the number doesn't tell you much.
Component Benchmarks: Most benchmarks are run on specific components, such as CPUs, hard drives and so on. These usually have some meaning; if CPU #1 benchmarks faster than CPU #2, then changing from #2 to #1 will likely result in some sort of performance improvement in your machine. However, the improvement can vary greatly, depending on the other hardware in your machine. It is impossible to completely isolate the performance of one component from that of the others. If the system that the test was done on is very different from yours, you should not expect to see the same performance difference that the reviewers did when they compared #1 and #2.
What Is Being Measured: Find out all the details about any benchmark before deciding to take the results it gives seriously. What exactly is being measured, and how?
Relevance: A particular component may produce a benchmark that is 20% higher than the benchmark value of another component, but how relevant is that result? Each component only contributes to one aspect of overall performance, so these numbers do not translate directly in the real world. A component that is 20% faster than another will not result in the PC as a whole being 20% faster. It could be only 5% faster. In some cases it will make no noticeable difference at all.
Significance: You'll occasionally see a bunch of components of the same type compared, and the benchmark scores of the units come out very close together. This commonly happens with motherboards that are put into systems that otherwise use identical hardware: there might be four boards benchmarked, with the fastest board scoring no more than 5% higher than the slowest. Sometimes a graph is displayed with the zero point cut off to "magnify" the relative differences in scores between the different devices. In such a case, even if the benchmark scores are valid, they are not significant. A general rule of thumb says that most users won't even notice a difference in overall performance of less than about 10%. If motherboard A is 2% faster than motherboard B, then the two can be considered roughly equivalent in terms of performance. (Heck, the margin of error in the benchmark program itself is probably higher than 2%.)
Proper Emphasis: Some reviews of hardware seem to focus almost entirely on performance benchmarks, providing page after page after page of graphs and tables. This represents plenty of data, but is any of it useful information? Benchmarks are often emphasized because they are easy to do: it is not hard to run a benchmark program and create tables from the results. It is much more difficult to assess what those results mean in a way that makes sense, or to evaluate the quality, reliability and other characteristics of a product.
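
The point made under Relevance can be put into rough numbers. The sketch below uses Amdahl's law, which is a swapped-in model and not something the reviews themselves report. It assumes that "20% faster" means the component finishes its share of the work 1.2 times as quickly, and the share of total run time attributed to the component is a made-up illustrative value, not a measurement.

```python
def overall_speedup(fraction_affected: float, component_speedup: float) -> float:
    """Amdahl's law: overall speedup when only part of the work gets faster.

    fraction_affected  -- share of total run time spent in the component (0..1)
    component_speedup  -- how many times faster the component is (1.2 = "20% faster")
    """
    if not 0.0 <= fraction_affected <= 1.0:
        raise ValueError("fraction_affected must be between 0 and 1")
    if component_speedup <= 0:
        raise ValueError("component_speedup must be positive")
    remaining = 1.0 - fraction_affected
    return 1.0 / (remaining + fraction_affected / component_speedup)


# Hypothetical shares of run time that depend on the upgraded component.
for share in (0.10, 0.30, 0.50, 1.00):
    gain_pct = (overall_speedup(share, 1.20) - 1.0) * 100
    print(f"component is {share:.0%} of run time: whole system about {gain_pct:.1f}% faster")
```

Only when the component accounts for all of the run time does the system as a whole gain the full 20%; with smaller shares the gain shrinks quickly, which is why a component score rarely carries over directly to the whole PC.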

By far the most important thing to look for in a review that uses benchmarks is the methodology used by the reviewers. Are they using the benchmark properly? Are they running each benchmark multiple times and averaging the results to reduce the chances of spurious numbers? Are they remembering to change only one variable at a time? This last one is important: the only way to even have a chance at comparing the performance of two different components of the same type is to use them in the identical system and test them under identical conditions.
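
As a minimal sketch of what sound methodology looks like in practice, the code below averages several runs per component and checks whether the gap between two components exceeds the run-to-run variation. The scores are invented for illustration, the function names are hypothetical, and comparing the gap against the standard deviation is a crude heuristic rather than a formal statistical test. The 10% threshold is the rule of thumb quoted earlier.

```python
from statistics import mean, stdev


def summarize(scores):
    """Return the average score and the run-to-run spread as a percent of it."""
    avg = mean(scores)
    spread_pct = stdev(scores) / avg * 100
    return avg, spread_pct


def compare(scores_a, scores_b, noticeable_pct=10.0):
    """Compare two sets of repeated runs taken on otherwise identical systems."""
    avg_a, spread_a = summarize(scores_a)
    avg_b, spread_b = summarize(scores_b)
    diff_pct = abs(avg_a - avg_b) / min(avg_a, avg_b) * 100
    noise_pct = max(spread_a, spread_b)
    if diff_pct <= noise_pct:
        verdict = "within run-to-run noise"
    elif diff_pct < noticeable_pct:
        verdict = "measurable but probably not noticeable"
    else:
        verdict = "likely noticeable"
    return diff_pct, verdict


# Made-up scores: five runs each for two motherboards in the same test system.
board_a = [1012, 998, 1005, 1021, 990]
board_b = [1030, 1018, 1041, 1009, 1027]

diff_pct, verdict = compare(board_a, board_b)
print(f"difference between averages: {diff_pct:.1f}% ({verdict})")
```

A review that reports only a single run per component gives you no way to make this kind of check, which is one reason to look for the number of runs in the methodology section.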

The bottom line? Take benchmarks with a grain of salt. Use them, but keep them in their place, and don't succumb to the temptation to assign them more value than they are worth.
