Most of the time, the performance of a histogramming package is not critical. For example, in the typical case of long-running batch jobs, the time spent in histogram operations is not that important. On the other hand, there are applications, such as online monitoring, where excellent performance is fundamental.
The HistOOgram package, which HTL replaces, was not optimised for performance, but rather was designed for maximum flexibility. In addition, early benchmarks of this package were performed on a pre-release and should be considered unrepresentative.
Experience shows that there is always a tradeoff between higher performance and maximum flexibility. Fortunately, a reasonable compromise can usually be found, since performance is normally required only in well defined areas of the code.
An often-heard rule of thumb states that in most cases an application spends 80% of its time in 20% of the code. Hence an efficient approach is to estimate where the critical sections of the code lie, identify the most appropriate algorithms (which are more difficult to change than code), and finally measure the performance with a proper tool, such as a code profiler. Code portability and maintainability should not be abandoned in the pursuit of performance: all are important issues that need to be addressed when producing a package.
The procedure described above is the one that was used for HTL. The critical portions of the code were identified first; unsurprisingly, these turned out to be the filling methods, which can be called millions of times. A technique to speed up filling by using templated classes was then identified (see for instance the Blitz++ libraries for a discussion about templates and C++ performance). Finally, once the package was working, the performance was measured and tuned using a simple code profiler.
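To illustrate how templated classes can speed up a filling method, the following is a minimal sketch and not the actual HTL code. It assumes the gain comes from making the binning scheme a template parameter, so the bin lookup is resolved at compile time and can be inlined into the fill call instead of going through a virtual function. The names EvenBinning and Histo1D, and their member functions, are hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical binning policy (not an HTL class): equal-width bins on [lo, hi).
class EvenBinning {
public:
    EvenBinning(std::size_t nbins, double lo, double hi)
        : nbins_(nbins), lo_(lo), hi_(hi), scale_(nbins / (hi - lo)) {
        assert(nbins > 0 && hi > lo);
    }

    std::size_t size() const { return nbins_; }

    // Returns the bin index for x, or size() if x is outside [lo, hi) or NaN.
    std::size_t index(double x) const {
        if (!(x >= lo_ && x < hi_)) return nbins_;
        std::size_t i = static_cast<std::size_t>((x - lo_) * scale_);
        return i < nbins_ ? i : nbins_ - 1;  // guard against rounding at the upper edge
    }

private:
    std::size_t nbins_;
    double lo_;
    double hi_;
    double scale_;
};

// Hypothetical histogram (not an HTL class). The binning type is known at
// compile time, so fill() contains no virtual call and can be fully inlined.
template <class Binning, class Weight = double>
class Histo1D {
public:
    explicit Histo1D(const Binning& binning)
        : binning_(binning), bins_(binning.size() + 1, Weight()) {}

    // The hot path: one index computation and one addition.
    void fill(double x, Weight w = Weight(1)) { bins_[binning_.index(x)] += w; }

    Weight bin(std::size_t i) const { return bins_[i]; }

    // Out-of-range entries are accumulated in one extra slot at the end.
    Weight overflow() const { return bins_[binning_.size()]; }

private:
    Binning binning_;
    std::vector<Weight> bins_;
};
```

With these definitions, a histogram would be declared as Histo1D<EvenBinning> h(EvenBinning(100, 0.0, 1.0)) and filled with h.fill(x) inside the event loop. Whether this design actually outperforms a virtual-function design in a given application is something to confirm with a profiler, as described above.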
The results of a simple comparison with HBOOK are presented below. These are not intended as a complete test but rather as a benchmark reference. The source code used for the benchmark is available on request.