Ideally, the hierarchical structure of the data storage should be completely transparent to the average user. The Anaphe Team has therefore done its best to keep the impact of the database on the C++ user code to a minimum. However, it is best that users of the modules are aware of some basic principles, and of how they relate to the experimental data model used by the various analysis programs.
Let us have a look at Figure 4.1. It shows the storage hierarchy used to store event data on the left, together with the user's view of these data on the right.
We start with the user's view (right-hand side of the picture). The user likes to think in terms of events (the octagons), and wants to deal with, for instance, reconstructed tracks (the triangles), hits in the forward calorimeter (the diamonds), or the calibration for the TPC (the pentagons). Users should not be directly concerned (apart perhaps from efficiency considerations) with how these various data elements are actually stored in files and distributed over a network. They prefer to have a logical view of their event and to navigate between its various components in a transparent way. It is up to the data administrator to make sure that the data are stored in a way that optimises performance and throughput for the end user.
This is possible using an object-oriented database system, such as Objectivity/DB (left-hand side of the picture). All data are kept in one federated database, which is basically just a file containing the catalog of the database files and the hostnames where they reside. It also contains the schema (object model) used by the data in the various databases.
The databases themselves are also separate files, which can reside on different nodes. Each database can consist of multiple containers, which can be thought of as contiguous areas in a file.
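The hierarchy described so far (federated database, databases, containers) can be pictured with a minimal C++ sketch. All type and function names here (`Federation`, `Database`, `Container`, `locate`, `makeExampleFederation`), as well as the host names and file paths, are hypothetical and chosen purely for illustration; they are not part of the Objectivity/DB or Anaphe interfaces, and the schema stored in the real federation file is left out.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical illustration types; none of these are Objectivity/DB or Anaphe classes.
struct Container {
    std::string name;
    std::vector<std::string> objects;  // stand-ins for persistent objects
};

struct Database {
    std::string host;  // node on which the database file resides
    std::string file;  // path of the database file on that node
    std::map<std::string, Container> containers;
};

// The federated database, reduced to its catalog of database files and hosts.
// (The real federation file also holds the schema, which is omitted here.)
struct Federation {
    std::map<std::string, Database> catalog;

    // Report where a database lives, formatted as "host:file".
    std::string locate(const std::string& dbName) const {
        auto it = catalog.find(dbName);
        assert(it != catalog.end() && "unknown database name");
        return it->second.host + ":" + it->second.file;
    }
};

// Build a tiny example federation with two databases on different nodes.
inline Federation makeExampleFederation() {
    Federation fd;
    Database tracks{"node1.example.org", "/data/tracks.db", {}};
    tracks.containers["run1"] = Container{"run1", {"track-1", "track-2"}};
    Database calib{"node2.example.org", "/data/calib.db", {}};
    calib.containers["tpc"] = Container{"tpc", {"tpc-calibration"}};
    fd.catalog["Tracks"] = tracks;
    fd.catalog["Calib"] = calib;
    return fd;
}
```

In this sketch, `makeExampleFederation().locate("Tracks")` yields the host and file of the `Tracks` database, which mirrors the role of the federation file as a catalog rather than as a store of the data themselves.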
Finally, each container consists of one or more persistent objects (e.g., histograms, reconstructed tracks, fits). As seen in the picture, the mapping of the event to its components is very flexible, allowing different parts of an event to reside in different containers, and/or databases (even on remote nodes). Moreover, since the end users only access the full data through the logical structure, they are never affected by changes in the physical layout of the database.
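The separation between the logical view and the physical layout can be illustrated with a small C++ sketch. It is only a model of the idea under simplifying assumptions: `ObjectStore`, `Location`, `Event`, `ObjectId` and `readTracks` are hypothetical names invented for this example and do not correspond to the Objectivity/DB or Anaphe API. The event refers to its components by identity only, so moving an object to another database or container does not change what user code sees.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical types for illustration; not the Objectivity/DB or Anaphe API.
using ObjectId = int;

// Physical placement of one persistent object.
struct Location {
    std::string database;
    std::string container;
};

// Stand-in for the object store: owns the payloads and records where each one lives.
class ObjectStore {
public:
    ObjectId add(const std::string& payload, const Location& where) {
        ObjectId id = nextId_++;
        payloads_[id] = payload;
        locations_[id] = where;
        return id;
    }

    // Administrator operation: changes the physical layout only.
    void relocate(ObjectId id, const Location& where) {
        assert(locations_.count(id) == 1 && "unknown object id");
        locations_[id] = where;
    }

    const std::string& fetch(ObjectId id) const { return payloads_.at(id); }
    const Location& whereIs(ObjectId id) const { return locations_.at(id); }

private:
    ObjectId nextId_ = 0;
    std::map<ObjectId, std::string> payloads_;
    std::map<ObjectId, Location> locations_;
};

// The user's logical view: an event refers to its components by identity only.
struct Event {
    std::vector<ObjectId> tracks;
    ObjectId tpcCalibration = -1;
};

// User code navigates the event without naming any file, host or container.
std::vector<std::string> readTracks(const ObjectStore& store, const Event& ev) {
    std::vector<std::string> out;
    for (ObjectId id : ev.tracks) out.push_back(store.fetch(id));
    return out;
}
```

If an administrator calls `relocate` on one of the tracks, `readTracks` returns the same result as before, because the `Event` never stored a physical location in the first place.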