CERN/RD45/99/01

4 February, 1999

Revision 0

Persistent Object Manager Choices

 

A Discussion of the Options Regarding the Choice of a Persistent Object Manager for the Production Phase of the LHC

  1. Introduction

In this note, we discuss the possible alternatives for a persistent object manager for the production phase of the LHC experiments.

It is assumed that:

  2. Current Situation

    Since 1995, the RD45 project has been researching the problem of finding a suitable product to meet the needs of the LHC experiments in the area of persistent object management. The initial milestones from the LCRB referees emphasised the importance of investigating standards-based solutions, as well as commercial products such as Object Databases, and this line of research clearly emerged as the most promising. For some time now, the recommendation of the RD45 collaboration has been based upon a combination of two commercial products, Objectivity/DB and HPSS, plus a small amount of HEP-specific code.

    Numerous milestones have been met using one or both of these products, including demonstrations of federations in the 10GB to 1TB range and of I/O rates from 20MB/second to 150MB/second. Extensive functionality and scalability tests have also been made. The conclusion of this work is that, from a functional point of view, Objectivity/DB continues to be the most suitable existing candidate on which to build a persistent object manager for HEP.

  3. The Prognosis

    In early 1996, a "statement of the probable capabilities of a HEP Persistent Object Manager based upon commercial ODBMS and large-market MSS" was produced, under the title "Object Databases and Mass Storage Systems: The Prognosis" [5]. This report discussed not only ODBMS and MSS alternatives, but also storage-related issues, such as filesystem capabilities, disk and tape trends, and so forth. The predictions concerning these storage-related issues, a topic now covered by the PASTA technology tracking group, were largely positive, and the predicted trends have been confirmed over the past few years. Although there are still questions concerning the high-end "tape" market, other areas that concern storage continue to look healthy.

    At the time that the Prognosis was written, there were numerous independent reports that anticipated that the ODBMS market would take off by the year 2000 (or even before), and that the high-end (hundreds of TB to several PB) would become an important market for several vendors. There were also some 10-12 vendors claiming ODMG-compliance, offering some level of vendor independence.

    Unfortunately, history has now shown that these predictions were too optimistic: the ODBMS market continues to be small – almost insignificant in comparison to the RDBMS market – and there are no convincing arguments that this will change in the foreseeable future. Although the number of projects using object technology and databases has grown significantly, the vast majority continue to use RDBMS products, with a variety of object-relational mappings. A number of ODBMS vendors have branched into other areas, or have been taken over. It is likely that this trend will continue.

  4. Using an RDBMS

    At first glance, a possible alternative would appear to be an RDBMS, together with an O-R mapping layer. However, such a solution does not seem to be a viable option today, although this may change in the future. Firstly, the underlying technology, e.g. ORACLE, does not offer sufficient scalability and performance to satisfy the requirements of the LHC experiments. Secondly, the O-R mapping layer, if not developed in-house, brings with it the usual disadvantages associated with non-standard products from small software companies, whose long-term survival is far from guaranteed.

    At a recent workshop at SLAC, a representative from ORACLE stated that they would not attempt to address the multi-PB market until there was sufficient commercial demand. It is nevertheless likely that ORACLE will address this market at some stage during the lifetime of the LHC, and may eventually produce their own O-R mapping layer or offer something closer to an ODMG interface. However, it is considered unlikely that a system based upon an RDBMS will be in the running at the time of the 2001/2002 decision described above.

  5. Alternative ODBMS Vendors

    For some time now, the primary alternative ODBMS vendor has been considered to be Versant. This appears to be the only other product offering sufficient scalability. A number of tests were performed with Versant at CERN, but these were confined to relatively small data volumes and trivial applications. However, it was felt that these studies gave us sufficient insight into how its interface and architecture differ from those of Objectivity/DB to make larger-scale tests, e.g. with 1TB or more of data, unnecessary.

    Since the time of these tests, the future of Versant as a company has come under some doubt. Financially, the company does not appear to be strong, and its marketing policy, which seems to bring it into direct competition with database giants such as ORACLE, offers little assurance of its long-term survival.

    In conclusion, Versant is no longer considered a viable medium- to long-term alternative to Objectivity/DB.

  6. Objectivity/DB

In order to ensure that Objectivity/DB offers sufficient functionality to meet the requirements of the LHC experiments, our strategy has been to push for the early implementation of new features and architectural changes. These include:

All of these features are currently scheduled for the 5.2 release of Objectivity/DB, which was initially planned for end-1998. At the time of writing, we have still not received 5.1 for all platforms: it has been delivered for Sun and NT, and a pre-release for Linux, which is still based on g++ rather than egcs, has been received. Current estimates suggest that 5.2 might be delivered for Sun and NT as early as March 1999, although past experience has taught us to be very conservative when dealing with delivery dates from Objectivity. It is likely that we will receive the product during 1999, which is still well in time for LHC production, although later than desirable for BaBar, COMPASS and NA45.

At a recent European Technical Forum, it emerged that there was very considerable overlap between the outstanding enhancements required by HEP (beyond the main changes described above) and those requested by other customers of Objectivity. In other words, there is good reason to believe that many, if not the majority, of requests for new features will be general-purpose, and not HEP-specific.

Objectivity’s customer base continues to be dominated by two groups: telecoms, who provide the bulk of its revenue and are largely uninterested in support for VLDBs, and physics (including HEP, astrophysics, geophysics and plasma physics), who are typically more demanding in terms of support but provide relatively little in terms of income.

As mentioned above, it is likely that Objectivity, in common with other previously independent ODBMS companies, will eventually be sold to a larger company. Although this might offer access to significantly more resources, it represents a discontinuity that inevitably has associated risks. Will the company continue to address the high-end? Will the product even continue to be marketed, or will it simply become an internal package, e.g. as part of IRIDIUM?

Given that there appears to be no obvious alternative, it is clearly important to identify – and if necessary build – a convincing fallback solution.

  7. Recommendations

    The provision of adequate data management software is fundamental to the successful exploitation of the LHC. It is essential that we have not only a primary candidate on which to build such a system, but also one or more convincing fallback solutions. Such solutions, whilst potentially offering somewhat less functionality than the preferred approach, must meet a number of mandatory requirements in terms of performance, scalability, and maintainability.

    Given the relatively short amount of time available, it is important that work on identifying and providing a convincing alternative to Objectivity/DB be given full priority. It is considered unlikely that a fully functional, general-purpose database can be designed and built in the time available. However, a less ambitious system, built on the considerable knowledge gained over the past years in object technology and data management systems, and catering for the specific needs of the HEP community, is achievable.

    It is therefore planned that work start immediately to produce a new set of requirements, based upon the existing requirements established at the beginning of the RD45 project, and taking into account the experience gathered since that time. In parallel, existing components that could be integrated into a solution should be identified, such that various prototypes can be produced in the second half of 1999, and a functional, albeit still preliminary, system be made available in 2000. By mid-2001, the existing performance, scalability and functionality tests that have been made using Objectivity/DB, including writing at least 1TB into the system, must be repeated successfully with this system.

  8. Conclusions

    The ongoing risk analysis performed by the RD45 collaboration suggests that Versant can no longer be considered a viable alternative to Objectivity/DB on which to base a data management system for the LHC. It is mandatory that a new alternative be identified, and if necessary built, respecting the milestones and deadlines associated with the choice of a persistent object manager for the production phase of the LHC.

  9. References

    1. Assumptions concerning Distributed Data Management, CERN/RD45/98/01, available via http://wwwinfo.cern.ch/asd/rd45/reports.htm.
    2. Guidelines for Setup of Production Objectivity/DB Federations, CERN/RD45/98/02, available via http://wwwinfo.cern.ch/asd/rd45/reports.htm.
    3. Guidelines for the use of Replication in Objectivity/DB, CERN/RD45/98/03, available via http://wwwinfo.cern.ch/asd/rd45/reports.htm.
    4. Wide-Area Objectivity/DB Federations, CERN/RD45/98/04, available via http://wwwinfo.cern.ch/asd/rd45/reports.htm.
    5. Object Databases and Mass Storage Systems: The Prognosis, the RD45 collaboration, CERN/LHCC 96-17, available via http://wwwinfo.cern.ch/asd/rd45/reports.htm.