MetaPHOR Project Report: Database

3.3.5. Data management

3.3.5.1. Introduction

The management of design data is a problem that CASE shares with e.g. Computer Aided Design (CAD). Typically, the amount of data stored is not massive, but the data itself is complicated. Therefore, well-known and widely used relational databases are not necessarily optimal for the purpose. A more promising new technology is object-oriented databases, but many of these are not really of industrial strength yet, or are closely bound to a programming language such as C++ or Smalltalk. It is no wonder, then, that many existing systems are based on the robust but cumbersome and somewhat dated technology of network (CODASYL type) databases.

Another problem with design databases is that of long transactions in a multi-user environment. Whilst a design process almost always involves several designers whose activities are closely intertwined, they should not be allowed to change each other's designs in an uncontrolled manner. On the other hand, extensive use of locking in the design data would lead to overheads, deadlocks, and a lack of visibility of necessary data, or would prevent some designers from doing useful work until others have completed theirs.

3.3.5.2. Objectives

In the context of the MetaPHOR project, the main objective is to overcome the aforementioned problems to a reasonable degree. It has not been possible to implement our own repository management system within the resources of the project: rather, we need to buy an existing database and extend it with new functionality where possible. Therefore we have been obliged to:

actively search for and test existing database systems to see how they would support CASE data management,

develop and test simple strategies for version control in design data (both over time and for different configurations),

develop and implement simple and user-oriented strategies for maintaining design data integrity, i.e. transaction protocols,

develop and implement an interface for design front-end tools so that they can store data in the database and retrieve it from the database using a more powerful high-level protocol.

3.3.5.3. Present situation

Three of the objectives above have received some attention:

First, an extensive survey of available object-oriented and network databases was carried out (Rissanen and Rossi 1992), on the basis of which a database management system was chosen for test use. However, as the implementation environment of the project had to be changed from Actor to Smalltalk, the issue had to be reassessed. This led to another study in which the choice was narrowed down to two possibilities: GemStone, an object-oriented database with an interface to Smalltalk, and ArtBASE, an object store that extends Smalltalk with the ability to make objects persistent (i.e. stored in the repository) and to perform locking and similar operations on them. In the end, considerations of the possibilities for use in further research outweighed the other factors, and GemStone was rejected on the grounds that it always required a Unix machine as a server, whereas ArtBASE could run both server and client within a single PC. ArtBASE was also significantly less expensive, and promised more flexibility and a tighter, more dynamic link between the classes contained in memory and the database 'schema': a vital factor for incremental method engineering.

ArtBASE has now been tested in both its single-user and multi-user versions, although there has been no testing with many users accessing the same data simultaneously, as the locking strategy for tools is yet to be implemented. The operations on the conceptual GOPRR data now automatically produce the correct database actions, and changes of metamodels have been successfully tested. Such success is remarkable: previous relational, network, and object-oriented databases would have been unable to cope with such interference with the schema while the database was running.

A user-centred transaction protocol has been designed. This enables the program to infer from user actions which data should be locked, thus maintaining consistency and ensuring that the transaction can be committed, without transferring to the user the burden of explicitly declaring locks as the transaction proceeds. The implementation is nevertheless intended to allow such explicit locking, although normal users should not need to use it.
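The idea of inferring locks from user actions can be illustrated with a minimal Python sketch. It is not the MetaPHOR design: the action names, the `ACTION_LOCKS` table, the `LockTable` class and the `perform` function are all hypothetical, and the strict rule that a writer blocks all other readers is an assumption made only to keep the example short.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from user actions to lock modes; the report does not
# specify the actual inference rules.
ACTION_LOCKS = {"view": "read", "edit": "write", "delete": "write"}


@dataclass
class LockTable:
    """Tracks read and write locks per object id (illustrative only)."""
    readers: dict = field(default_factory=dict)  # object id -> set of users
    writers: dict = field(default_factory=dict)  # object id -> user

    def acquire(self, user, obj, mode):
        writer = self.writers.get(obj)
        if writer is not None and writer != user:
            return False  # someone else is editing the object
        if mode == "write":
            others = self.readers.get(obj, set()) - {user}
            if others:
                return False  # other users are still reading the object
            self.writers[obj] = user
        else:
            self.readers.setdefault(obj, set()).add(user)
        return True

    def release_all(self, user):
        """Called at commit or abort: drop every lock the user holds."""
        for users in self.readers.values():
            users.discard(user)
        for obj in [o for o, w in self.writers.items() if w == user]:
            del self.writers[obj]


def perform(table, user, action, obj):
    """Infer the lock mode from the action, so the user never names a lock."""
    return table.acquire(user, obj, ACTION_LOCKS[action])


table = LockTable()
print(perform(table, "anna", "edit", "Process-1"))  # True
print(perform(table, "ben", "view", "Process-1"))   # False
table.release_all("anna")
print(perform(table, "ben", "view", "Process-1"))   # True
```

Explicit locking for expert users would amount to calling `acquire` directly, while normal users would only ever go through `perform`.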

Second, the master's project on version management problems has generated useful discussion and provided a possible design of a versioning system (Pyykkö 1994). Its implementation has been postponed for the time being.

Third, a partial implementation of a high-level protocol for data exchange in the current MetaEdit environment was produced in another master's project (Jukola and Väätäinen 1992). An alternative solution was also drafted, but not realised because of portability issues.

3.3.5.4. Future plans

Further tests will be run on ArtBASE, and the precise locking protocols for conceptual and representational information will be implemented in the MetaEngine and tools during autumn 1994. As the section on GOPRR implementation work showed, our research has indicated that centralised conceptual protocols are not always possible, and it may be necessary to design a separate locking protocol for each tool. This would considerably increase the workload of programming the database, whilst also distributing the code over a wider area.

It is hoped, however, that the tools can call generic locking routines in the MetaEngine, rather than explicitly executing, say, a read lock. Thus it will be possible to control the locking strategy centrally, even if not to trigger it based only on actions on conceptual design elements. With the basic locking calls implemented, an analysis will be made of the efficiency of various locking strategies within a metaCASE environment.
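A rough Python sketch of this arrangement follows. It only illustrates the separation between tools and a centrally chosen strategy: the class `MetaEngineLocks`, the `prepare` call, the two strategies and the tool function are hypothetical names, not the real MetaEngine interface, and the sketch records lock requests rather than locking anything.

```python
from typing import Callable, List, Tuple

# A strategy maps (intent, element, containing graph) to the lock requests to
# issue. Each request is a (target, mode) pair. Both strategies are hypothetical.
LockRequest = Tuple[str, str]
Strategy = Callable[[str, str, str], List[LockRequest]]


def element_level(intent: str, element: str, graph: str) -> List[LockRequest]:
    """Fine-grained: lock only the element being touched."""
    return [(element, "write" if intent == "modify" else "read")]


def graph_level(intent: str, element: str, graph: str) -> List[LockRequest]:
    """Coarse-grained: lock the whole graph containing the element."""
    return [(graph, "write" if intent == "modify" else "read")]


class MetaEngineLocks:
    """Stand-in for the generic locking routines; not the real MetaEngine API."""

    def __init__(self, strategy: Strategy):
        self.strategy = strategy
        self.log: List[LockRequest] = []  # records requests instead of locking

    def prepare(self, intent: str, element: str, graph: str) -> List[LockRequest]:
        requests = self.strategy(intent, element, graph)
        self.log.extend(requests)
        return requests


def diagram_editor_rename(engine: MetaEngineLocks, element: str, graph: str):
    """A tool states its intent; it never issues a read or write lock itself."""
    return engine.prepare("modify", element, graph)


engine = MetaEngineLocks(element_level)
print(diagram_editor_rename(engine, "Process-1", "DFD-Top"))
engine.strategy = graph_level  # central switch; the tool code is unchanged
print(diagram_editor_rename(engine, "Process-1", "DFD-Top"))
```

Because the tools only state an intent, different strategies could be swapped in and compared for efficiency without changing any tool code.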