Developing Custom Stores

Explains how to integrate arbitrary data storage back-ends with the CDO Model Repository framework.

The CDO model repository framework makes few assumptions about the type of data storage back-end used to store models and object graphs. A CDO repository communicates with a concrete back-end through an implementation of the IStore interface. By providing a custom store, a repository can either talk to new back-end types or talk differently to already supported back-end types.

To develop a custom store:

  1. Choose an appropriate base class.
    • Extend LongIDStore if your objects are to be identified by long integer values. In this case your new objects will automatically be assigned an ID instance of the class CDOIDLongImpl with increasing values (starting with 1).
    • Extend the Store class if you want to control the way your objects are to be identified. You can provide your own implementation of the CDOIDObject interface and you need to provide an object factory, a library descriptor and a library provider. See LongIDStore for an example. Keep in mind that the actual values of CDOIDs must not change at any time after the object has entered the NEW state!
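The ID assignment described for LongIDStore can be pictured with a small stand-alone sketch. The class LongIdAllocator below is hypothetical and uses no CDO types; it only illustrates handing out increasing long values starting at 1 and resuming from a previously saved value, so that IDs never change or repeat once assigned.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the ID allocation of a long-ID based store (not a CDO class). */
class LongIdAllocator {
    // Last ID handed out; 0 means that no object has been created yet.
    private final AtomicLong lastId;

    /** @param lastPersistedId the last ID saved in the back-end, or 0 for a fresh store */
    LongIdAllocator(long lastPersistedId) {
        if (lastPersistedId < 0) {
            throw new IllegalArgumentException("lastPersistedId must not be negative");
        }
        this.lastId = new AtomicLong(lastPersistedId);
    }

    /** Returns the next ID; the first call on a fresh store yields 1. */
    long next() {
        return lastId.incrementAndGet();
    }

    /** Returns the value to save in the back-end so that a restart can resume from it. */
    long last() {
        return lastId.get();
    }
}
```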
  2. Support certain repository capabilities.
    • Define the change formats supported in the processing of commit operations. ChangeFormat.REVISION indicates that your store is able to process object changes as snapshots taken after the modification (called revisions). ChangeFormat.DELTA indicates that your store is able to process object changes as a set of deltas that represent the modification itself. If both are supported the repository may decide which format to use.
    • Define the revision temporality supported by the store. RevisionTemporality.NONE indicates that the store can deal with the repository auditing configuration switched off, i.e. old revisions are not preserved and cannot be restored. RevisionTemporality.AUDITING indicates that the store can deal with repositories configured for auditing, i.e. old revisions (or deltas to restore them) are preserved and can be restored at any later time. A store can support either mode or both.
    • Define the revision parallelism supported by the store. Currently the framework only supports RevisionParallelism.NONE. In the future the framework may also support RevisionParallelism.BRANCHING.
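As a rough illustration of how these capabilities might be declared and consulted, consider the sketch below. The enums are local stand-ins named after the constants mentioned above, not the CDO types, and the class StoreCapabilities and its chooseChangeFormat() policy are assumptions made for this example only.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical capability holder; the enums are local stand-ins, not the CDO types. */
class StoreCapabilities {
    enum ChangeFormat { REVISION, DELTA }
    enum RevisionTemporality { NONE, AUDITING }
    enum RevisionParallelism { NONE, BRANCHING }

    private final Set<ChangeFormat> changeFormats;
    private final Set<RevisionTemporality> temporalities;
    // According to the text only NONE is supported by the framework at present.
    private final Set<RevisionParallelism> parallelisms = EnumSet.of(RevisionParallelism.NONE);

    StoreCapabilities(Set<ChangeFormat> changeFormats, Set<RevisionTemporality> temporalities) {
        if (changeFormats.isEmpty() || temporalities.isEmpty()) {
            throw new IllegalArgumentException("At least one value per capability is required");
        }
        this.changeFormats = EnumSet.copyOf(changeFormats);
        this.temporalities = EnumSet.copyOf(temporalities);
    }

    boolean supports(RevisionTemporality temporality) {
        return temporalities.contains(temporality);
    }

    boolean supports(RevisionParallelism parallelism) {
        return parallelisms.contains(parallelism);
    }

    /** Illustrative policy: use the preferred format if supported, else any supported one. */
    ChangeFormat chooseChangeFormat(ChangeFormat preferred) {
        return changeFormats.contains(preferred) ? preferred : changeFormats.iterator().next();
    }
}
```

A store that only understands snapshots would be constructed with EnumSet.of(ChangeFormat.REVISION); asking it for DELTA then falls back to REVISION.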
  3. Manage some store system properties.
    • Provide the creation time of the repository. Eventually the repository will ask for the creation time. Hence your store must save the time of the first successful activation somewhere and later provide it through getCreationTime().
    • Report whether the current process is the first one for a given repository instance in a back-end. True indicates that the creation time has been saved (first start) and false indicates that it has been loaded (later start).
    • You also need to load and save the latest CDOID values for objects and meta objects to be able to create new IDs for new objects later.
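The bookkeeping in this step can be sketched as follows. The class StoreSystemProperties, its method names other than getCreationTime(), and the use of a Map as the "back-end" are all assumptions made for illustration; a real store would persist these values in its own storage.

```java
import java.util.Map;

/** Hypothetical sketch of store system properties kept in a key/value back-end. */
class StoreSystemProperties {
    private static final String CREATION_TIME = "creationTime";
    private static final String LAST_OBJECT_ID = "lastObjectID";

    private final Map<String, String> backend; // stands in for the real back-end
    private boolean firstTime;

    StoreSystemProperties(Map<String, String> backend) {
        this.backend = backend;
    }

    /** Saves the creation time on the first activation, otherwise keeps the stored one. */
    void activate(long now) {
        firstTime = !backend.containsKey(CREATION_TIME);
        if (firstTime) {
            backend.put(CREATION_TIME, Long.toString(now));
            backend.put(LAST_OBJECT_ID, "0");
        }
    }

    /** True if the creation time was saved by this activation, false if it was loaded. */
    boolean isFirstTime() {
        return firstTime;
    }

    long getCreationTime() {
        return Long.parseLong(backend.get(CREATION_TIME));
    }

    long loadLastObjectID() {
        return Long.parseLong(backend.get(LAST_OBJECT_ID));
    }

    void saveLastObjectID(long id) {
        backend.put(LAST_OBJECT_ID, Long.toString(id));
    }
}
```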
  4. Provide read/write access to the back-end.
    • Read/write access to the back-end is provided through your implementation of the IStoreAccessor interface. If your store extends LongIDStore, your store accessors must extend LongIDStoreAccessor; otherwise they must extend StoreAccessor.
    • Instances of your store accessor are created in the createReader() and createWriter() methods of your store implementation. Readers can be bound to ISessions. Writers can be bound to ITransactions.
    • Store accessor instances can take part in store accessor pooling. Return an instance of StoreAccessorPool from the getReaderPool() method and/or the getWriterPool() method of your store implementation. Returning null indicates that no pooling takes place.
    • Store accessors usually open and maintain some sort of physical connection to the particular back-end instance. For this purpose the lifecycle methods doActivate(), doDeactivate(), doPassivate() and doUnpassivate() can be used.
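The interplay of pooling and lifecycle methods might look roughly like the sketch below. It does not use StoreAccessorPool; the classes AccessorPoolSketch and Accessor are hypothetical, and the order in which the four hooks are called here is an assumption of this example, not a statement about the framework.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical pool; only the four hook names are taken from the text above. */
class AccessorPoolSketch {
    static class Accessor {
        boolean connected; // stands in for a physical back-end connection
        boolean passive;   // true while the accessor rests in the pool

        void doActivate()    { connected = true; }  // open the connection
        void doPassivate()   { passive = true; }    // park in the pool, connection stays open
        void doUnpassivate() { passive = false; }   // take out of the pool for reuse
        void doDeactivate()  { connected = false; } // close the connection
    }

    private final Deque<Accessor> idle = new ArrayDeque<>();
    private final int capacity;

    AccessorPoolSketch(int capacity) {
        this.capacity = capacity;
    }

    /** Reuses an idle accessor if one exists, otherwise creates and activates a new one. */
    Accessor acquire() {
        Accessor accessor = idle.pollFirst();
        if (accessor == null) {
            accessor = new Accessor();
            accessor.doActivate();
        } else {
            accessor.doUnpassivate();
        }
        return accessor;
    }

    /** Parks the accessor if the pool has room, otherwise closes it. */
    void release(Accessor accessor) {
        if (idle.size() < capacity) {
            accessor.doPassivate();
            idle.addFirst(accessor);
        } else {
            accessor.doDeactivate();
        }
    }
}
```

The point of the design is that an expensive connection survives between uses: passivation keeps it open, and only an accessor that does not fit into the pool is closed.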
  5. Manage persistence of model elements.
    • EPackages and their contained elements are EModelElements. EPackages can be nested and form a containment hierarchy with one top-level package. Such a package tree with at least one package relates to a CDOPackageUnit, which is the unit of model information that can be transferred and stored by CDO as a whole. Package units contain one CDOPackageInfo per contained EPackage. Package infos and their packages are associated by their namespace URI. The ID of the package unit is the namespace URI of the top-level package (info).
    • At each startup the repository creates a store accessor and calls readPackageUnits(). The store accessor is supposed to return a collection of package units with PROXY state. That means that all package infos must be properly populated (i.e. read from the back-end) but the related EPackages need not be loaded yet. This facilitates lazy loading of the packages while enough information about them is available at runtime. The package registry of the repository is populated with this information as package descriptors, which resolve on demand to the related packages.
    • Whenever a package descriptor in the repository's package registry is to be resolved the loadPackageUnit() method of the store accessor is called. The package unit to be loaded is passed and the store accessor is supposed to return an array of the contained EPackage instances. The implementor can use the EMFUtil.createEPackage() method to deserialize an EPackage instance from a byte array that was initially created with EMFUtil.getEPackageBytes().
    • New package units are always added to the back-end as part of committing a transaction (the only exception being the two system packages which are added as part of the repository initialization). You must ensure that the package unit data, including all package infos and all packages, is written to the back-end in a way that the preceding two functions can do their work as expected.
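The split between cheap package infos and lazily loaded package contents can be sketched without EMF as follows. All names in this block are hypothetical, packages are represented by opaque byte arrays, and the two Maps stand in for the back-end; only the method names readPackageUnits() and loadPackageUnit() and the PROXY state are taken from the text.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of package unit persistence with lazy loading. */
class PackageUnitStoreSketch {
    enum State { PROXY, LOADED }

    static class PackageUnit {
        final String id;                // namespace URI of the top-level package
        final List<String> packageURIs; // one entry per contained package ("package infos")
        State state = State.PROXY;

        PackageUnit(String id, List<String> packageURIs) {
            this.id = id;
            this.packageURIs = new ArrayList<>(packageURIs);
        }
    }

    // Back-end stand-ins: unit ID -> package URIs, and package URI -> serialized package.
    private final Map<String, List<String>> infos = new LinkedHashMap<>();
    private final Map<String, byte[]> packageBytes = new LinkedHashMap<>();

    /** Called as part of a commit: stores infos and serialized packages together. */
    void writePackageUnit(String topLevelURI, Map<String, byte[]> packages) {
        infos.put(topLevelURI, new ArrayList<>(packages.keySet()));
        packageBytes.putAll(packages);
    }

    /** Returns all units in PROXY state: infos are populated, packages are not loaded. */
    List<PackageUnit> readPackageUnits() {
        List<PackageUnit> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : infos.entrySet()) {
            result.add(new PackageUnit(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    /** Resolves a unit on demand and returns the serialized form of each package. */
    byte[][] loadPackageUnit(PackageUnit unit) {
        byte[][] result = new byte[unit.packageURIs.size()][];
        for (int i = 0; i < result.length; i++) {
            result[i] = packageBytes.get(unit.packageURIs.get(i));
        }
        unit.state = State.LOADED;
        return result;
    }
}
```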
  6. Manage persistence of objects.
    • Objects are represented in a repository as chains of CDORevision instances. These chains are identified by their CDOID; the revisions are identified by their CDOID and an integer version. The notion of an object is not explicit in the repository! Besides CDOID and version values each revision remembers its creation time and, if it's not the latest revision, its revised time.
    • Whenever the repository needs to access a revision that is not present in its revision cache this revision is loaded from the back-end through one of the readRevisionXYZ() methods of the store accessor. There are three such methods to implement. All of them are passed the CDOID.
    • The readRevision() method is supposed to deliver the latest revision for the given CDOID, i.e. the one with the highest version or the one with the revised timestamp being CDORevision.UNSPECIFIED_DATE (both criteria are equivalent).
    • The readRevisionByVersion() method is supposed to deliver the particular revision for the given CDOID which is identified by the given version value.
    • The readRevisionByTime() method is supposed to deliver the particular revision for the given CDOID which is valid at the given timestamp value. I.e. the created/revised interval must include the given timestamp. Consider that the revised timestamp of a revision can be CDORevision.UNSPECIFIED_DATE. The implementation of this method is only required if the store implementation supports auditing!
    • If the store accessor implementation needs to read more revisions from the back-end than the one being requested by the repository, the additional ones can be cached in the repository for possible later usage through the passed AdditionalRevisionCache.
    • The referenceChunk parameter can be ignored if the store is not supposed to support partial collection loading (see below). If partial collection loading is to be supported the referenceChunk parameter gives the number of collection elements to load into the requested revision (for each of the many-valued structural features of the revision). All collection elements that are not loaded must be set to InternalCDORevision.UNINITIALIZED. If the store accessor uses the InternalCDORevision.UNINITIALIZED special guard value it must also provide an implementation of an IStoreChunkReader (see below) to load the missing elements on later demand.
    • New revisions are always added to the back-end as part of committing a transaction. They correspond to new objects if their version is 1. It depends on the ChangeFormat capability (see above) whether the data of changed objects is passed via writeRevisions() or via writeRevisionDeltas() to the store accessor. In either case the data must be written to the back-end so that the store accessor can read it back through the readRevisionXYZ() methods. All writeXYZ() methods support progress monitoring through the passed instance of OMMonitor.
    • Ensure that you apply special processing to objects of type CDOResource and CDOResourceFolder in order to be able to answer the queryResources() method (see below).
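The three lookup criteria for revisions can be illustrated with an in-memory revision chain. Everything in this sketch is hypothetical: IDs are plain longs, the constant UNSPECIFIED_DATE and its value are local stand-ins, and closing the previous revision one time unit before the new one is created is an assumption of the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory revision chains illustrating the three read criteria. */
class RevisionChainSketch {
    static final long UNSPECIFIED_DATE = 0L; // local marker for "not revised yet"

    static class Revision {
        final long id;
        final int version;
        final long created;
        long revised = UNSPECIFIED_DATE;

        Revision(long id, int version, long created) {
            this.id = id;
            this.version = version;
            this.created = created;
        }
    }

    private final Map<Long, List<Revision>> chains = new HashMap<>();

    /** Appends a new revision; version 1 corresponds to a new object. */
    Revision writeRevision(long id, long timestamp) {
        List<Revision> chain = chains.computeIfAbsent(id, key -> new ArrayList<>());
        if (!chain.isEmpty()) {
            chain.get(chain.size() - 1).revised = timestamp - 1; // assumption of this sketch
        }
        Revision revision = new Revision(id, chain.size() + 1, timestamp);
        chain.add(revision);
        return revision;
    }

    /** Latest revision: the one whose revised time is still unspecified. */
    Revision readRevision(long id) {
        for (Revision r : chains.getOrDefault(id, new ArrayList<>())) {
            if (r.revised == UNSPECIFIED_DATE) return r;
        }
        return null;
    }

    Revision readRevisionByVersion(long id, int version) {
        for (Revision r : chains.getOrDefault(id, new ArrayList<>())) {
            if (r.version == version) return r;
        }
        return null;
    }

    /** Revision whose created/revised interval includes the timestamp (needs auditing). */
    Revision readRevisionByTime(long id, long timestamp) {
        for (Revision r : chains.getOrDefault(id, new ArrayList<>())) {
            boolean open = r.revised == UNSPECIFIED_DATE;
            if (r.created <= timestamp && (open || timestamp <= r.revised)) return r;
        }
        return null;
    }
}
```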
  7. Support partial collection loading.
    • Partial collection loading is an optional feature. If it is supported by the store implementation (i.e. if the store accessor can fill InternalCDORevision.UNINITIALIZED guard values into collections read through the readRevisionXYZ() methods) the store accessor must return an instance of IStoreChunkReader from the createChunkReader() method of the store accessor.
    • If the chunk reader extends StoreChunkReader it only needs to implement the methods addSimpleChunk(), addRangedChunk() and executeRead(). See DBStoreChunkReader for an example.
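The idea behind partial collection loading can be sketched independently of the CDO classes. In the example below the class ChunkedListSketch, the methods readList() and loadRange(), and the local UNINITIALIZED guard object are hypothetical; they only mimic the roles of the referenceChunk parameter, the guard value, and the chunk reader.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of partial collection loading with a guard value. */
class ChunkedListSketch {
    static final Object UNINITIALIZED = new Object(); // local guard for "not loaded yet"

    private final List<Object> backend; // the complete collection in the back-end

    ChunkedListSketch(List<Object> backend) {
        this.backend = backend;
    }

    /** Loads the first referenceChunk elements and fills the rest with the guard value. */
    List<Object> readList(int referenceChunk) {
        List<Object> result = new ArrayList<>(backend.size());
        for (int i = 0; i < backend.size(); i++) {
            result.add(i < referenceChunk ? backend.get(i) : UNINITIALIZED);
        }
        return result;
    }

    /** Chunk reader role: loads the elements in [fromIndex, toIndex) on later demand. */
    void loadRange(List<Object> partial, int fromIndex, int toIndex) {
        for (int i = fromIndex; i < toIndex; i++) {
            partial.set(i, backend.get(i));
        }
    }
}
```

Note that the partially loaded list keeps its full size, so indices stay stable while missing elements are filled in later.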
  8. Support browsing the resource / folder structure.
    • This non-optional feature is implemented in the queryResources() method of the store accessor. A QueryResourcesContext is passed by the framework which can be used to get the query values and to push the query results into.
    • In a CDO repository resources can exist at the root level or in resource folders, which themselves can exist at the root level or in other resource folders. All resources and resource folders are directly or indirectly contained by a single root resource. Resources and folders are normal objects that are committed to the repository in the scope of transactions.
    • A resource query requests all resources with a given name in a given folder. These values can be obtained from the QueryResourcesContext through the getFolderID() and getName() methods. CDOID.NULL as the folder ID indicates that the root resource is the direct container of the queried resources. If the name is null all contained resources must be pushed into the query context. If exactMatch is false the name returned by the query context has to be interpreted as a name prefix of the resource nodes. The maximum number of resource nodes to add to the context can be determined at any time by calling the getMaxResults() method of the context, or implicitly by using the boolean return value of the addResource() method (false means stop). If the store supports auditing it might be necessary to consider the time the query was started (more precisely, the target time of the view that created the resource query), which can also be obtained from the context.
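The matching rules of a resource query can be summarized in a short sketch. The classes below are hypothetical simplifications: IDs are plain longs, NULL_FOLDER stands in for CDOID.NULL, the Context class only mimics the role of QueryResourcesContext, and auditing (the query time) is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the resource query matching rules. */
class ResourceQuerySketch {
    static final long NULL_FOLDER = 0L; // local stand-in for CDOID.NULL (root level)

    static class Node {
        final long id;
        final long folderId;
        final String name;

        Node(long id, long folderId, String name) {
            this.id = id;
            this.folderId = folderId;
            this.name = name;
        }
    }

    static class Context {
        final long folderId;
        final String name;        // null means "all resources in the folder"
        final boolean exactMatch; // false means "name is a prefix"
        final int maxResults;
        final List<Long> results = new ArrayList<>();

        Context(long folderId, String name, boolean exactMatch, int maxResults) {
            this.folderId = folderId;
            this.name = name;
            this.exactMatch = exactMatch;
            this.maxResults = maxResults;
        }

        /** Adds a result and returns false once no more results are wanted. */
        boolean addResource(long id) {
            results.add(id);
            return results.size() < maxResults;
        }
    }

    static void queryResources(List<Node> backend, Context context) {
        for (Node node : backend) {
            if (node.folderId != context.folderId) continue;
            boolean match = context.name == null
                || (context.exactMatch ? node.name.equals(context.name)
                                       : node.name.startsWith(context.name));
            if (match && !context.addResource(node.id)) return; // false means stop
        }
    }
}
```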
  9. Support back-end specific query languages.
    • IStoreAccessor extends IQueryHandler, hence the implementor can support arbitrary query languages understood by the back-end to be integrated. This feature is optional. An IllegalArgumentException can be thrown to indicate that a certain query language is not supported or an UnsupportedOperationException can be thrown to indicate that query handling is not supported at all.
    • The CDOQueryInfo passed into the executeQuery() method gives access to the language of the query, the definition string and a map of named parameter values as well as the maximum number of query results to add to the query context.
    • The IQueryContext passed into the executeQuery() method is mostly used to add the query results to. As long as the addResult() method returns true the implementor is supposed to look for more query results.
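The control flow of a query handler can be sketched as follows. The classes QueryInfo and QueryContext are hypothetical stand-ins for CDOQueryInfo and IQueryContext, the query language "prefix" and the parameter "minLength" are invented for this example, and the back-end is a plain list of strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical query handler illustrating language checks and result limiting. */
class QueryHandlerSketch {
    static class QueryInfo {
        final String language;
        final String queryString;
        final Map<String, Object> parameters;

        QueryInfo(String language, String queryString, Map<String, Object> parameters) {
            this.language = language;
            this.queryString = queryString;
            this.parameters = parameters;
        }
    }

    static class QueryContext {
        final int maxResults;
        final List<Object> results = new ArrayList<>();

        QueryContext(int maxResults) {
            this.maxResults = maxResults;
        }

        /** Adds a result and returns true as long as more results are wanted. */
        boolean addResult(Object result) {
            results.add(result);
            return results.size() < maxResults;
        }
    }

    private final List<String> backend;

    QueryHandlerSketch(List<String> backend) {
        this.backend = backend;
    }

    void executeQuery(QueryInfo info, QueryContext context) {
        if (!"prefix".equals(info.language)) { // "prefix" is an invented language name
            throw new IllegalArgumentException("Unsupported query language: " + info.language);
        }
        Object min = info.parameters.get("minLength"); // invented named parameter
        int minLength = min instanceof Integer ? (Integer) min : 0;
        for (String candidate : backend) {
            if (candidate.startsWith(info.queryString) && candidate.length() >= minLength) {
                if (!context.addResult(candidate)) break; // stop once the context is full
            }
        }
    }
}
```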
Related concepts
Queries
Object Identity