SMILA 1.0 API documentation

org.eclipse.smila.importing.state.objectstore
Class ObjectStoreDeltaService

java.lang.Object
  extended by org.eclipse.smila.importing.state.objectstore.ObjectStoreDeltaService
All Implemented Interfaces:
DeltaService

public class ObjectStoreDeltaService
extends java.lang.Object
implements DeltaService

ObjectStore based implementation of the DeltaService for the jobmanager based importing framework.

Author:
scum36

Nested Class Summary
 
Nested classes/interfaces inherited from interface org.eclipse.smila.importing.DeltaService
DeltaService.EntryId
 
Field Summary
static java.lang.String BUNDLE_ID
          bundle ID for configuration area access.
static java.lang.String STORENAME
          objectstore store name.
 
Constructor Summary
ObjectStoreDeltaService()
           
 
Method Summary
protected  void activate(ComponentContext context)
          service activation.
 State checkState(java.lang.String sourceId, java.lang.String recordId, java.lang.String jobRunId, java.lang.String hashCode)
          Determine delta state of record identified by sourceId and recordId.
 State checkState(java.lang.String sourceId, java.lang.String recordId, java.lang.String compoundRecordId, java.lang.String jobRunId, java.lang.String hashCode)
          Determine delta state of record identified by sourceId and recordId.
 void clearAll()
          delete all state information in the service about all data sources.
 void clearSource(java.lang.String sourceId)
          delete all state information in the service about the given data source.
 long countEntries(java.lang.String sourceId, boolean countExact)
          count the entries for the given data source.
protected  void deactivate(ComponentContext context)
          service deactivation.
 void deleteEntry(java.lang.String sourceId, DeltaService.EntryId entryId)
          remove an entry, e.g. after it has been deleted.
 java.util.Collection<java.lang.String> getShardPrefixes(java.lang.String sourceId)
          get possible input values for getUnvisitedEntries(String, String).
 java.util.Collection<java.lang.String> getSourceIds()
          get Ids of all sources that currently have entries in the DeltaService.
 java.util.Collection<DeltaService.EntryId> getUnvisitedEntries(java.lang.String sourceAndShardPrefix, java.lang.String jobRunId)
          get the record IDs in the given data source and shard that have not been visited in the given job run and therefore must be sent as deleted records to the target job.
 void markAsUpdated(java.lang.String sourceId, java.lang.String recordId, java.lang.String jobRunId, java.lang.String hashCode)
          Mark the record as visited in the current crawl job run.
 void markAsUpdated(java.lang.String sourceId, java.lang.String recordId, java.lang.String compoundRecordId, java.lang.String jobRunId, java.lang.String hashCode)
          Mark the record that was extracted from a compound as visited in the current crawl job run.
 void markCompoundElementsVisited(java.lang.String sourceId, java.lang.String compoundRecordId, java.lang.String jobRunId)
          Set jobRunId of all elements of the given compound record, because the compound itself has not changed.
 void setObjectStore(ObjectStoreService objectStore)
          used by DS to set service reference.
 void unsetObjectStore(ObjectStoreService objectStore)
          used by DS to remove service reference.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

BUNDLE_ID

public static final java.lang.String BUNDLE_ID
bundle ID for configuration area access.

See Also:
Constant Field Values

STORENAME

public static final java.lang.String STORENAME
objectstore store name.

See Also:
Constant Field Values
Constructor Detail

ObjectStoreDeltaService

public ObjectStoreDeltaService()
Method Detail

activate

protected void activate(ComponentContext context)
service activation.


deactivate

protected void deactivate(ComponentContext context)
service deactivation.


checkState

public State checkState(java.lang.String sourceId,
                        java.lang.String recordId,
                        java.lang.String jobRunId,
                        java.lang.String hashCode)
                 throws DeltaException
Description copied from interface: DeltaService
Determine delta state of record identified by sourceId and recordId. If the result is State.UPTODATE, the service has already marked the record as visited in the current crawl job run, so there is no need to call DeltaService.markAsUpdated(String, String, String, String) afterwards. In all other cases the crawler should call DeltaService.markAsUpdated(String, String, String, String) only if the record is actually submitted to a processing job.

Specified by:
checkState in interface DeltaService
Parameters:
sourceId - the name of the data source that contains the record.
recordId - the record id
jobRunId - the current job run id in which the crawler is running.
hashCode - a string that reflects changes in the record content. This can be as simple as a version identifier if such is available in record metadata, or even a hash calculated on the actual content of the record.
Returns:
an appropriate State value.
Throws:
DeltaException
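
The call sequence described above can be sketched as follows. This is only an illustration: InMemoryDelta is a hypothetical in-memory stand-in that mirrors the two method signatures, not the ObjectStore based implementation, and the State values NEW and CHANGED are assumptions (only State.UPTODATE is named on this page).

```java
import java.util.HashMap;
import java.util.Map;

class CheckStateSketch {
  /** Stand-in for State. Only UPTODATE appears in this documentation; NEW and CHANGED are assumed. */
  enum State { NEW, CHANGED, UPTODATE }

  /** Hypothetical in-memory stand-in for a DeltaService, for illustration only. */
  static final class InMemoryDelta {
    // key: sourceId + "/" + recordId, value: { hashCode, jobRunId of last visit }
    private final Map<String, String[]> entries = new HashMap<>();

    State checkState(String sourceId, String recordId, String jobRunId, String hashCode) {
      String[] entry = entries.get(sourceId + "/" + recordId);
      if (entry == null) {
        return State.NEW;
      }
      if (!entry[0].equals(hashCode)) {
        return State.CHANGED;
      }
      entry[1] = jobRunId; // up to date: the record is marked as visited right away
      return State.UPTODATE;
    }

    void markAsUpdated(String sourceId, String recordId, String jobRunId, String hashCode) {
      entries.put(sourceId + "/" + recordId, new String[] { hashCode, jobRunId });
    }
  }

  /** Returns true if the record had to be submitted for processing. */
  static boolean crawlRecord(InMemoryDelta delta, String sourceId, String recordId,
      String jobRunId, String hashCode) {
    State state = delta.checkState(sourceId, recordId, jobRunId, hashCode);
    if (state == State.UPTODATE) {
      return false; // nothing to do, checkState already marked the record as visited
    }
    // A real crawler would submit the record to the processing job here,
    // and call markAsUpdated only if that submission actually happens.
    delta.markAsUpdated(sourceId, recordId, jobRunId, hashCode);
    return true;
  }

  public static void main(String[] args) {
    InMemoryDelta delta = new InMemoryDelta();
    System.out.println(crawlRecord(delta, "files", "a.txt", "run-1", "v1")); // true (new record)
    System.out.println(crawlRecord(delta, "files", "a.txt", "run-2", "v1")); // false (unchanged)
    System.out.println(crawlRecord(delta, "files", "a.txt", "run-3", "v2")); // true (changed)
  }
}
```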

checkState

public State checkState(java.lang.String sourceId,
                        java.lang.String recordId,
                        java.lang.String compoundRecordId,
                        java.lang.String jobRunId,
                        java.lang.String hashCode)
                 throws DeltaException
Description copied from interface: DeltaService
Determine delta state of record identified by sourceId and recordId. If the result is State.UPTODATE, the service has already marked the record as visited in the current crawl job run, so there is no need to call DeltaService.markAsUpdated(String, String, String, String) afterwards. In all other cases the crawler should call DeltaService.markAsUpdated(String, String, String, String) only if the record is actually submitted to a processing job.

Specified by:
checkState in interface DeltaService
Parameters:
sourceId - the name of the data source that contains the record.
recordId - the record id
compoundRecordId - the record id of the compound this record was extracted from. May be null.
jobRunId - the current job run id in which the crawler is running.
hashCode - a string that reflects changes in the record content. This can be as simple as a version identifier if such is available in record metadata, or even a hash calculated on the actual content of the record.
Returns:
an appropriate State value.
Throws:
DeltaException

markCompoundElementsVisited

public void markCompoundElementsVisited(java.lang.String sourceId,
                                        java.lang.String compoundRecordId,
                                        java.lang.String jobRunId)
                                 throws DeltaException
Description copied from interface: DeltaService
Set jobRunId of all elements of the given compound record, because the compound itself has not changed.

Specified by:
markCompoundElementsVisited in interface DeltaService
Throws:
DeltaException
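
The effect of this method can be illustrated with a small sketch. InMemoryDelta, Entry and the unvisited helper are hypothetical stand-ins invented for this example; the real service stores its state in the ObjectStore and may organize entries quite differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CompoundVisitSketch {
  /** Hypothetical entry layout, for illustration only. */
  static final class Entry {
    final String compoundRecordId; // null for records that are not part of a compound
    final String hashCode;
    String jobRunId;

    Entry(String compoundRecordId, String hashCode, String jobRunId) {
      this.compoundRecordId = compoundRecordId;
      this.hashCode = hashCode;
      this.jobRunId = jobRunId;
    }
  }

  /** Hypothetical in-memory stand-in for a DeltaService. */
  static final class InMemoryDelta {
    private final Map<String, Entry> entries = new HashMap<>();

    void markAsUpdated(String sourceId, String recordId, String compoundRecordId,
        String jobRunId, String hashCode) {
      entries.put(sourceId + "/" + recordId, new Entry(compoundRecordId, hashCode, jobRunId));
    }

    /** Sets the job run id on every element of the given compound. */
    void markCompoundElementsVisited(String sourceId, String compoundRecordId, String jobRunId) {
      for (Map.Entry<String, Entry> e : entries.entrySet()) {
        if (e.getKey().startsWith(sourceId + "/")
            && compoundRecordId.equals(e.getValue().compoundRecordId)) {
          e.getValue().jobRunId = jobRunId;
        }
      }
    }

    /** Helper for this sketch: sorted keys of entries not visited in the given job run. */
    List<String> unvisited(String jobRunId) {
      List<String> result = new ArrayList<>();
      for (Map.Entry<String, Entry> e : entries.entrySet()) {
        if (!jobRunId.equals(e.getValue().jobRunId)) {
          result.add(e.getKey());
        }
      }
      Collections.sort(result);
      return result;
    }
  }

  public static void main(String[] args) {
    InMemoryDelta delta = new InMemoryDelta();
    delta.markAsUpdated("files", "docs.zip/a.txt", "docs.zip", "run-1", "h1");
    delta.markAsUpdated("files", "docs.zip/b.txt", "docs.zip", "run-1", "h2");
    delta.markAsUpdated("files", "plain.txt", null, "run-1", "h3");

    // In run-2 the compound docs.zip is unchanged, so its elements are not
    // extracted again; they are marked as visited in one call instead.
    delta.markCompoundElementsVisited("files", "docs.zip", "run-2");
    System.out.println(delta.unvisited("run-2")); // [files/plain.txt]
  }
}
```

Without the call, the two elements of docs.zip would look unvisited in run-2 even though the compound still exists.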

markAsUpdated

public void markAsUpdated(java.lang.String sourceId,
                          java.lang.String recordId,
                          java.lang.String jobRunId,
                          java.lang.String hashCode)
                   throws DeltaException
Description copied from interface: DeltaService
Mark the record as visited in the current crawl job run.

Specified by:
markAsUpdated in interface DeltaService
Parameters:
sourceId - the name of the data source that contains the record.
recordId - the record id
jobRunId - the current job run id in which the crawler is running.
hashCode - a string that reflects changes in the record content. This can be as simple as a version identifier if such is available in record metadata, or even a hash calculated on the actual content of the record.
Throws:
DeltaException
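
Where no version identifier is available, the hashCode value can be derived from the record content. The following is a minimal sketch using the JDK's MessageDigest; the class name, the method name contentHash and the choice of SHA-256 are assumptions made for this example and are not prescribed by the service.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class ContentHashSketch {
  /** Returns a lowercase hex SHA-256 digest of the content, usable as a hashCode string. */
  static String contentHash(byte[] content) {
    try {
      MessageDigest digest = MessageDigest.getInstance("SHA-256");
      byte[] hash = digest.digest(content);
      StringBuilder hex = new StringBuilder(hash.length * 2);
      for (byte b : hash) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      // SHA-256 is required to be present on every Java platform implementation.
      throw new IllegalStateException("SHA-256 not available", e);
    }
  }

  public static void main(String[] args) {
    String v1 = contentHash("first version".getBytes(StandardCharsets.UTF_8));
    String v2 = contentHash("second version".getBytes(StandardCharsets.UTF_8));
    System.out.println(v1.length());   // 64 hex characters
    System.out.println(v1.equals(v2)); // false: changed content gives a different string
  }
}
```

Any string that changes whenever the record changes will do; a content hash is simply the fallback when the source offers nothing cheaper, such as a version number or modification timestamp.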

markAsUpdated

public void markAsUpdated(java.lang.String sourceId,
                          java.lang.String recordId,
                          java.lang.String compoundRecordId,
                          java.lang.String jobRunId,
                          java.lang.String hashCode)
                   throws DeltaException
Description copied from interface: DeltaService
Mark the record that was extracted from a compound as visited in the current crawl job run.

Specified by:
markAsUpdated in interface DeltaService
Parameters:
sourceId - the name of the data source that contains the record.
recordId - the record id
compoundRecordId - the record id of the compound this record was extracted from. May be null.
jobRunId - the current job run id in which the crawler is running.
hashCode - a string that reflects changes in the record content. This can be as simple as a version identifier if such is available in record metadata, or even a hash calculated on the actual content of the record.
Throws:
DeltaException

clearSource

public void clearSource(java.lang.String sourceId)
                 throws DeltaException
Description copied from interface: DeltaService
delete all state information in the service about the given data source.

Specified by:
clearSource in interface DeltaService
Parameters:
sourceId - data source name.
Throws:
DeltaException

clearAll

public void clearAll()
              throws DeltaException
Description copied from interface: DeltaService
delete all state information in the service about all data sources.

Specified by:
clearAll in interface DeltaService
Throws:
DeltaException

getSourceIds

public java.util.Collection<java.lang.String> getSourceIds()
                                                    throws DeltaException
Description copied from interface: DeltaService
get Ids of all sources that currently have entries in the DeltaService.

Specified by:
getSourceIds in interface DeltaService
Throws:
DeltaException

countEntries

public long countEntries(java.lang.String sourceId,
                         boolean countExact)
                  throws DeltaException
Specified by:
countEntries in interface DeltaService
Parameters:
sourceId - the name of the data source to examine
countExact - set to true to get an exact result, which may take some time. Otherwise the service may return only an estimated value.
Returns:
number of entries for given source id.
Throws:
DeltaException

getShardPrefixes

public java.util.Collection<java.lang.String> getShardPrefixes(java.lang.String sourceId)
                                                        throws DeltaException
Description copied from interface: DeltaService
get possible input values for DeltaService.getUnvisitedEntries(String, String). This makes it possible to parallelize and distribute the check for records to delete.

Specified by:
getShardPrefixes in interface DeltaService
Parameters:
sourceId - the name of the data source to examine.
Throws:
DeltaException

getUnvisitedEntries

public java.util.Collection<DeltaService.EntryId> getUnvisitedEntries(java.lang.String sourceAndShardPrefix,
                                                                      java.lang.String jobRunId)
                                                               throws DeltaException
Description copied from interface: DeltaService
get the record IDs in the given data source and shard that have not been visited in the given job run and therefore must be sent as deleted records to the target job. To get all unvisited records in the source the caller must iterate over the result of DeltaService.getShardPrefixes(String) and call this method with each of the shard-prefix values.

Specified by:
getUnvisitedEntries in interface DeltaService
Parameters:
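sourceAndShardPrefix - one of the values returned by DeltaService.getShardPrefixes(String)
jobRunId - the job run id to check against; entries not visited in this job run are returned.
Returns:
the IDs of the entries in the given shard that have not been visited in the given job run.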
Throws:
DeltaException
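
The iteration over shard prefixes described above can be sketched as follows. InMemoryDelta is a hypothetical stand-in written for this example: it uses plain strings in place of DeltaService.EntryId and shards entries by their first character, neither of which reflects the actual ObjectStore layout.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

class UnvisitedEntriesSketch {
  /** Hypothetical in-memory stand-in for a DeltaService, for illustration only. */
  static final class InMemoryDelta {
    // shard prefix -> (entry id -> job run id of last visit)
    private final Map<String, Map<String, String>> shards = new HashMap<>();

    private static String shardOf(String sourceId, String entryId) {
      return sourceId + "/" + entryId.charAt(0); // assumed sharding scheme
    }

    void visit(String sourceId, String entryId, String jobRunId) {
      shards.computeIfAbsent(shardOf(sourceId, entryId), k -> new HashMap<>())
          .put(entryId, jobRunId);
    }

    Collection<String> getShardPrefixes(String sourceId) {
      List<String> result = new ArrayList<>();
      for (String shard : shards.keySet()) {
        if (shard.startsWith(sourceId + "/")) {
          result.add(shard);
        }
      }
      return result;
    }

    Collection<String> getUnvisitedEntries(String sourceAndShardPrefix, String jobRunId) {
      List<String> result = new ArrayList<>();
      Map<String, String> shard =
          shards.getOrDefault(sourceAndShardPrefix, Collections.<String, String>emptyMap());
      for (Map.Entry<String, String> e : shard.entrySet()) {
        if (!jobRunId.equals(e.getValue())) {
          result.add(e.getKey());
        }
      }
      return result;
    }

    void deleteEntry(String sourceId, String entryId) {
      Map<String, String> shard = shards.get(shardOf(sourceId, entryId));
      if (shard != null) {
        shard.remove(entryId);
      }
    }
  }

  /** Collects all unvisited entries of a source, shard by shard, then removes them. */
  static Collection<String> collectDeleted(InMemoryDelta delta, String sourceId, String jobRunId) {
    Collection<String> deleted = new TreeSet<>();
    for (String prefix : delta.getShardPrefixes(sourceId)) {
      deleted.addAll(delta.getUnvisitedEntries(prefix, jobRunId));
    }
    for (String entryId : deleted) {
      // A real crawler would first send a delete record to the target job.
      delta.deleteEntry(sourceId, entryId);
    }
    return deleted;
  }

  public static void main(String[] args) {
    InMemoryDelta delta = new InMemoryDelta();
    delta.visit("files", "a.txt", "run-1");
    delta.visit("files", "b.txt", "run-1");
    delta.visit("files", "a.txt", "run-2"); // b.txt is not seen again in run-2
    System.out.println(collectDeleted(delta, "files", "run-2")); // [b.txt]
    System.out.println(collectDeleted(delta, "files", "run-2")); // []
  }
}
```

Because each shard prefix can be checked independently, the outer loop is the natural place to distribute the work across several workers.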

deleteEntry

public void deleteEntry(java.lang.String sourceId,
                        DeltaService.EntryId entryId)
                 throws DeltaException
Description copied from interface: DeltaService
remove an entry, e.g. after it has been deleted.

Specified by:
deleteEntry in interface DeltaService
Parameters:
sourceId - data source Id
entryId - ID of the entry, e.g. as returned by DeltaService.getUnvisitedEntries(String, String)
Throws:
DeltaException

setObjectStore

public void setObjectStore(ObjectStoreService objectStore)
used by DS to set service reference.


unsetObjectStore

public void unsetObjectStore(ObjectStoreService objectStore)
used by DS to remove service reference.

