SMILA 1.0 API documentation

org.eclipse.smila.importing
Interface DeltaService

All Known Implementing Classes:
ObjectStoreDeltaService

public interface DeltaService

Service interface for checking if a crawled record must be sent to the processing job.


Method Summary
 State checkState(java.lang.String sourceId, java.lang.String recordId, java.lang.String jobRunId, java.lang.String hashCode)
          Determine the delta state of the record identified by sourceId and recordId.
 void clearAll()
          Delete all state information in the service about all data sources.
 void clearSource(java.lang.String sourceId)
          Delete all state information in the service about the given data source.
 long countEntries(java.lang.String sourceId, boolean countExact)
          Count the entries stored in the service for the given data source.
 java.util.Collection<java.lang.String> getSourceIds()
          Get the IDs of all sources that currently have entries in the DeltaService.
 void markAsUpdated(java.lang.String sourceId, java.lang.String recordId, java.lang.String jobRunId, java.lang.String hashCode)
          Mark the record as visited in the current crawl job run.
 

Method Detail

checkState

State checkState(java.lang.String sourceId,
                 java.lang.String recordId,
                 java.lang.String jobRunId,
                 java.lang.String hashCode)
                 throws DeltaException
Determine the delta state of the record identified by sourceId and recordId. If the result is State.UPTODATE, the service has already marked the record as visited in the current crawl job run, so there is no need to call markAsUpdated(String, String, String, String) afterwards. In all other cases, the crawler should call markAsUpdated(String, String, String, String) only if the record is actually submitted to a processing job.

Parameters:
sourceId - the name of the data source that contains the record.
recordId - the record id
jobRunId - the current job run id in which the crawler is running.
hashCode - a string that reflects changes in the record content. This can be as simple as a version identifier, if one is available in the record metadata, or a hash calculated on the actual content of the record.
Returns:
an appropriate State value.
Throws:
DeltaException
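
The calling convention described above can be sketched as follows. This is a minimal illustration, not SMILA code: the nested DeltaService, State and DeltaException types are reduced stand-ins so that the example compiles on its own, InMemoryDeltaService and crawlRecord are hypothetical names, and the State values NEW and CHANGED are assumptions (only UPTODATE is named on this page).

```java
import java.util.HashMap;
import java.util.Map;

public class DeltaCheckSketch {

  /** Stand-in for State. Only UPTODATE is named on this page; NEW and CHANGED are assumed. */
  enum State { NEW, CHANGED, UPTODATE }

  /** Stand-in for DeltaException. */
  static class DeltaException extends Exception {
    DeltaException(String message) { super(message); }
  }

  /** Reduced stand-in containing only the two methods used by the sketch. */
  interface DeltaService {
    State checkState(String sourceId, String recordId, String jobRunId, String hashCode)
        throws DeltaException;
    void markAsUpdated(String sourceId, String recordId, String jobRunId, String hashCode)
        throws DeltaException;
  }

  /** Hypothetical in-memory implementation, for illustration only. */
  static class InMemoryDeltaService implements DeltaService {
    // key: sourceId + "/" + recordId, value: { hashCode, jobRunId of the last visit }
    private final Map<String, String[]> entries = new HashMap<>();

    public State checkState(String sourceId, String recordId, String jobRunId, String hashCode) {
      String[] entry = entries.get(sourceId + "/" + recordId);
      if (entry == null) {
        return State.NEW;
      }
      if (!entry[0].equals(hashCode)) {
        return State.CHANGED;
      }
      entry[1] = jobRunId; // up to date: the check itself marks the record as visited
      return State.UPTODATE;
    }

    public void markAsUpdated(String sourceId, String recordId, String jobRunId, String hashCode) {
      entries.put(sourceId + "/" + recordId, new String[] { hashCode, jobRunId });
    }
  }

  /** Crawler-side logic: returns true if the record was submitted for processing. */
  static boolean crawlRecord(DeltaService delta, String sourceId, String recordId,
      String jobRunId, String hashCode) throws DeltaException {
    State state = delta.checkState(sourceId, recordId, jobRunId, hashCode);
    if (state == State.UPTODATE) {
      return false; // nothing to do, checkState already marked the record
    }
    // ... submit the record to the processing job here (omitted) ...
    delta.markAsUpdated(sourceId, recordId, jobRunId, hashCode);
    return true;
  }

  public static void main(String[] args) throws DeltaException {
    DeltaService delta = new InMemoryDeltaService();
    System.out.println(crawlRecord(delta, "docs", "a.txt", "run-1", "v1")); // true (new record)
    System.out.println(crawlRecord(delta, "docs", "a.txt", "run-2", "v1")); // false (unchanged)
    System.out.println(crawlRecord(delta, "docs", "a.txt", "run-3", "v2")); // true (changed hash)
  }
}
```

In this sketch markAsUpdated is called only after the submission step, which mirrors the advice that a record should not be marked unless it was actually handed to a processing job.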

markAsUpdated

void markAsUpdated(java.lang.String sourceId,
                   java.lang.String recordId,
                   java.lang.String jobRunId,
                   java.lang.String hashCode)
                   throws DeltaException
Mark the record as visited in the current crawl job run.

Parameters:
sourceId - the name of the data source that contains the record.
recordId - the record id
jobRunId - the current job run id in which the crawler is running.
hashCode - a string that reflects changes in the record content. This can be as simple as a version identifier, if one is available in the record metadata, or a hash calculated on the actual content of the record.
Throws:
DeltaException
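
When no version identifier is available, the hashCode argument can be derived from the record content. The following is one possible sketch using the JDK's MessageDigest with SHA-256; the class and method names are hypothetical, and the choice of SHA-256 is an assumption, since the interface only requires a string that changes whenever the content changes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class ContentHashSketch {

  /** Returns the SHA-256 digest of the content as a lowercase hex string. */
  static String contentHash(byte[] content) {
    try {
      MessageDigest digest = MessageDigest.getInstance("SHA-256");
      byte[] hash = digest.digest(content);
      StringBuilder hex = new StringBuilder(hash.length * 2);
      for (byte b : hash) {
        // mask to an unsigned value so every byte yields exactly two hex digits
        hex.append(String.format("%02x", b & 0xff));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      // SHA-256 is required on every Java platform, so this is not expected
      throw new IllegalStateException("SHA-256 not available", e);
    }
  }

  public static void main(String[] args) {
    String first = contentHash("hello".getBytes(StandardCharsets.UTF_8));
    String same = contentHash("hello".getBytes(StandardCharsets.UTF_8));
    String changed = contentHash("hello!".getBytes(StandardCharsets.UTF_8));
    System.out.println(first.equals(same));    // true: identical content, identical hash
    System.out.println(first.equals(changed)); // false: content changed
  }
}
```

The resulting string would then be passed as hashCode to both checkState and markAsUpdated, so that the same value is compared and stored.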

clearSource

void clearSource(java.lang.String sourceId)
                 throws DeltaException
Delete all state information in the service about the given data source.

Parameters:
sourceId - data source name.
Throws:
DeltaException

clearAll

void clearAll()
              throws DeltaException
Delete all state information in the service about all data sources.

Throws:
DeltaException

getSourceIds

java.util.Collection<java.lang.String> getSourceIds()
                                                    throws DeltaException
Get the IDs of all sources that currently have entries in the DeltaService.

Throws:
DeltaException

countEntries

long countEntries(java.lang.String sourceId,
                  boolean countExact)
                  throws DeltaException
Count the entries stored in the service for the given data source.

Parameters:
sourceId - data source name.
countExact - set to true to get an exact result, which may take some time. Otherwise the service may return only an estimated value.
Returns:
number of entries for given source id.
Throws:
DeltaException
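
As an illustration of how getSourceIds, countEntries and clearSource might be combined for housekeeping, the sketch below collects an entry count per source. It is not SMILA code: the nested DeltaService is a reduced stand-in that omits the DeltaException declarations for brevity, and FixedDeltaService and countAll are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

public class DeltaStatsSketch {

  /** Reduced stand-in; the documented methods additionally declare DeltaException. */
  interface DeltaService {
    Collection<String> getSourceIds();
    long countEntries(String sourceId, boolean countExact);
    void clearSource(String sourceId);
  }

  /** Hypothetical stub with fixed counts, for illustration only. */
  static class FixedDeltaService implements DeltaService {
    private final Map<String, Long> counts = new LinkedHashMap<>();

    FixedDeltaService put(String sourceId, long count) {
      counts.put(sourceId, count);
      return this;
    }

    public Collection<String> getSourceIds() {
      return new ArrayList<>(counts.keySet()); // copy, so callers may clear while iterating
    }

    public long countEntries(String sourceId, boolean countExact) {
      Long count = counts.get(sourceId);
      return count == null ? 0L : count; // this stub ignores countExact
    }

    public void clearSource(String sourceId) {
      counts.remove(sourceId);
    }
  }

  /** Collects the entry count of every source known to the service. */
  static Map<String, Long> countAll(DeltaService delta, boolean countExact) {
    Map<String, Long> result = new LinkedHashMap<>();
    for (String sourceId : delta.getSourceIds()) {
      result.put(sourceId, delta.countEntries(sourceId, countExact));
    }
    return result;
  }

  public static void main(String[] args) {
    DeltaService delta = new FixedDeltaService().put("web", 120L).put("files", 45L);
    System.out.println(countAll(delta, false)); // {web=120, files=45}
    delta.clearSource("web");
    System.out.println(countAll(delta, true));  // {files=45}
  }
}
```

Passing false for countExact is the cheaper choice when an estimate is good enough, for example for a status overview.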
