public interface DeltaService
| Modifier and Type | Interface and Description |
|---|---|
| static class | DeltaService.EntryId - returned by getUnvisitedEntries(String, String). |

| Modifier and Type | Method and Description |
|---|---|
| State | checkState(java.lang.String sourceId, java.lang.String recordId, java.lang.String jobRunId, java.lang.String hashCode) - Determine delta state of record identified by sourceId and recordId. |
| State | checkState(java.lang.String sourceId, java.lang.String recordId, java.lang.String compoundRecordId, java.lang.String jobRunId, java.lang.String hashCode) - Determine delta state of record identified by sourceId and recordId. |
| void | clearAll() - Delete all state information in the service about all data sources. |
| void | clearSource(java.lang.String sourceId) - Delete all state information in the service about the given data source. |
| long | countEntries(java.lang.String sourceId, boolean countExact) |
| void | deleteEntry(java.lang.String sourceId, DeltaService.EntryId entryId) - Remove an entry, e.g. after the record has been deleted. |
| java.util.Collection<java.lang.String> | getShardPrefixes(java.lang.String sourceId) - Get possible input values for getUnvisitedEntries(String, String). |
| java.util.Collection<java.lang.String> | getSourceIds() - Get IDs of all sources that currently have entries in the DeltaService. |
| java.util.Collection<DeltaService.EntryId> | getUnvisitedEntries(java.lang.String sourceAndShardPrefix, java.lang.String jobRunId) - Get the record IDs in the given data source and shard that have not been visited in the given job run and therefore must be sent as deleted records to the target job. |
| void | markAsUpdated(java.lang.String sourceId, java.lang.String recordId, java.lang.String jobRunId, java.lang.String hashCode) - Mark the record as visited in the current crawl job run. |
| void | markAsUpdated(java.lang.String sourceId, java.lang.String recordId, java.lang.String compoundRecordId, java.lang.String jobRunId, java.lang.String hashCode) - Mark the record that was extracted from a compound as visited in the current crawl job run. |
| void | markCompoundElementsVisited(java.lang.String sourceId, java.lang.String compoundRecordId, java.lang.String jobRunId) - Set jobRunId of all elements of the given compound record, because the compound itself has not changed. |

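The checkState and markAsUpdated contract documented on this page can be sketched as a small crawl step. This is only an illustration under stated assumptions, not the service's implementation: MiniDeltaService, InMemoryDelta and DeltaCrawlSketch are hypothetical names, the State values NEW and CHANGED are assumed (only UPTODATE is named on this page), and DeltaException is left out so the sketch compiles on its own.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for State; only UPTODATE is named in the documentation. */
enum State { NEW, CHANGED, UPTODATE }

/** Reduced, hypothetical mirror of two DeltaService methods (DeltaException omitted). */
interface MiniDeltaService {
  State checkState(String sourceId, String recordId, String jobRunId, String hashCode);
  void markAsUpdated(String sourceId, String recordId, String jobRunId, String hashCode);
}

/** Toy in-memory implementation, for illustration only. */
class InMemoryDelta implements MiniDeltaService {
  // key: sourceId + "/" + recordId, value: { hashCode, jobRunId of the last visit }
  final Map<String, String[]> entries = new HashMap<>();

  public State checkState(String sourceId, String recordId, String jobRunId, String hashCode) {
    String[] entry = entries.get(sourceId + "/" + recordId);
    if (entry == null) {
      return State.NEW;
    }
    if (!entry[0].equals(hashCode)) {
      return State.CHANGED;
    }
    entry[1] = jobRunId; // up to date: the visit is recorded as a side effect
    return State.UPTODATE;
  }

  public void markAsUpdated(String sourceId, String recordId, String jobRunId, String hashCode) {
    entries.put(sourceId + "/" + recordId, new String[] {hashCode, jobRunId});
  }
}

class DeltaCrawlSketch {
  /** Returns true if the record was submitted for processing. */
  static boolean crawlRecord(MiniDeltaService delta, String sourceId, String recordId,
      String jobRunId, String hashCode) {
    State state = delta.checkState(sourceId, recordId, jobRunId, hashCode);
    if (state == State.UPTODATE) {
      return false; // checkState already marked the record as visited
    }
    // ... submit the record to the processing job here ...
    delta.markAsUpdated(sourceId, recordId, jobRunId, hashCode);
    return true;
  }

  public static void main(String[] args) {
    InMemoryDelta delta = new InMemoryDelta();
    System.out.println(crawlRecord(delta, "src", "doc1", "run1", "v1")); // true: new record
    System.out.println(crawlRecord(delta, "src", "doc1", "run2", "v1")); // false: unchanged
    System.out.println(crawlRecord(delta, "src", "doc1", "run3", "v2")); // true: changed
  }
}
```

In this sketch markAsUpdated is called only after the record has been handed on, which mirrors the advice to call it only if the record is actually submitted to a processing job.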
State checkState(java.lang.String sourceId, java.lang.String recordId, java.lang.String jobRunId, java.lang.String hashCode) throws DeltaException
Determine delta state of record identified by sourceId and recordId. If the result is State.UPTODATE, the service also marks the record as visited in the current crawl job run, so there is no need to call markAsUpdated(String, String, String, String) afterwards. In the other cases the crawler should call markAsUpdated(String, String, String, String) only if the record is actually submitted to a processing job.
Parameters:
sourceId - the name of the data source that contains the record.
recordId - the record ID.
jobRunId - the ID of the current job run in which the crawler is running.
hashCode - a string that reflects changes in the record content. This can be as simple as a version identifier, if one is available in the record metadata, or a hash calculated on the actual content of the record.
Returns: State value.
Throws: DeltaException

State checkState(java.lang.String sourceId, java.lang.String recordId, java.lang.String compoundRecordId, java.lang.String jobRunId, java.lang.String hashCode) throws DeltaException
Determine delta state of record identified by sourceId and recordId. If the result is State.UPTODATE, the service also marks the record as visited in the current crawl job run, so there is no need to call markAsUpdated(String, String, String, String) afterwards. In the other cases the crawler should call markAsUpdated(String, String, String, String) only if the record is actually submitted to a processing job.
Parameters:
sourceId - the name of the data source that contains the record.
recordId - the record ID.
compoundRecordId - the record ID of the compound this record was extracted from. May be null.
jobRunId - the ID of the current job run in which the crawler is running.
hashCode - a string that reflects changes in the record content. This can be as simple as a version identifier, if one is available in the record metadata, or a hash calculated on the actual content of the record.
Returns: State value.
Throws: DeltaException

void markCompoundElementsVisited(java.lang.String sourceId, java.lang.String compoundRecordId, java.lang.String jobRunId) throws DeltaException
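Set jobRunId of all elements of the given compound record, because the compound itself has not changed.
Parameters:
sourceId
compoundRecordId
jobRunId
Throws: DeltaException

void markAsUpdated(java.lang.String sourceId, java.lang.String recordId, java.lang.String jobRunId, java.lang.String hashCode) throws DeltaException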
Mark the record as visited in the current crawl job run.
Parameters:
sourceId - the name of the data source that contains the record.
recordId - the record ID.
jobRunId - the ID of the current job run in which the crawler is running.
hashCode - a string that reflects changes in the record content. This can be as simple as a version identifier, if one is available in the record metadata, or a hash calculated on the actual content of the record.
Throws: DeltaException

void markAsUpdated(java.lang.String sourceId, java.lang.String recordId, java.lang.String compoundRecordId, java.lang.String jobRunId, java.lang.String hashCode) throws DeltaException
Mark the record that was extracted from a compound as visited in the current crawl job run.
Parameters:
sourceId - the name of the data source that contains the record.
recordId - the record ID.
compoundRecordId - the record ID of the compound this record was extracted from. May be null.
jobRunId - the ID of the current job run in which the crawler is running.
hashCode - a string that reflects changes in the record content. This can be as simple as a version identifier, if one is available in the record metadata, or a hash calculated on the actual content of the record.
Throws: DeltaException

void clearSource(java.lang.String sourceId) throws DeltaException
Delete all state information in the service about the given data source.
Parameters:
sourceId - data source name.
Throws: DeltaException

void clearAll() throws DeltaException
Delete all state information in the service about all data sources.
Throws: DeltaException

java.util.Collection<java.lang.String> getSourceIds() throws DeltaException
Get IDs of all sources that currently have entries in the DeltaService.
Throws: DeltaException

long countEntries(java.lang.String sourceId, boolean countExact) throws DeltaException
Parameters:
sourceId - the name of the data source to examine.
countExact - set to true to get an exact result, which may take some time. Otherwise the service may return only an estimated value.
Throws: DeltaException

java.util.Collection<java.lang.String> getShardPrefixes(java.lang.String sourceId) throws DeltaException
Get possible input values for getUnvisitedEntries(String, String). This makes it possible to parallelize and distribute the check for records to delete.
Parameters:
sourceId - the name of the data source to examine.
Throws: DeltaException

java.util.Collection<DeltaService.EntryId> getUnvisitedEntries(java.lang.String sourceAndShardPrefix, java.lang.String jobRunId) throws DeltaException
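Get the record IDs in the given data source and shard that have not been visited in the given job run and therefore must be sent as deleted records to the target job. To cover a whole data source, call getShardPrefixes(String) and call this method with each of the shard-prefix values.
Parameters:
sourceAndShardPrefix - one of the values returned by getShardPrefixes(String).
Throws: DeltaException

void deleteEntry(java.lang.String sourceId, DeltaService.EntryId entryId) throws DeltaException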
Remove an entry, e.g. after the record has been deleted.
Parameters:
sourceId - data source ID.
entryId - ID of the entry, e.g. as returned by getUnvisitedEntries(String, String).
Throws: DeltaException
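
The deletion pass that getShardPrefixes, getUnvisitedEntries and deleteEntry are meant for could look roughly like the sketch below. Everything except the three method names is an assumption made for illustration: SketchEntryId stands in for DeltaService.EntryId (whose members are not shown on this page), DeletionView and ToyDeletionStore are hypothetical, the shard-prefix format is invented, and DeltaException is omitted.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical stand-in for DeltaService.EntryId. */
final class SketchEntryId {
  final String recordId;
  SketchEntryId(String recordId) { this.recordId = recordId; }
}

/** Reduced, hypothetical mirror of the three methods used for deletion. */
interface DeletionView {
  Collection<String> getShardPrefixes(String sourceId);
  Collection<SketchEntryId> getUnvisitedEntries(String sourceAndShardPrefix, String jobRunId);
  void deleteEntry(String sourceId, SketchEntryId entryId);
}

/** Toy single-source store; shards by the first character of the record id (invented scheme). */
class ToyDeletionStore implements DeletionView {
  final Map<String, String> lastVisit = new TreeMap<>(); // recordId -> jobRunId of last visit

  public Collection<String> getShardPrefixes(String sourceId) {
    Set<String> prefixes = new TreeSet<>();
    for (String recordId : lastVisit.keySet()) {
      prefixes.add(sourceId + ":" + recordId.charAt(0));
    }
    return prefixes;
  }

  public Collection<SketchEntryId> getUnvisitedEntries(String sourceAndShardPrefix, String jobRunId) {
    String shard = sourceAndShardPrefix.substring(sourceAndShardPrefix.length() - 1);
    List<SketchEntryId> result = new ArrayList<>();
    for (Map.Entry<String, String> e : lastVisit.entrySet()) {
      if (e.getKey().startsWith(shard) && !e.getValue().equals(jobRunId)) {
        result.add(new SketchEntryId(e.getKey()));
      }
    }
    return result; // a copy, so deleting entries while iterating over it is safe
  }

  public void deleteEntry(String sourceId, SketchEntryId entryId) {
    lastVisit.remove(entryId.recordId);
  }
}

class DeleteSweepSketch {
  /** Removes all entries not visited in jobRunId and returns their record ids. */
  static List<String> sweep(DeletionView delta, String sourceId, String jobRunId) {
    List<String> deleted = new ArrayList<>();
    for (String prefix : delta.getShardPrefixes(sourceId)) {
      // each prefix is independent, so it could be handed to a separate worker
      for (SketchEntryId id : delta.getUnvisitedEntries(prefix, jobRunId)) {
        // ... send a delete record for id.recordId to the target job here ...
        delta.deleteEntry(sourceId, id);
        deleted.add(id.recordId);
      }
    }
    return deleted;
  }

  public static void main(String[] args) {
    ToyDeletionStore store = new ToyDeletionStore();
    store.lastVisit.put("a1", "run2"); // visited in the current run, must survive
    store.lastVisit.put("a2", "run1");
    store.lastVisit.put("b1", "run1");
    System.out.println(sweep(store, "src", "run2")); // prints [a2, b1]
  }
}
```

The outer loop over shard prefixes is the unit that could be parallelized or distributed; the sketch runs it sequentially to stay short.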