SMILA 1.0 API documentation
java.lang.Object
  org.eclipse.smila.importing.compounds.ExtractorWorkerBase

public abstract class ExtractorWorkerBase
extends java.lang.Object
implements Worker
Base implementation for workers that perform compound extraction. Subclasses must provide a ContentFetcher
implementation and a method that converts the records produced by the CompoundExtractor into records that are
compatible with the associated crawler worker.
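The division of labour between the base class and its subclasses can be illustrated with a minimal, self-contained sketch. It does not use the SMILA types: `SketchRecord`, `SketchExtractorBase` and `ArchiveSketch` are hypothetical stand-ins, and the order of steps in `process` (extract, convert, filter) is an assumption about how such a worker might be structured, not a description of what `perform` actually does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a record; NOT the SMILA Record type.
final class SketchRecord {
  final String path;
  final List<String> entries; // names of contained elements, empty for plain files

  SketchRecord(String path, List<String> entries) {
    this.path = path;
    this.entries = entries;
  }
}

// Hypothetical base class that mirrors the shape of the abstract methods listed on this page.
abstract class SketchExtractorBase {
  // Assumed order of steps: extract, convert, filter. The real perform() may differ.
  final List<SketchRecord> process(SketchRecord compound) {
    List<SketchRecord> result = new ArrayList<>();
    Iterator<SketchRecord> extracted = invokeExtractor(compound);
    while (extracted.hasNext()) {
      SketchRecord converted = convertRecord(compound, extracted.next());
      if (filterRecord(converted)) {
        result.add(converted);
      }
    }
    return result;
  }

  protected abstract Iterator<SketchRecord> invokeExtractor(SketchRecord compound);

  protected abstract SketchRecord convertRecord(SketchRecord compound, SketchRecord extracted);

  // Default: every record passes.
  protected boolean filterRecord(SketchRecord record) {
    return true;
  }
}

// Example subclass: treats the entry names of the compound as the extracted records.
final class ArchiveSketch extends SketchExtractorBase {
  @Override
  protected Iterator<SketchRecord> invokeExtractor(SketchRecord compound) {
    List<SketchRecord> extracted = new ArrayList<>();
    for (String name : compound.entries) {
      extracted.add(new SketchRecord(name, new ArrayList<String>()));
    }
    return extracted.iterator();
  }

  @Override
  protected SketchRecord convertRecord(SketchRecord compound, SketchRecord extracted) {
    // Prefix the entry name with the compound path, as a crawler-style identifier might.
    return new SketchRecord(compound.path + "/" + extracted.path, extracted.entries);
  }

  @Override
  protected boolean filterRecord(SketchRecord record) {
    // Illustrative rule only: drop temporary files.
    return !record.path.endsWith(".tmp");
  }

  public static void main(String[] args) {
    SketchRecord compound = new SketchRecord("archive.zip", Arrays.asList("a.txt", "b.tmp"));
    for (SketchRecord r : new ArchiveSketch().process(compound)) {
      System.out.println(r.path);
    }
  }
}
```

The base class owns the loop, while the subclass supplies only the data-source-specific pieces. This is the template method pattern that the abstract methods of ExtractorWorkerBase suggest.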
| Constructor Summary | |
|---|---|
| ExtractorWorkerBase() | |
| Method Summary | | |
|---|---|---|
| protected void | concatAttributeValues(Record sourceRecord, java.lang.String sourceAttribute, Record targetRecord, java.lang.String targetAttribute, java.lang.String separator) | Utility method for subclasses: concatenates a source attribute value to a target attribute string value. |
| protected abstract Record | convertRecord(Record compoundRecord, Record extractedRecord, TaskContext taskContext) | Creates a record from the extracted record that conforms to the records produced by the matching crawler. |
| protected void | copyAttachment(Record sourceRecord, Record targetRecord, java.lang.String attachmentName) | Utility method for subclasses: copies an attachment from sourceRecord to targetRecord, if it exists. |
| protected void | copyAttribute(Record sourceRecord, java.lang.String sourceAttribute, Record targetRecord, java.lang.String targetAttribute) | Utility method for subclasses: copies an attribute if it exists. |
| protected void | copyCompoundAttributes(Record compoundRecord, Record extractedRecord, Record convertedRecord) | Adds compound-related system attributes to the converted record. |
| protected void | copySetToStringAttribute(Record sourceRecord, java.lang.String sourceAttribute, Record targetRecord, java.lang.String targetAttribute, java.lang.String separator) | Utility method for subclasses: copies a set attribute to a plain string attribute. |
| protected boolean | filterRecord(Record record, TaskContext taskContext) | Filters extracted records. |
| protected abstract ContentFetcher | getContentFetcher() | Gets a content fetcher for the data source type. |
| protected abstract java.util.Iterator<Record> | invokeExtractor(CompoundExtractor extractor, Record compoundRecord, java.io.InputStream compoundContent, TaskContext taskContext) | Invokes the extractor with data from the crawled record. |
| protected void | mapRecord(Record record, TaskContext taskContext) | Hook for subclasses to support mapping of the converted record according to mapping rules. |
| void | perform(TaskContext taskContext) | Performs a computation on the data available in the TaskContext, such as a task for this worker, input and (if configured) output slots. |
| void | setCompoundExtractor(CompoundExtractor extractor) | DS service reference bind method. |
| void | unsetCompoundExtractor(CompoundExtractor extractor) | DS service reference unbind method. |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |

| Methods inherited from interface org.eclipse.smila.taskworker.Worker |
|---|
| getName |
| Constructor Detail |
|---|
public ExtractorWorkerBase()
| Method Detail |
|---|
public void perform(TaskContext taskContext)
throws java.lang.Exception
Description copied from interface: Worker
Performs a computation on the data available in the TaskContext, such as a task for this worker, input and
(if configured) output slots. Implementors must ensure that calls to this method are thread-safe.
Specified by:
perform in interface Worker
Parameters:
taskContext - the TaskContext information with which this operation can be performed.
Throws:
java.lang.Exception
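protected void mapRecord(Record record,
TaskContext taskContext)
Hook for subclasses to support mapping of the converted record according to mapping rules.
Parameters:
record - the Record
taskContext - the TaskContext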
protected boolean filterRecord(Record record,
TaskContext taskContext)
Filters extracted records.
Parameters:
record - the record to check
taskContext - the task context containing the task parameters
Returns:
true if the record passes the filter(s), false if not.
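The contract of filterRecord (return true to keep a record, false to drop it, based on the task parameters) can be sketched as follows. This is an illustration under stated assumptions, not SMILA code: records and task parameters are modelled as plain maps, and the `maxSize` parameter and `size` attribute are hypothetical names chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;

final class FilterSketch {
  // Returns true if the record passes the filter, false if it should be dropped.
  // "maxSize" and "size" are hypothetical names, not SMILA parameters or attributes.
  static boolean filterRecord(Map<String, Object> record, Map<String, String> taskParameters) {
    String maxSize = taskParameters.get("maxSize");
    if (maxSize == null) {
      return true; // no limit configured: keep everything
    }
    Object size = record.get("size");
    if (!(size instanceof Number)) {
      return true; // assumption: records without a numeric size are not filtered
    }
    // Long.parseLong throws NumberFormatException if the parameter is not a number.
    return ((Number) size).longValue() <= Long.parseLong(maxSize);
  }

  public static void main(String[] args) {
    Map<String, String> parameters = new HashMap<>();
    parameters.put("maxSize", "1000");
    Map<String, Object> record = new HashMap<>();
    record.put("size", 2048L);
    System.out.println(filterRecord(record, parameters));
  }
}
```

Keeping the decision in a pure function of the record and the parameters also fits the thread-safety requirement stated for perform, because no shared mutable state is involved.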
protected abstract java.util.Iterator<Record> invokeExtractor(CompoundExtractor extractor,
Record compoundRecord,
java.io.InputStream compoundContent,
TaskContext taskContext)
throws CompoundExtractorException
Invokes the extractor with data from the crawled record.
Throws:
CompoundExtractorException
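protected abstract Record convertRecord(Record compoundRecord,
Record extractedRecord,
TaskContext taskContext)
Creates a record from the extracted record that conforms to the records produced by the matching crawler.
protected abstract ContentFetcher getContentFetcher()
Gets a content fetcher for the data source type.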
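protected void copyAttachment(Record sourceRecord,
Record targetRecord,
java.lang.String attachmentName)
Utility method for subclasses: copies an attachment from sourceRecord to targetRecord, if it exists.
protected void copyAttribute(Record sourceRecord,
java.lang.String sourceAttribute,
Record targetRecord,
java.lang.String targetAttribute)
Utility method for subclasses: copies an attribute if it exists.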
protected void copySetToStringAttribute(Record sourceRecord,
java.lang.String sourceAttribute,
Record targetRecord,
java.lang.String targetAttribute,
java.lang.String separator)
Utility method for subclasses: copies a set attribute to a plain string attribute.
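The method summary describes copySetToStringAttribute as copying a set attribute to a plain string attribute. One plausible reading is that the set elements are joined with the separator. The sketch below shows that reading with plain Java collections. `SetToStringSketch` and its map-based records are hypothetical stand-ins, and the behaviour for a missing or non-collection source value is an assumption.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.StringJoiner;

final class SetToStringSketch {
  // Joins the elements of a collection-valued source attribute into one string
  // and stores it under the target attribute. Records are modelled as plain maps.
  static void copySetToString(Map<String, Object> source, String sourceAttribute,
      Map<String, Object> target, String targetAttribute, String separator) {
    Object value = source.get(sourceAttribute);
    if (!(value instanceof Collection)) {
      return; // assumption: nothing is copied if the source is missing or not a collection
    }
    StringJoiner joiner = new StringJoiner(separator);
    for (Object element : (Collection<?>) value) {
      joiner.add(String.valueOf(element));
    }
    target.put(targetAttribute, joiner.toString());
  }

  public static void main(String[] args) {
    Map<String, Object> source = new HashMap<>();
    // LinkedHashSet keeps insertion order, so the joined string is deterministic.
    source.put("languages", new LinkedHashSet<>(Arrays.asList("de", "en")));
    Map<String, Object> target = new HashMap<>();
    copySetToString(source, "languages", target, "languageList", ";");
    System.out.println(target.get("languageList"));
  }
}
```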
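protected void concatAttributeValues(Record sourceRecord,
java.lang.String sourceAttribute,
Record targetRecord,
java.lang.String targetAttribute,
java.lang.String separator)
Utility method for subclasses: concatenates a source attribute value to a target attribute string value.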
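According to the method summary, concatAttributeValues concatenates a source attribute value to a target attribute string value. The following sketch shows one way such an operation could behave, using plain maps instead of SMILA records. `ConcatSketch` is a hypothetical name, and the handling of an absent source or target value is an assumption made for the example.

```java
import java.util.HashMap;
import java.util.Map;

final class ConcatSketch {
  // Appends the source value to the existing target string, separated by `separator`.
  static void concat(Map<String, Object> source, String sourceAttribute,
      Map<String, Object> target, String targetAttribute, String separator) {
    Object value = source.get(sourceAttribute);
    if (value == null) {
      return; // assumption: a missing source value leaves the target untouched
    }
    Object existing = target.get(targetAttribute);
    if (existing == null) {
      // assumption: with no existing target value, no separator is written
      target.put(targetAttribute, value.toString());
    } else {
      target.put(targetAttribute, existing + separator + value);
    }
  }

  public static void main(String[] args) {
    Map<String, Object> source = new HashMap<>();
    source.put("subtitle", "2012");
    Map<String, Object> target = new HashMap<>();
    target.put("title", "Report");
    concat(source, "subtitle", target, "title", " - ");
    System.out.println(target.get("title"));
  }
}
```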
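protected void copyCompoundAttributes(Record compoundRecord,
Record extractedRecord,
Record convertedRecord)
Adds compound-related system attributes to the converted record.
public void setCompoundExtractor(CompoundExtractor extractor)
DS service reference bind method.
public void unsetCompoundExtractor(CompoundExtractor extractor)
DS service reference unbind method.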
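The bind and unbind methods follow the usual OSGi Declarative Services pairing, in which the component runtime calls one method when a service becomes available and the other when it goes away. A common idiom, sketched below with hypothetical stand-in types, is to clear the reference on unbind only if it still points to the service being removed. Whether ExtractorWorkerBase does this is not stated on this page.

```java
final class ExtractorHolderSketch {
  // Hypothetical stand-in for the bound service type; NOT the SMILA CompoundExtractor.
  interface Extractor {
  }

  // volatile so that worker threads see the reference set by the binding thread.
  private volatile Extractor extractor;

  // Bind method: remembers the service.
  public void setExtractor(Extractor extractor) {
    this.extractor = extractor;
  }

  // Unbind method: clears the reference only if it is the service being removed,
  // so that a replacement bound in the meantime is not lost.
  public void unsetExtractor(Extractor extractor) {
    if (this.extractor == extractor) {
      this.extractor = null;
    }
  }

  Extractor current() {
    return extractor;
  }

  public static void main(String[] args) {
    ExtractorHolderSketch holder = new ExtractorHolderSketch();
    Extractor first = new Extractor() { };
    Extractor second = new Extractor() { };
    holder.setExtractor(first);
    holder.setExtractor(second);
    holder.unsetExtractor(first); // stale unbind: the second service stays bound
    System.out.println(holder.current() == second);
  }
}
```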