public class FileExtractorWorker extends ExtractorWorkerBase
Modifier and Type | Field and Description |
---|---|
static java.lang.String | NAME - Name of the worker. |
Constructor and Description |
---|
FileExtractorWorker() |
Modifier and Type | Method and Description |
---|---|
protected Record | convertRecord(Record compoundRecord, Record extractedRecord, TaskContext taskContext) - Create a record from the extracted record that conforms to the records produced by the matching crawler. |
protected boolean | filterRecord(Record record, TaskContext taskContext) - Filter extracted records. |
protected ContentFetcher | getContentFetcher() - Get a content fetcher for the data source type. |
java.lang.String | getName() |
protected java.util.Iterator<Record> | invokeExtractor(CompoundExtractor extractor, Record compoundRecord, java.io.InputStream compoundContent, TaskContext taskContext) - Invoke extractor with data from the crawled record. |
void | setFileCrawlerService(FileCrawlerService fileCrawler) - DS service reference bind method. |
void | unsetFileCrawlerService(FileCrawlerService fileCrawler) - DS service reference unbind method. |
Methods inherited from class ExtractorWorkerBase: concatAttributeValues, copyAttachment, copyAttribute, copyCompoundAttributes, copySetToStringAttribute, mapRecord, perform, setCompoundExtractor, unsetCompoundExtractor
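The contract of filterRecord in the method summary (return true if the record passes the filter(s), false if not) can be illustrated with a minimal sketch. It does not use the real Record or TaskContext classes: a plain Map stands in for each, and the attribute names "fileName" and "fileSize" as well as the task parameters "filePattern" and "maxFileSize" are hypothetical names chosen for illustration only.

```java
import java.util.Map;
import java.util.regex.Pattern;

/** Stand-in sketch of a record filter; none of these types are the real worker classes. */
class RecordFilterSketch {

  /**
   * Returns true if the record passes all configured filters, false if not.
   * "record" stands in for Record, "taskParameters" for the parameters held by TaskContext.
   * The keys "fileName", "fileSize", "filePattern" and "maxFileSize" are hypothetical.
   */
  static boolean filterRecord(Map<String, Object> record, Map<String, String> taskParameters) {
    Object fileName = record.get("fileName");
    if (!(fileName instanceof String)) {
      return false; // nothing to match against
    }
    String pattern = taskParameters.get("filePattern");
    if (pattern != null && !Pattern.matches(pattern, (String) fileName)) {
      return false; // name does not match the configured pattern
    }
    String maxSize = taskParameters.get("maxFileSize");
    Object size = record.get("fileSize");
    if (maxSize != null && size instanceof Long && (Long) size > Long.parseLong(maxSize)) {
      return false; // file is larger than the configured limit
    }
    return true;
  }

  public static void main(String[] args) {
    Map<String, String> params = Map.of("filePattern", ".*\\.txt", "maxFileSize", "1000");
    System.out.println(filterRecord(Map.of("fileName", "a.txt", "fileSize", 10L), params));
    System.out.println(filterRecord(Map.of("fileName", "a.bin", "fileSize", 10L), params));
  }
}
```

In this sketch an absent filter parameter means "no restriction", so a record is rejected only by a filter that is actually configured.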
public static final java.lang.String NAME
public java.lang.String getName()
protected java.util.Iterator<Record> invokeExtractor(CompoundExtractor extractor, Record compoundRecord, java.io.InputStream compoundContent, TaskContext taskContext) throws CompoundExtractorException
Description copied from class: ExtractorWorkerBase
Invoke extractor with data from the crawled record.
Overrides: invokeExtractor in class ExtractorWorkerBase
Throws: CompoundExtractorException
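The shape of invokeExtractor (compound content arrives as an InputStream, extracted records are returned through an Iterator) can be sketched with the JDK alone. The example below is an assumption-laden stand-in, not the worker's implementation: it treats the compound as a ZIP archive read with java.util.zip.ZipInputStream, uses a Map in place of Record, and the id scheme "compound id + / + entry name" is hypothetical. The real method delegates to the CompoundExtractor passed in, whose API is not shown on this page.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Stand-in sketch: iterate over the entries of a compound (here: ZIP) stream. */
class ZipExtractionSketch {

  /** Returns one stand-in record (a Map) per file entry; directory entries are skipped. */
  static Iterator<Map<String, String>> invokeExtractor(String compoundId, InputStream compoundContent)
      throws IOException {
    List<Map<String, String>> extracted = new ArrayList<>();
    try (ZipInputStream zip = new ZipInputStream(compoundContent)) {
      ZipEntry entry;
      while ((entry = zip.getNextEntry()) != null) {
        if (!entry.isDirectory()) {
          // Hypothetical id scheme: compound id plus entry path.
          extracted.add(Map.of("id", compoundId + "/" + entry.getName(), "fileName", entry.getName()));
        }
      }
    }
    return extracted.iterator();
  }

  /** Builds a small in-memory ZIP archive with two files and one directory entry. */
  static byte[] sampleZip() {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
      zip.putNextEntry(new ZipEntry("a.txt"));
      zip.write("hello".getBytes(StandardCharsets.UTF_8));
      zip.closeEntry();
      zip.putNextEntry(new ZipEntry("empty/"));
      zip.closeEntry();
      zip.putNextEntry(new ZipEntry("docs/b.txt"));
      zip.write("world".getBytes(StandardCharsets.UTF_8));
      zip.closeEntry();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  /** Convenience wrapper that collects the ids of all extracted stand-in records. */
  static List<String> extractedIds(String compoundId, byte[] zipBytes) {
    List<String> ids = new ArrayList<>();
    try {
      Iterator<Map<String, String>> records = invokeExtractor(compoundId, new ByteArrayInputStream(zipBytes));
      while (records.hasNext()) {
        ids.add(records.next().get("id"));
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return ids;
  }

  public static void main(String[] args) {
    System.out.println(extractedIds("archive.zip", sampleZip()));
  }
}
```

The sketch collects all entries before returning the iterator, which keeps it short; an extractor for large archives would more likely produce records lazily.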
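protected Record convertRecord(Record compoundRecord, Record extractedRecord, TaskContext taskContext)
Description copied from class: ExtractorWorkerBase
Create a record from the extracted record that conforms to the records produced by the matching crawler.
Overrides: convertRecord in class ExtractorWorkerBase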
protected boolean filterRecord(Record record, TaskContext taskContext)
Filter extracted records.
Overrides: filterRecord in class ExtractorWorkerBase
Parameters:
record - the record to check
taskContext - the task context containing the task parameters
Returns: true if the record passes the filter(s), false if not.
protected ContentFetcher getContentFetcher()
Description copied from class: ExtractorWorkerBase
Get a content fetcher for the data source type.
Overrides: getContentFetcher in class ExtractorWorkerBase
public void setFileCrawlerService(FileCrawlerService fileCrawler)
public void unsetFileCrawlerService(FileCrawlerService fileCrawler)
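setFileCrawlerService and unsetFileCrawlerService are described in the summary as DS (Declarative Services) bind and unbind methods. A common way to write such a pair is sketched below. It is a generic illustration, not this class's source: the nested CrawlerService interface is a hypothetical stand-in for FileCrawlerService, and the identity check in the unbind method is an assumed convention that protects against clearing a reference that has already been replaced.

```java
/** Stand-in sketch of a DS-style bind/unbind method pair. */
class BindUnbindSketch {

  /** Hypothetical stand-in for the injected service type. */
  interface CrawlerService {
  }

  // volatile because the service runtime may call bind/unbind from another thread.
  private volatile CrawlerService fileCrawler;

  /** Bind method: remember the service reference handed in by the runtime. */
  public void setFileCrawlerService(CrawlerService service) {
    this.fileCrawler = service;
  }

  /** Unbind method: clear the reference only if it is the one currently bound. */
  public void unsetFileCrawlerService(CrawlerService service) {
    if (this.fileCrawler == service) {
      this.fileCrawler = null;
    }
  }

  boolean isBound() {
    return fileCrawler != null;
  }

  public static void main(String[] args) {
    BindUnbindSketch worker = new BindUnbindSketch();
    CrawlerService service = new CrawlerService() { };
    worker.setFileCrawlerService(service);
    System.out.println(worker.isBound());
    worker.unsetFileCrawlerService(service);
    System.out.println(worker.isBound());
  }
}
```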