SMILA 1.0 API documentation
java.lang.Object
  org.eclipse.smila.importing.crawler.file.FileCrawlerWorker
public class FileCrawlerWorker
Worker implementation that performs file crawling.
| Field Summary | |
|---|---|
| static java.lang.Long | DIRS_PER_BULK_DEFAULT - default: one directory per follow-up task. |
| static java.lang.String | INPUT_SLOT_DIRS_TO_CRAWL - name of input slot containing records with directories to crawl. |
| static java.lang.Long | MAX_FILES_PER_BULK_DEFAULT - default: write up to 1000 files to one file bulk. |
| static java.lang.Long | MIN_FILES_PER_BULK_DEFAULT - default: don't add files from subdirectories if the current folder has too few files. |
| static java.lang.String | NAME - name of the worker, used in worker description and workflows. |
| static java.lang.String | OUTPUT_SLOT_DIRS_TO_CRAWL - name of output slot taking the directories to crawl in follow-up tasks. |
| static java.lang.String | OUTPUT_SLOT_FILES_TO_CRAWL - name of output slot taking the file records to process in ETL. |
| static java.lang.String | TASK_PARAM_DIRS_PER_BULK - number of directories to write to one bulk object. |
| static java.lang.String | TASK_PARAM_MAX_FILES_PER_BULK - maximum number of files in one bulk object. |
| static java.lang.String | TASK_PARAM_MIN_FILES_PER_BULK - minimum number of files in one bulk object. |
| static java.lang.String | TASK_PARAM_ROOT_FOLDER - name of the task parameter that contains the root folder for crawling. |
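The bulk-size fields above can be illustrated with a minimal sketch. It is not SMILA code: the class `BulkSettingsSketch`, the parameter keys `maxFilesPerBulk` and `directoriesPerBulk`, and the helper methods are hypothetical, since this page does not show the values of the `TASK_PARAM_*` constants or how the worker reads them. Only the defaults (1000 files per bulk, one directory per follow-up task) come from the field descriptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of the SMILA API. */
final class BulkSettingsSketch {
  // Assumed keys: the real values of the TASK_PARAM_* constants are not shown on this page.
  static final String PARAM_MAX_FILES = "maxFilesPerBulk";
  static final String PARAM_DIRS = "directoriesPerBulk";

  // Defaults as stated in the field descriptions.
  static final long MAX_FILES_DEFAULT = 1000L;
  static final long DIRS_DEFAULT = 1L;

  private BulkSettingsSketch() {
  }

  /** Returns the numeric value stored under key, or defaultValue if it is absent or not a number. */
  static long resolve(Map<String, Object> params, String key, long defaultValue) {
    Object value = params.get(key);
    return (value instanceof Number) ? ((Number) value).longValue() : defaultValue;
  }

  /** Splits items into consecutive bulks that each hold at most maxPerBulk entries. */
  static <T> List<List<T>> partition(List<T> items, long maxPerBulk) {
    if (maxPerBulk < 1) {
      throw new IllegalArgumentException("maxPerBulk must be at least 1");
    }
    List<List<T>> bulks = new ArrayList<>();
    for (long start = 0; start < items.size(); start += maxPerBulk) {
      int end = (int) Math.min(items.size(), start + maxPerBulk);
      bulks.add(new ArrayList<>(items.subList((int) start, end)));
    }
    return bulks;
  }

  public static void main(String[] args) {
    Map<String, Object> params = new HashMap<>();
    params.put(PARAM_MAX_FILES, 2);
    long maxFiles = resolve(params, PARAM_MAX_FILES, MAX_FILES_DEFAULT);
    long dirs = resolve(params, PARAM_DIRS, DIRS_DEFAULT); // key not set, falls back to 1
    System.out.println(partition(Arrays.asList("a", "b", "c", "d", "e"), maxFiles));
    // prints [[a, b], [c, d], [e]]
    System.out.println(dirs); // prints 1
  }
}
```

The minimum-files setting is left out on purpose: its default value and exact effect on subdirectory handling are not stated on this page.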
| Constructor Summary | |
|---|---|
| FileCrawlerWorker() | |
| Method Summary | |
|---|---|
| java.lang.String | getName() |
| void | perform(TaskContext taskContext) - Performs a computation on the data available in the TaskContext, such as a task for this worker, input and (if configured) output slots. |
| void | setCompoundExtractor(CompoundExtractor compoundExtractor) - DS service reference bind method. |
| void | setFileCrawlerService(FileCrawlerService fileCrawler) - DS service reference bind method. |
| void | setVisitedLinks(VisitedLinksService visitedLinks) - DS service reference bind method. |
| void | unsetCompoundExtractor(CompoundExtractor compoundExtractor) - DS service reference unbind method. |
| void | unsetFileCrawlerService(FileCrawlerService fileCrawler) - DS service reference unbind method. |
| void | unsetVisitedLinks(VisitedLinksService visitedLinks) - DS service reference unbind method. |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public static final java.lang.String NAME
public static final java.lang.String INPUT_SLOT_DIRS_TO_CRAWL
public static final java.lang.String OUTPUT_SLOT_DIRS_TO_CRAWL
public static final java.lang.String OUTPUT_SLOT_FILES_TO_CRAWL
public static final java.lang.String TASK_PARAM_ROOT_FOLDER
public static final java.lang.String TASK_PARAM_MAX_FILES_PER_BULK
public static final java.lang.String TASK_PARAM_MIN_FILES_PER_BULK
public static final java.lang.String TASK_PARAM_DIRS_PER_BULK
public static final java.lang.Long MAX_FILES_PER_BULK_DEFAULT
public static final java.lang.Long MIN_FILES_PER_BULK_DEFAULT
public static final java.lang.Long DIRS_PER_BULK_DEFAULT
| Constructor Detail |
|---|
public FileCrawlerWorker()
| Method Detail |
|---|
public java.lang.String getName()
Specified by: getName in interface Worker
public void perform(TaskContext taskContext)
throws java.lang.Exception
Description copied from interface: Worker
Performs a computation on the data available in the TaskContext, such as a task for this worker, input and
(if configured) output slots. Implementors must ensure that calls to this method are thread-safe.
Specified by: perform in interface Worker
Parameters: taskContext - the TaskContext information with which this operation can be performed.
Throws: java.lang.Exception
public void setFileCrawlerService(FileCrawlerService fileCrawler)
public void unsetFileCrawlerService(FileCrawlerService fileCrawler)
public void setCompoundExtractor(CompoundExtractor compoundExtractor)
public void unsetCompoundExtractor(CompoundExtractor compoundExtractor)
public void setVisitedLinks(VisitedLinksService visitedLinks)
public void unsetVisitedLinks(VisitedLinksService visitedLinks)
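The set/unset pairs above are Declarative Services (DS) bind and unbind methods. The following is a minimal sketch of one common way to write such a pair, not the actual FileCrawlerWorker implementation, which this page does not show. The names `CrawlerServiceStub` and `BindUnbindSketch` are hypothetical stand-ins, and clearing the reference only when the departing instance is the one currently bound is an assumed design choice.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for a service interface such as FileCrawlerService. */
interface CrawlerServiceStub {
}

/** Illustrative component with a DS-style bind/unbind pair; not SMILA code. */
final class BindUnbindSketch {
  // AtomicReference makes the bound service visible to all threads and lets
  // the unbind step compare and clear in a single atomic operation.
  private final AtomicReference<CrawlerServiceStub> service = new AtomicReference<>();

  /** Bind method: stores the service instance handed over by the runtime. */
  void setService(CrawlerServiceStub candidate) {
    service.set(candidate);
  }

  /** Unbind method: clears the reference only if the departing instance is the bound one. */
  void unsetService(CrawlerServiceStub departing) {
    service.compareAndSet(departing, null);
  }

  /** Returns the currently bound service, or null if none is bound. */
  CrawlerServiceStub current() {
    return service.get();
  }

  public static void main(String[] args) {
    BindUnbindSketch component = new BindUnbindSketch();
    CrawlerServiceStub first = new CrawlerServiceStub() { };
    CrawlerServiceStub second = new CrawlerServiceStub() { };
    component.setService(first);
    component.setService(second);  // a replacement service is bound
    component.unsetService(first); // late unbind of the old one is ignored
    System.out.println(component.current() == second); // prints true
  }
}
```

Comparing before clearing guards against a late unbind call for an old instance wiping out a replacement that was bound in the meantime.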