SMILA 1.0 API documentation
java.lang.Object
  org.eclipse.smila.importing.crawler.web.WebCrawlerWorker
public class WebCrawlerWorker
Worker for Web crawling.
| Field Summary | |
|---|---|
static java.lang.String |
INPUT_SLOT_LINKS_TO_CRAWL
Name of input slot containing the links to crawl. |
static java.lang.String |
NAME
Name of the worker, used in worker description and workflows. |
static java.lang.String |
OUTPUT_SLOT_CRAWLED_RECORDS
Name of output slot containing the crawled records. |
static java.lang.String |
OUTPUT_SLOT_LINKS_TO_CRAWL
Name of output slot containing the links to crawl. |
| Constructor Summary | |
|---|---|
WebCrawlerWorker()
|
|
| Method Summary | |
|---|---|
java.lang.String |
getName()
|
void |
perform(TaskContext taskContext)
Performs a computation on the data available in the TaskContext, such as a task for this worker, input and
(if configured) output slots. |
void |
setFetcher(Fetcher fetcher)
DS service reference injection method. |
void |
setLinkExtractor(LinkExtractor linkExtractor)
DS service reference injection method. |
void |
setLinkFilter(LinkFilter linkFilter)
DS service reference injection method. |
void |
setRecordProducer(RecordProducer recordProducer)
DS service reference injection method. |
void |
setVisitedLinks(VisitedLinksService visitedLinks)
DS service reference injection method. |
void |
unsetFetcher(Fetcher fetcher)
DS service reference removal method. |
void |
unsetLinkExtractor(LinkExtractor linkExtractor)
DS service reference removal method. |
void |
unsetLinkFilter(LinkFilter linkFilter)
DS service reference removal method. |
void |
unsetRecordProducer(RecordProducer recordProducer)
DS service reference removal method. |
void |
unsetVisitedLinks(VisitedLinksService visitedLinks)
DS service reference removal method. |
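
The method summary lists the collaborators that are injected into the worker (fetcher, link extractor, link filter, record producer, visited links service) but not how `perform` combines them. The following is a minimal sketch of one plausible crawl step, not SMILA's actual implementation. The nested interfaces and their method names (`fetch`, `extractLinks`, `accept`) are hypothetical stand-ins, a plain `Set` replaces the visited links service, and plain string lists replace the record producer and the two output slots.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CrawlStepSketch {
  // Hypothetical stand-ins; the real SMILA interfaces are not shown on this page.
  interface Fetcher { String fetch(String url); }
  interface LinkExtractor { List<String> extractLinks(String content); }
  interface LinkFilter { boolean accept(String url); }

  /** Stand-in for the two output slots of the worker. */
  static final class Result {
    final List<String> crawledRecords = new ArrayList<>();
    final List<String> linksToCrawl = new ArrayList<>();
  }

  /** Processes the links of one task: skip visited, fetch, extract, filter. */
  static Result crawl(List<String> inputLinks, Set<String> visited,
      Fetcher fetcher, LinkExtractor extractor, LinkFilter filter) {
    Result result = new Result();
    for (String url : inputLinks) {
      if (!visited.add(url)) {
        continue; // already crawled, possibly by an earlier task
      }
      String content = fetcher.fetch(url);
      result.crawledRecords.add(url);
      for (String link : extractor.extractLinks(content)) {
        if (filter.accept(link) && !visited.contains(link)) {
          result.linksToCrawl.add(link); // candidate for a follow-up task
        }
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Toy "web": page content is just a space-separated list of links.
    Map<String, String> pages = new HashMap<>();
    pages.put("http://a/", "http://a/b http://x/c");
    Fetcher fetcher = url -> pages.getOrDefault(url, "");
    LinkExtractor extractor = content -> {
      List<String> links = new ArrayList<>();
      for (String token : content.split(" ")) {
        if (!token.isEmpty()) {
          links.add(token);
        }
      }
      return links;
    };
    LinkFilter filter = url -> url.startsWith("http://a/");

    List<String> input = new ArrayList<>();
    input.add("http://a/");
    input.add("http://a/"); // duplicate, skipped on the second pass
    Result r = crawl(input, new HashSet<>(), fetcher, extractor, filter);
    System.out.println(r.crawledRecords + " " + r.linksToCrawl);
  }
}
```

In this sketch the visited set is consulted twice: once before fetching, so that a link is crawled at most once, and once before emitting a follow-up link, so that already crawled pages are not scheduled again.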
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public static final java.lang.String NAME
public static final java.lang.String INPUT_SLOT_LINKS_TO_CRAWL
public static final java.lang.String OUTPUT_SLOT_LINKS_TO_CRAWL
public static final java.lang.String OUTPUT_SLOT_CRAWLED_RECORDS
| Constructor Detail |
|---|
public WebCrawlerWorker()
| Method Detail |
|---|
public java.lang.String getName()
Specified by: getName in interface Worker
public void perform(TaskContext taskContext)
throws java.lang.Exception
Performs a computation on the data available in the TaskContext, such as a task for this worker, input and
(if configured) output slots. Implementors must ensure that calls to this method are thread-safe.
Specified by: perform in interface Worker
Parameters: taskContext - the TaskContext information with which this operation can be performed.
Throws: java.lang.Exception
public void setVisitedLinks(VisitedLinksService visitedLinks)
public void unsetVisitedLinks(VisitedLinksService visitedLinks)
public void setFetcher(Fetcher fetcher)
public void unsetFetcher(Fetcher fetcher)
public void setLinkExtractor(LinkExtractor linkExtractor)
public void unsetLinkExtractor(LinkExtractor linkExtractor)
public void setLinkFilter(LinkFilter linkFilter)
public void unsetLinkFilter(LinkFilter linkFilter)
public void setRecordProducer(RecordProducer recordProducer)
public void unsetRecordProducer(RecordProducer recordProducer)
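
The set/unset pairs above follow the bind/unbind convention of OSGi Declarative Services (DS), where the runtime calls one method when a referenced service becomes available and the other when it goes away. The sketch below illustrates that convention for a single reference. It is an assumption about how such a pair is typically written, not the body of `WebCrawlerWorker`; the `Fetcher` interface, `FixedFetcher`, and `SketchWorker` are hypothetical stand-ins.

```java
class DsBindingSketch {
  // Hypothetical stand-in for the injected service type.
  interface Fetcher { String fetch(String url); }

  /** Trivial Fetcher that always returns the same content. */
  static final class FixedFetcher implements Fetcher {
    private final String content;
    FixedFetcher(String content) { this.content = content; }
    @Override public String fetch(String url) { return content; }
  }

  static final class SketchWorker {
    // volatile: the DS runtime may bind/unbind on a different thread
    // than the one that executes tasks.
    private volatile Fetcher fetcher;

    /** Bind method: called when a matching service becomes available. */
    public void setFetcher(Fetcher fetcher) {
      this.fetcher = fetcher;
    }

    /** Unbind method: only clear the field if it still holds this service. */
    public void unsetFetcher(Fetcher fetcher) {
      if (this.fetcher == fetcher) {
        this.fetcher = null;
      }
    }

    boolean hasFetcher() {
      return fetcher != null;
    }

    String fetch(String url) {
      Fetcher current = fetcher; // read once to avoid a race with unbind
      if (current == null) {
        throw new IllegalStateException("no Fetcher bound");
      }
      return current.fetch(url);
    }
  }

  public static void main(String[] args) {
    SketchWorker worker = new SketchWorker();
    Fetcher first = new FixedFetcher("first");
    Fetcher other = new FixedFetcher("other");
    worker.setFetcher(first);
    worker.unsetFetcher(other); // ignored: a different instance is bound
    System.out.println(worker.fetch("http://example.org/"));
    worker.unsetFetcher(first);
    System.out.println(worker.hasFetcher());
  }
}
```

Comparing against the argument in the unbind method guards against clearing a reference that has already been replaced by a newer service instance.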