SMILA 1.0 API documentation

org.eclipse.smila.importing.crawler.web
Interface LinkFilter

All Known Implementing Classes:
SimpleLinkFilter

public interface LinkFilter

Interface for LinkFilter services. A LinkFilter is called on the result of the LinkExtractor and selects only those links that should be followed in follow-up tasks.


Method Summary
 java.util.Collection<Record> filterLinks(java.util.Collection<Record> extractedLinks, Record sourceLink, AnyMap parameters, TaskLog taskLog)
          Filters the extracted links.
 

Method Detail

filterLinks

java.util.Collection<Record> filterLinks(java.util.Collection<Record> extractedLinks,
                                         Record sourceLink,
                                         AnyMap parameters,
                                         TaskLog taskLog)
                                         throws WebCrawlerException
Filters the extracted links.

Parameters:
extractedLinks - result from LinkExtractor service.
sourceLink - record from which the links were extracted.
parameters - task parameters that can configure the operation.
taskLog - log facility provided by WorkerManager.
Returns:
links to follow in follow-up tasks
Throws:
WebCrawlerException - if an error occurs while processing the links.
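
The following is a minimal sketch of what an implementation of this contract might look like. It is not SimpleLinkFilter and not taken from SMILA. So that it compiles on its own, Record, TaskLog, WebCrawlerException and LinkFilter are redeclared here as simplified stand-ins, and AnyMap is replaced by Map<String, Object>. The class name PrefixLinkFilter, the "urlPrefix" parameter and the url field are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;

// Simplified stand-ins for the SMILA types (assumptions, not the real classes).
class Record {
    final String url; // hypothetical field; the real Record stores metadata differently
    Record(String url) { this.url = url; }
}

class WebCrawlerException extends Exception {
    WebCrawlerException(String message) { super(message); }
}

interface TaskLog {
    void info(String message); // stand-in for the real logging facility
}

// Same shape as the documented interface, with AnyMap replaced by Map<String, Object>.
interface LinkFilter {
    Collection<Record> filterLinks(Collection<Record> extractedLinks, Record sourceLink,
        Map<String, Object> parameters, TaskLog taskLog) throws WebCrawlerException;
}

// Hypothetical filter: keeps only links whose URL starts with a configured prefix.
class PrefixLinkFilter implements LinkFilter {
    static final String PARAM_PREFIX = "urlPrefix"; // hypothetical parameter name

    @Override
    public Collection<Record> filterLinks(Collection<Record> extractedLinks, Record sourceLink,
        Map<String, Object> parameters, TaskLog taskLog) throws WebCrawlerException {
        Object prefix = parameters.get(PARAM_PREFIX);
        if (!(prefix instanceof String)) {
            throw new WebCrawlerException("Missing task parameter: " + PARAM_PREFIX);
        }
        List<Record> linksToFollow = new ArrayList<>();
        for (Record link : extractedLinks) {
            if (link.url.startsWith((String) prefix)) {
                linksToFollow.add(link);
            }
        }
        taskLog.info("Kept " + linksToFollow.size() + " of " + extractedLinks.size()
            + " links extracted from " + sourceLink.url);
        return linksToFollow;
    }
}
```

With "urlPrefix" set to "http://example.org/", a call on the two links http://example.org/a and http://other.example/b would return only the first one. A missing or non-string prefix is reported through WebCrawlerException, which mirrors the documented Throws clause.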
