public interface LinkFilter
Filters the result of the LinkExtractor to select only those links that should really be followed in follow-up tasks.

| Modifier and Type | Method and Description |
|---|---|
| boolean | allowRedirectLink(java.lang.String link, java.lang.String originalUrl, WebCrawlingContext context): Check if it is allowed to follow a given redirect link. |
| java.util.Collection<Record> | filterExtractedLinks(java.util.Collection<Record> extractedLinks, java.lang.String originalUrl, WebCrawlingContext context): Filter links extracted from the given source URL. |

java.util.Collection<Record> filterExtractedLinks(java.util.Collection<Record> extractedLinks,
java.lang.String originalUrl,
WebCrawlingContext context)
throws WebCrawlerException

Parameters:
extractedLinks - result from LinkExtractor service.
originalUrl - the source URL from which the links were extracted.
context - the WebCrawlingContext.
Throws:
WebCrawlerException

boolean allowRedirectLink(java.lang.String link,
java.lang.String originalUrl,
WebCrawlingContext context)
throws WebCrawlerException

Parameters:
link - a String containing the link to be checked.
originalUrl - the original URL that was redirected.
context - the WebCrawlingContext.
Throws:
WebCrawlerException
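
To make the contract of the two methods more concrete, here is a minimal sketch of one possible implementation that follows only links on the same host as the source URL. The same-host policy, the class names SameHostLinkFilterSketch and SameHostLinkFilter, and the simplified Record, WebCrawlingContext and WebCrawlerException types (including the getUrl() accessor) are assumptions made so that the example compiles on its own. They are hypothetical stand-ins, not the library's actual classes.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Locale;

final class SameHostLinkFilterSketch {

  // Hypothetical stand-ins so the sketch compiles on its own. The real Record,
  // WebCrawlingContext and WebCrawlerException types are not reproduced here.
  static final class Record {
    private final String url;
    Record(String url) { this.url = url; }
    String getUrl() { return url; }
  }

  static final class WebCrawlingContext { }

  static class WebCrawlerException extends Exception {
    WebCrawlerException(String message) { super(message); }
  }

  // Mirrors the two method signatures documented above.
  interface LinkFilter {
    Collection<Record> filterExtractedLinks(Collection<Record> extractedLinks,
        String originalUrl, WebCrawlingContext context) throws WebCrawlerException;

    boolean allowRedirectLink(String link, String originalUrl,
        WebCrawlingContext context) throws WebCrawlerException;
  }

  // Example policy (an assumption, not the library's behavior): follow only
  // links whose host equals the host of the source URL.
  static final class SameHostLinkFilter implements LinkFilter {

    @Override
    public Collection<Record> filterExtractedLinks(Collection<Record> extractedLinks,
        String originalUrl, WebCrawlingContext context) {
      List<Record> kept = new ArrayList<>();
      for (Record link : extractedLinks) {
        if (allowRedirectLink(link.getUrl(), originalUrl, context)) {
          kept.add(link);
        }
      }
      return kept;
    }

    @Override
    public boolean allowRedirectLink(String link, String originalUrl,
        WebCrawlingContext context) {
      String sourceHost = hostOf(originalUrl);
      return sourceHost != null && sourceHost.equals(hostOf(link));
    }

    // Returns the lower-cased host, or null for null, relative or malformed URLs.
    private static String hostOf(String url) {
      if (url == null) {
        return null;
      }
      try {
        String host = new URI(url).getHost();
        return host == null ? null : host.toLowerCase(Locale.ROOT);
      } catch (URISyntaxException e) {
        return null;
      }
    }
  }

  public static void main(String[] args) {
    SameHostLinkFilter filter = new SameHostLinkFilter();
    WebCrawlingContext context = new WebCrawlingContext();
    List<Record> extracted = List.of(
        new Record("http://example.org/a.html"),
        new Record("http://other.example.com/b.html"));
    // Print every link that survives the filter.
    for (Record kept : filter.filterExtractedLinks(
        extracted, "http://example.org/index.html", context)) {
      System.out.println(kept.getUrl());
    }
  }
}
```

In this sketch a link whose host cannot be determined is rejected rather than reported through WebCrawlerException, so relative links would have to be resolved against the source URL before filtering. A real filter would more likely read its rules from the WebCrawlingContext than hard-code a policy.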