org.eclipse.smila.importing.crawler.web.filter
Class FilterConfiguration
java.lang.Object
org.eclipse.smila.importing.crawler.web.filter.FilterConfiguration
public class FilterConfiguration
- extends java.lang.Object
Web crawler filter configuration, used by the LinkFilter class.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
MAX_CRAWL_DEPTH
public static final java.lang.String MAX_CRAWL_DEPTH
- See Also:
- Constant Field Values
FOLLOW_REDIRECTS
public static final java.lang.String FOLLOW_REDIRECTS
- See Also:
- Constant Field Values
MAX_REDIRECTS
public static final java.lang.String MAX_REDIRECTS
- See Also:
- Constant Field Values
URL_PATTERNS
public static final java.lang.String URL_PATTERNS
- See Also:
- Constant Field Values
INCLUDE_PATTERNS
public static final java.lang.String INCLUDE_PATTERNS
- See Also:
- Constant Field Values
EXCLUDE_PATTERNS
public static final java.lang.String EXCLUDE_PATTERNS
- See Also:
- Constant Field Values
FilterConfiguration
public FilterConfiguration(AnyMap filterConfig)
- Parameters:
filterConfig - filter section from the web crawler configuration.
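As a rough illustration of what reading such a filter section might involve, the following sketch pulls the three scalar settings out of a plain java.util.Map. It is not the SMILA implementation: the class FilterSectionSketch is hypothetical, a Map stands in for AnyMap, the key strings are assumed (the actual values are those listed under Constant Field Values), and treating a missing limit as -1 ("no limit") is a choice made for this example only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for reading a filter section; not the SMILA class.
class FilterSectionSketch {
  // Assumed key names; the real strings are the documented constant values.
  static final String MAX_CRAWL_DEPTH = "maxCrawlDepth";
  static final String FOLLOW_REDIRECTS = "followRedirects";
  static final String MAX_REDIRECTS = "maxRedirects";

  final long maxCrawlDepth;
  final boolean followRedirects;
  final long maxRedirects;

  FilterSectionSketch(Map<String, Object> filterConfig) {
    // In this sketch only, -1 means "no limit configured".
    maxCrawlDepth = asLong(filterConfig.get(MAX_CRAWL_DEPTH), -1L);
    // A missing or non-Boolean value is treated as false.
    followRedirects = Boolean.TRUE.equals(filterConfig.get(FOLLOW_REDIRECTS));
    maxRedirects = asLong(filterConfig.get(MAX_REDIRECTS), -1L);
  }

  // Converts any numeric value to long, falling back when absent or non-numeric.
  private static long asLong(Object value, long fallback) {
    return value instanceof Number ? ((Number) value).longValue() : fallback;
  }

  public static void main(String[] args) {
    Map<String, Object> section = new HashMap<>();
    section.put(MAX_CRAWL_DEPTH, 3);
    section.put(FOLLOW_REDIRECTS, true);
    FilterSectionSketch sketch = new FilterSectionSketch(section);
    System.out.println(sketch.maxCrawlDepth + " " + sketch.followRedirects + " " + sketch.maxRedirects);
  }
}
```

The real constructor takes an AnyMap, so the lookups would go through that type's accessors rather than Map.get.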
getUrlPatternMatcher
public RegexPatternMatcher getUrlPatternMatcher()
- Returns:
- matcher for checking include and exclude patterns of URLs.
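To make the idea of include and exclude patterns concrete, here is a minimal sketch built on java.util.regex. It does not reproduce RegexPatternMatcher: the class UrlPatternSketch and its accepts method are hypothetical, and the precedence rules (an exclude match always rejects; an empty include list accepts everything that is not excluded) are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical include/exclude matcher; not the SMILA RegexPatternMatcher.
class UrlPatternSketch {
  private final List<Pattern> includes = new ArrayList<>();
  private final List<Pattern> excludes = new ArrayList<>();

  UrlPatternSketch(List<String> includeRegexes, List<String> excludeRegexes) {
    for (String regex : includeRegexes) {
      includes.add(Pattern.compile(regex));
    }
    for (String regex : excludeRegexes) {
      excludes.add(Pattern.compile(regex));
    }
  }

  // Assumed semantics: excludes win; with no includes, anything not excluded passes.
  boolean accepts(String url) {
    for (Pattern exclude : excludes) {
      if (exclude.matcher(url).matches()) {
        return false;
      }
    }
    if (includes.isEmpty()) {
      return true;
    }
    for (Pattern include : includes) {
      if (include.matcher(url).matches()) {
        return true;
      }
    }
    return false;
  }
}
```

Patterns here must match the whole URL, because Matcher.matches() is used; a partial-match variant would call find() instead.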
followRedirects
public boolean followRedirects()
- Returns:
- 'true' if we should follow redirects, 'false' otherwise.
getMaxRedirects
public long getMaxRedirects()
- Returns:
- the maximum number of allowed redirects when following links.
getMaxCrawlDepth
public long getMaxCrawlDepth()
- Returns:
- the maximum depth when following links.
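The three limit getters could be combined by a caller roughly as sketched below. This is an illustration only, not how LinkFilter applies them: the class CrawlLimitsSketch and its methods are hypothetical, a negative limit is assumed to mean "unlimited", and the depth comparison (links are followed only from pages whose depth is below the maximum) is an assumed convention.

```java
// Hypothetical holder for the three limits; not part of SMILA.
class CrawlLimitsSketch {
  private final long maxCrawlDepth;
  private final boolean followRedirects;
  private final long maxRedirects;

  CrawlLimitsSketch(long maxCrawlDepth, boolean followRedirects, long maxRedirects) {
    this.maxCrawlDepth = maxCrawlDepth;
    this.followRedirects = followRedirects;
    this.maxRedirects = maxRedirects;
  }

  // Assumption: a negative maximum means no depth limit.
  boolean mayFollowLinksFrom(long currentDepth) {
    return maxCrawlDepth < 0 || currentDepth < maxCrawlDepth;
  }

  // Redirects must be enabled, and the count so far must be below the limit.
  boolean mayFollowRedirect(long redirectsSoFar) {
    return followRedirects && (maxRedirects < 0 || redirectsSoFar < maxRedirects);
  }

  public static void main(String[] args) {
    CrawlLimitsSketch limits = new CrawlLimitsSketch(2, true, 1);
    System.out.println(limits.mayFollowLinksFrom(1));
    System.out.println(limits.mayFollowRedirect(1));
  }
}
```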