SMILA 1.0 API documentation

org.eclipse.smila.importing.crawler.web.filter
Class FilterConfiguration

java.lang.Object
  extended by org.eclipse.smila.importing.crawler.web.filter.FilterConfiguration

public class FilterConfiguration
extends java.lang.Object

Web crawler filter configuration, used by the LinkFilter class.


Field Summary
static java.lang.String EXCLUDE_PATTERNS
           
static java.lang.String FOLLOW_REDIRECTS
           
static java.lang.String INCLUDE_PATTERNS
           
static java.lang.String MAX_CRAWL_DEPTH
           
static java.lang.String MAX_REDIRECTS
           
static java.lang.String URL_PATTERNS
           
 
Constructor Summary
FilterConfiguration(AnyMap filterConfig)
           
 
Method Summary
 boolean followRedirects()
           
 long getMaxCrawlDepth()
           
 long getMaxRedirects()
           
 RegexPatternMatcher getUrlPatternMatcher()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

MAX_CRAWL_DEPTH

public static final java.lang.String MAX_CRAWL_DEPTH
See Also:
Constant Field Values

FOLLOW_REDIRECTS

public static final java.lang.String FOLLOW_REDIRECTS
See Also:
Constant Field Values

MAX_REDIRECTS

public static final java.lang.String MAX_REDIRECTS
See Also:
Constant Field Values

URL_PATTERNS

public static final java.lang.String URL_PATTERNS
See Also:
Constant Field Values

INCLUDE_PATTERNS

public static final java.lang.String INCLUDE_PATTERNS
See Also:
Constant Field Values

EXCLUDE_PATTERNS

public static final java.lang.String EXCLUDE_PATTERNS
See Also:
Constant Field Values
Constructor Detail

FilterConfiguration

public FilterConfiguration(AnyMap filterConfig)
Parameters:
filterConfig - the filter section of the web crawler configuration.
Method Detail

getUrlPatternMatcher

public RegexPatternMatcher getUrlPatternMatcher()
Returns:
matcher for checking include and exclude patterns of URLs.
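
The following is a minimal sketch of how include and exclude patterns could be combined when deciding whether to follow a URL. It does not use or reproduce RegexPatternMatcher: the class UrlPatternSketch and its accepts method are hypothetical, and the semantics shown (exclude wins over include, an empty include list accepts everything, patterns must match the whole URL) are assumptions, not behavior confirmed by this page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical stand-in for an include/exclude URL matcher (not the SMILA class). */
final class UrlPatternSketch {
  private final List<Pattern> includes = new ArrayList<>();
  private final List<Pattern> excludes = new ArrayList<>();

  UrlPatternSketch(List<String> includeRegexes, List<String> excludeRegexes) {
    // Compile once so that each URL check only runs the matchers.
    for (String regex : includeRegexes) {
      includes.add(Pattern.compile(regex));
    }
    for (String regex : excludeRegexes) {
      excludes.add(Pattern.compile(regex));
    }
  }

  /** Assumed rule: reject on any exclude match, then require an include match if includes exist. */
  boolean accepts(String url) {
    for (Pattern exclude : excludes) {
      if (exclude.matcher(url).matches()) {
        return false;
      }
    }
    if (includes.isEmpty()) {
      return true;
    }
    for (Pattern include : includes) {
      if (include.matcher(url).matches()) {
        return true;
      }
    }
    return false;
  }
}
```

Under these assumptions, a matcher built with the include pattern `https?://example\.org/.*` and the exclude pattern `.*\.pdf` would accept http://example.org/index.html and reject both http://example.org/doc.pdf and any URL on another host.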

followRedirects

public boolean followRedirects()
Returns:
true if the crawler should follow redirects, false otherwise.

getMaxRedirects

public long getMaxRedirects()
Returns:
the maximum number of allowed redirects when following links.

getMaxCrawlDepth

public long getMaxCrawlDepth()
Returns:
the maximum depth when following links.
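
As an illustration of how a caller might apply the three limits returned by followRedirects(), getMaxRedirects() and getMaxCrawlDepth(), here is a small sketch. CrawlLimitsSketch and its methods are hypothetical and self-contained; they copy the values into plain fields rather than calling FilterConfiguration. The comparisons assume that each limit is a simple upper bound. How SMILA treats missing or negative values is not stated on this page and is not modeled here.

```java
/** Hypothetical holder for the three limits exposed by the getters above. */
final class CrawlLimitsSketch {
  private final boolean followRedirects;
  private final long maxRedirects;
  private final long maxCrawlDepth;

  CrawlLimitsSketch(boolean followRedirects, long maxRedirects, long maxCrawlDepth) {
    this.followRedirects = followRedirects;
    this.maxRedirects = maxRedirects;
    this.maxCrawlDepth = maxCrawlDepth;
  }

  /** Assumed rule: another redirect is allowed only while the count so far is below the limit. */
  boolean mayFollowRedirect(long redirectsSoFar) {
    return followRedirects && redirectsSoFar < maxRedirects;
  }

  /** Assumed rule: links found on a page are followed only while its depth is below the limit. */
  boolean mayFollowLink(long currentDepth) {
    return currentDepth < maxCrawlDepth;
  }
}
```

With limits of 2 redirects and depth 3, this sketch would allow a second redirect but not a third, and would follow links found at depth 2 but not those found at depth 3.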
