SMILA 1.0 API documentation

org.eclipse.smila.importing.crawler.web
Class WebCrawlerConstants

java.lang.Object
  extended by org.eclipse.smila.importing.crawler.web.WebCrawlerConstants

public final class WebCrawlerConstants
extends java.lang.Object

Constants used by the web crawler and its subcomponents: attribute names, attachment names, and task parameters.


Field Summary
static java.lang.String ATTACHMENT_CONTENT
          name of attachment containing the content of a web resource.
static java.lang.String ATTRIBUTE_CHARSET
          name of attribute containing the charset of the web resource reported by the web server (if any).
static java.lang.String ATTRIBUTE_CONTENTTYPE
          name of attribute containing the content-type of the web resource reported by the web server (if any).
static java.lang.String ATTRIBUTE_CRAWL_DEPTH
          internal attribute used to apply max crawl depth.
static java.lang.String ATTRIBUTE_LASTMODIFIED
          name of attribute containing the last-modified header reported by the web server (if any).
static java.lang.String ATTRIBUTE_MIMETYPE
          name of attribute containing the mimetype of the web resource reported by the web server (if any).
static java.lang.String ATTRIBUTE_SIZE
          name of attribute containing the content-length of the web resource reported by the web server (if any).
static java.lang.String ATTRIBUTE_URL
          name of attribute containing the URL of the web resource.
static int DEFAULT_LINKS_PER_BULK
          default value for 'linksPerBulk' parameter.
static java.util.Set<java.lang.String> PROPERTY_NAMES
          the property names the web ETL workers should support for mapping.
static java.lang.String TASK_PARAM_LINKS_PER_BULK
          Name of the task parameter that contains the number of links to write to one bulk object.
static java.lang.String TASK_PARAM_START_URL
          Name of the task parameter that contains the start URL for crawling.
static java.lang.String TASK_PARAM_WAIT_BETWEEN_REQUESTS
          Name of the task parameter that specifies, as a long value in milliseconds, how long to wait between HTTP requests.
 
Method Summary
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

ATTRIBUTE_URL

public static final java.lang.String ATTRIBUTE_URL
name of attribute containing the URL of the web resource.

See Also:
Constant Field Values

ATTRIBUTE_LASTMODIFIED

public static final java.lang.String ATTRIBUTE_LASTMODIFIED
name of attribute containing the last-modified header reported by the web server (if any).

See Also:
Constant Field Values

ATTRIBUTE_CONTENTTYPE

public static final java.lang.String ATTRIBUTE_CONTENTTYPE
name of attribute containing the content-type of the web resource reported by the web server (if any).

See Also:
Constant Field Values

ATTRIBUTE_MIMETYPE

public static final java.lang.String ATTRIBUTE_MIMETYPE
name of attribute containing the mimetype of the web resource reported by the web server (if any).

See Also:
Constant Field Values

ATTRIBUTE_CHARSET

public static final java.lang.String ATTRIBUTE_CHARSET
name of attribute containing the charset of the web resource reported by the web server (if any).

See Also:
Constant Field Values

ATTRIBUTE_SIZE

public static final java.lang.String ATTRIBUTE_SIZE
name of attribute containing the content-length of the web resource reported by the web server (if any).

See Also:
Constant Field Values

ATTACHMENT_CONTENT

public static final java.lang.String ATTACHMENT_CONTENT
name of attachment containing the content of a web resource.

See Also:
Constant Field Values

ATTRIBUTE_CRAWL_DEPTH

public static final java.lang.String ATTRIBUTE_CRAWL_DEPTH
internal attribute used to apply max crawl depth.

See Also:
Constant Field Values

TASK_PARAM_START_URL

public static final java.lang.String TASK_PARAM_START_URL
Name of the task parameter that contains the start URL for crawling.

See Also:
Constant Field Values

TASK_PARAM_WAIT_BETWEEN_REQUESTS

public static final java.lang.String TASK_PARAM_WAIT_BETWEEN_REQUESTS
Name of the task parameter that specifies, as a long value in milliseconds, how long to wait between HTTP requests.

See Also:
Constant Field Values
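
A wait of this kind could be applied roughly as sketched below. This is only an illustration, not the crawler's actual implementation: the class and method names are hypothetical, and the sketch assumes the wait is measured from the moment the previous request was issued, which this page does not specify.

```java
final class RequestThrottleExample {

  /**
   * Returns how many milliseconds are still left to wait before the next
   * request may be sent. The result is never negative.
   * Assumption: the wait is counted from the time the previous request was issued.
   */
  static long remainingWaitMillis(long lastRequestMillis, long nowMillis,
      long waitBetweenRequestsMillis) {
    long elapsed = nowMillis - lastRequestMillis;
    return Math.max(0L, waitBetweenRequestsMillis - elapsed);
  }

  /** Sleeps for the remaining wait time, if any. */
  static void waitIfNeeded(long lastRequestMillis, long waitBetweenRequestsMillis)
      throws InterruptedException {
    long remaining = remainingWaitMillis(
        lastRequestMillis, System.currentTimeMillis(), waitBetweenRequestsMillis);
    if (remaining > 0) {
      Thread.sleep(remaining);
    }
  }
}
```

A caller would record the time of each request and call `waitIfNeeded` with that timestamp and the parameter value before issuing the next one.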

TASK_PARAM_LINKS_PER_BULK

public static final java.lang.String TASK_PARAM_LINKS_PER_BULK
Name of the task parameter that contains the number of links to write to one bulk object.

See Also:
Constant Field Values

PROPERTY_NAMES

public static final java.util.Set<java.lang.String> PROPERTY_NAMES
the property names the web ETL workers should support for mapping.
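
One way such a set might be used is to check a configured mapping for unsupported property names, as in the sketch below. The set contents shown here are placeholders, since this page does not list the members of `PROPERTY_NAMES`; the class name, method name, and the `Map`-based mapping are likewise hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class MappingCheckExample {

  // Placeholder stand-in for WebCrawlerConstants.PROPERTY_NAMES.
  // The real members are not listed on this page; these names are assumptions.
  static final Set<String> PROPERTY_NAMES = Set.of("url", "size", "content");

  /** Returns the mapping keys that are not supported property names, in sorted order. */
  static Set<String> unsupportedKeys(Map<String, String> mapping) {
    Set<String> unsupported = new TreeSet<>(mapping.keySet());
    unsupported.removeAll(PROPERTY_NAMES);
    return unsupported;
  }
}
```

An empty result would mean every key in the mapping is a supported property name.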


DEFAULT_LINKS_PER_BULK

public static final int DEFAULT_LINKS_PER_BULK
default value for 'linksPerBulk' parameter.

See Also:
Constant Field Values
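
A worker might resolve the 'linksPerBulk' parameter with a fallback to this default along the lines of the following sketch. The constant values used here (`"linksPerBulk"` as the parameter name and `10` as the default) are stand-ins for illustration; the actual values are those listed under Constant Field Values. The `Map`-based parameter holder and the method name are hypothetical and not part of the SMILA API.

```java
import java.util.Map;

final class LinksPerBulkExample {

  // Stand-in values: assumptions for illustration, not copied from SMILA.
  static final String TASK_PARAM_LINKS_PER_BULK = "linksPerBulk";
  static final int DEFAULT_LINKS_PER_BULK = 10;

  /** Returns the configured positive value, or the default if it is absent or invalid. */
  static int resolveLinksPerBulk(Map<String, ?> taskParameters) {
    Object raw = taskParameters.get(TASK_PARAM_LINKS_PER_BULK);
    if (raw instanceof Number) {
      int value = ((Number) raw).intValue();
      return value > 0 ? value : DEFAULT_LINKS_PER_BULK;
    }
    if (raw instanceof String) {
      try {
        int value = Integer.parseInt(((String) raw).trim());
        return value > 0 ? value : DEFAULT_LINKS_PER_BULK;
      } catch (NumberFormatException e) {
        // Not a number: fall back to the default.
        return DEFAULT_LINKS_PER_BULK;
      }
    }
    // Parameter missing or of an unexpected type.
    return DEFAULT_LINKS_PER_BULK;
  }

  public static void main(String[] args) {
    System.out.println(resolveLinksPerBulk(Map.of(TASK_PARAM_LINKS_PER_BULK, 25)));
    System.out.println(resolveLinksPerBulk(Map.of()));
  }
}
```

Treating non-positive or unparsable values as "not set" is a design choice of this sketch; a stricter worker could reject the task instead.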
