SMILA 1.0 API documentation

Package org.eclipse.smila.importing.crawler.web

Interface Summary
Fetcher         Interface for the Fetcher service used by WebCrawlerWorker and WebFetcherWorker.
LinkExtractor   Extracts links from the content contained in an input record.
LinkFilter      Interface for LinkFilter services.
RecordProducer  Produces the resulting records from a fetched input record.
 

Class Summary
WebCrawlerConstants  Constants used by the web crawler and its subcomponents: attribute and attachment names, task parameters.
WebCrawlerWorker     Worker for web crawling.
WebCrawlingContext   Context holding information needed throughout most of the web crawling process, such as the mapper and the filter configuration.
WebExtractorWorker   Compound extractor worker for use in web crawling workflows.
WebFetcherWorker     Fetches binary content from a URL and stores the content as a record attachment.
 

Exception Summary
WebCrawlerException  Exception thrown by WebCrawler components.
 

