SMILA 1.0 API documentation

org.eclipse.smila.importing.crawler.web
Interface LinkExtractor

All Known Implementing Classes:
DefaultLinkExtractor

public interface LinkExtractor

Extracts links from the content contained in the input record.


Method Summary
 java.util.Collection<Record> extractLinks(Record inputRecord, WebCrawlingContext context)
           
 

Method Detail

extractLinks

java.util.Collection<Record> extractLinks(Record inputRecord,
                                          WebCrawlingContext context)
                                          throws WebCrawlerException
Parameters:
inputRecord - the input record whose content is searched for links
context - the web crawling context
Returns:
a collection containing one newly created record for each extracted link
Throws:
WebCrawlerException
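
Example (illustrative sketch only):
The following shows how an implementation of this interface might satisfy the contract above: one new record is returned per link found in the input record's content. To keep the sketch self-contained, Record, WebCrawlingContext and WebCrawlerException are replaced by simplified, hypothetical stand-ins nested in the example class; they are not the real SMILA types. The regex-based href matching, the HrefLinkExtractor name, and the choice to throw when the content is missing are likewise assumptions, not the behavior of DefaultLinkExtractor.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Self-contained sketch. All nested types are simplified stand-ins (assumptions),
// NOT the real org.eclipse.smila classes.
class LinkExtractorSketch {

  /** Hypothetical stand-in for a record: a URL plus optional HTML content. */
  static final class Record {
    final String url;
    final String content;

    Record(String url, String content) {
      this.url = url;
      this.content = content;
    }
  }

  /** Hypothetical stand-in for the web crawling context (unused here). */
  static final class WebCrawlingContext {
  }

  /** Hypothetical stand-in for the crawler's checked exception. */
  static class WebCrawlerException extends Exception {
    WebCrawlerException(String message) {
      super(message);
    }
  }

  /** Same method shape as the documented interface, using the stand-in types. */
  interface LinkExtractor {
    Collection<Record> extractLinks(Record inputRecord, WebCrawlingContext context)
        throws WebCrawlerException;
  }

  /** Assumed implementation: finds double-quoted href attributes with a regex. */
  static final class HrefLinkExtractor implements LinkExtractor {
    private static final Pattern HREF =
        Pattern.compile("href\\s*=\\s*\"([^\"]+)\"", Pattern.CASE_INSENSITIVE);

    @Override
    public Collection<Record> extractLinks(Record inputRecord, WebCrawlingContext context)
        throws WebCrawlerException {
      // Assumption: a record without content is reported as an error.
      if (inputRecord.content == null) {
        throw new WebCrawlerException("input record has no content: " + inputRecord.url);
      }
      final URI base;
      try {
        base = URI.create(inputRecord.url);
      } catch (IllegalArgumentException e) {
        throw new WebCrawlerException("invalid base URL: " + inputRecord.url);
      }
      List<Record> links = new ArrayList<>();
      Matcher matcher = HREF.matcher(inputRecord.content);
      while (matcher.find()) {
        try {
          // Resolve relative links against the URL of the input record;
          // the new record carries only the link URL, no content yet.
          links.add(new Record(base.resolve(matcher.group(1)).toString(), null));
        } catch (IllegalArgumentException e) {
          // Skip links that are not syntactically valid URIs.
        }
      }
      return links;
    }
  }

  public static void main(String[] args) throws WebCrawlerException {
    Record page = new Record("http://example.org/docs/index.html",
        "<a href=\"page2.html\">next</a> <a href=\"/about\">about</a>");
    LinkExtractor extractor = new HrefLinkExtractor();
    for (Record link : extractor.extractLinks(page, new WebCrawlingContext())) {
      System.out.println(link.url);
    }
  }
}
```

A real implementation would typically use an HTML parser rather than a regular expression, and would read the content and URL from the record through the SMILA record API instead of plain fields.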
