SMILA 1.0 API documentation

org.eclipse.smila.connectivity.framework.crawler.web.parse
Class BinaryParser

java.lang.Object
  extended by org.eclipse.smila.connectivity.framework.crawler.web.parse.BinaryParser
All Implemented Interfaces:
Configurable, Parser

public class BinaryParser
extends java.lang.Object
implements Parser, Configurable

Fallback implementation to use when no more specialized parser is available. It makes the content of binary files linked from web pages (e.g. images or PDF files) usable. The content itself cannot be scanned for outgoing links or metadata, but metadata from the response headers can be used.
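
The fallback role described above can be illustrated with a minimal sketch. It is not SMILA's actual parser lookup: every type below (SketchParser, SketchHtmlParser, SketchBinaryParser, SketchParserRegistry) is a hypothetical stand-in, and the assumption that the fallback is held separately rather than registered under a content type is made only for the example.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for the Parser interface; only the content-type
// lookup is modelled here, getParse(Content) is left out.
interface SketchParser {
    String[] getContentTypes();
}

// Hypothetical specialized parser that handles HTML pages.
final class SketchHtmlParser implements SketchParser {
    public String[] getContentTypes() {
        return new String[] {"text/html"};
    }
}

// Hypothetical fallback parser. Assumption: it claims no specific types
// and is only reached when nothing else matches.
final class SketchBinaryParser implements SketchParser {
    public String[] getContentTypes() {
        return new String[0];
    }
}

// Hypothetical registry: picks a specialized parser by content type and
// falls back to the binary parser otherwise.
final class SketchParserRegistry {
    private final Map<String, SketchParser> byType = new LinkedHashMap<>();
    private final SketchParser fallback;

    SketchParserRegistry(SketchParser fallback, SketchParser... specialized) {
        this.fallback = fallback;
        for (SketchParser parser : specialized) {
            for (String type : parser.getContentTypes()) {
                byType.put(type.toLowerCase(Locale.ROOT), parser);
            }
        }
    }

    SketchParser parserFor(String contentType) {
        // Drop parameters such as "; charset=UTF-8" before the lookup.
        String base = contentType == null
                ? ""
                : contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return byType.getOrDefault(base, fallback);
    }
}
```

With such a registry, a lookup for "text/html; charset=UTF-8" would yield the HTML parser, while "application/pdf" or a missing content type would yield the fallback.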


Constructor Summary
BinaryParser()
           
 
Method Summary
 Configuration getConf()
          Return the configuration used by this object.
 java.lang.String[] getContentTypes()
          Returns array of content-types that are supported by this parser.
 Parse getParse(Content content)
          Creates the parse for some content.
 void setConf(Configuration configuration)
          Set the configuration to be used by this object.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

BinaryParser

public BinaryParser()

Method Detail

setConf

public void setConf(Configuration configuration)
Description copied from interface: Configurable
Set the configuration to be used by this object.

Specified by:
setConf in interface Configurable
Parameters:
configuration - Configuration

getConf

public Configuration getConf()
Description copied from interface: Configurable
Return the configuration used by this object.

Specified by:
getConf in interface Configurable
Returns:
Configuration

getParse

public Parse getParse(Content content)
Description copied from interface: Parser
Creates the parse for some content.

Specified by:
getParse in interface Parser
Parameters:
content - Content
Returns:
Parse
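
As a rough illustration of what a parse of binary content might carry, the sketch below keeps the raw bytes, copies the response headers as metadata, and reports no outgoing links, in line with the class description. SketchContent, SketchParse and SketchBinaryParse are hypothetical stand-ins. The fields of the real Content and Parse classes are not documented on this page, so their shape here is an assumption.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for fetched content: URL, raw bytes, response headers.
final class SketchContent {
    final String url;
    final byte[] bytes;
    final Map<String, String> headers;

    SketchContent(String url, byte[] bytes, Map<String, String> headers) {
        this.url = url;
        this.bytes = bytes;
        this.headers = headers;
    }
}

// Hypothetical stand-in for a parse result.
final class SketchParse {
    final byte[] data;
    final Map<String, String> metadata;
    final List<String> outlinks;

    SketchParse(byte[] data, Map<String, String> metadata, List<String> outlinks) {
        this.data = data;
        this.metadata = metadata;
        this.outlinks = outlinks;
    }
}

// Hypothetical fallback logic: no inspection of the bytes at all.
final class SketchBinaryParse {
    static SketchParse parse(SketchContent content) {
        // Metadata comes only from the response headers.
        Map<String, String> metadata = new LinkedHashMap<>(content.headers);
        // Binary content is not scanned, so the outlink list stays empty.
        return new SketchParse(
                content.bytes.clone(),
                Collections.unmodifiableMap(metadata),
                Collections.emptyList());
    }
}
```

Copying the bytes and wrapping the metadata map is a defensive choice in this sketch, so that later changes to the fetched content do not affect an already created parse.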

getContentTypes

public java.lang.String[] getContentTypes()
Description copied from interface: Parser
Returns array of content-types that are supported by this parser.

Specified by:
getContentTypes in interface Parser
Returns:
array of content-types.
