SMILA 1.0 API documentation

org.eclipse.smila.connectivity.framework.crawler.web.http
Class HttpBase

java.lang.Object
  extended by org.eclipse.smila.connectivity.framework.crawler.web.http.HttpBase
Direct Known Subclasses:
Http

public abstract class HttpBase
extends java.lang.Object

Common configurations and methods for HTTP protocol.


Field Summary
protected  Authentication _authentication
          The authentication.
protected  int _connectTimeout
          The connect timeout.
protected  boolean _cookiesEnabled
          Whether cookies are enabled.
protected  java.util.List<Header> _headers
          The headers.
protected  int _maxLengthBytes
          The maximum document length, in bytes.
protected  java.lang.String _proxyHost
          The proxy host.
protected  java.lang.String _proxyLogin
          The proxy login.
protected  java.lang.String _proxyPassword
          The proxy password.
protected  int _proxyPort
          The proxy port.
protected  int _readTimeout
          The read timeout.
protected  java.lang.String _referrer
          The referrer.
protected  int _timeout
          The timeout.
protected  boolean _useHttp11
          Whether HTTP/1.1 is used.
protected  boolean _useProxy
          Whether a proxy is used.
protected  java.lang.String _userAgent
          The 'User-Agent' request header.
static int BUFFER_SIZE
          The Constant BUFFER_SIZE.
 
Constructor Summary
HttpBase()
          Creates a new instance of HttpBase.
 
Method Summary
 Configuration getConf()
          Returns the configuration used by this object.
 java.util.List<Header> getHeaders()
          Returns list of request headers.
 HttpOutput getHttpOutput(Outlink link, FilterProcessor filterProcessor)
          Returns retrieved page information in the HttpOutput format.
 int getMaxLengthBytes()
          Returns the maximum document length, in bytes.
 java.lang.String getReferer()
          Returns the Referrer header value.
protected abstract  Response getResponse(java.lang.String urlString)
          Returns the HTTP response for the given URL.
protected abstract  Response getResponse(java.lang.String urlString, FilterProcessor filterProcessor)
          Returns the HTTP response for the given URL and filter processor.
 boolean getUseHttp11()
          Returns whether HTTP version 1.1 will be used or not.
 java.lang.String getUserAgent()
          Returns User-Agent value.
 boolean isCookiesEnabled()
          Returns whether cookies are enabled.
 byte[] processGzipEncoded(byte[] compressed, java.lang.String url)
          Decompresses GZIP-encoded content.
 void setConf(Configuration conf)
          Loads configuration.
 void setCookiesEnabled(boolean enableCookies)
          Enables or disables cookies.
 void setHeaders(java.util.List<Header> headersList)
          Assigns list of request headers.
 void setReferer(java.lang.String refererValue)
          Assigns the Referrer header value.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

BUFFER_SIZE

public static final int BUFFER_SIZE
The Constant BUFFER_SIZE.

See Also:
Constant Field Values

_userAgent

protected java.lang.String _userAgent
The 'User-Agent' request header.


_referrer

protected java.lang.String _referrer
The referrer.


_headers

protected java.util.List<Header> _headers
The headers.


_useHttp11

protected boolean _useHttp11
Whether HTTP/1.1 is used.


_proxyHost

protected java.lang.String _proxyHost
The proxy host.


_proxyPort

protected int _proxyPort
The proxy port.


_proxyLogin

protected java.lang.String _proxyLogin
The proxy login.


_proxyPassword

protected java.lang.String _proxyPassword
The proxy password.


_useProxy

protected boolean _useProxy
Whether a proxy is used.


_maxLengthBytes

protected int _maxLengthBytes
The maximum document length, in bytes.


_timeout

protected int _timeout
The timeout.


_connectTimeout

protected int _connectTimeout
The connect timeout.


_readTimeout

protected int _readTimeout
The read timeout.


_authentication

protected Authentication _authentication
The authentication.


_cookiesEnabled

protected boolean _cookiesEnabled
Whether cookies are enabled.
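
Example (sketch):
The following is a minimal sketch of how settings like the proxy, timeout, User-Agent and referrer fields above could be applied to a connection. It assumes plain java.net.HttpURLConnection and millisecond timeouts; the HTTP client and units that HttpBase actually uses are not stated on this page. The class ConnectionSettingsSketch, its field values and the open method are hypothetical and not part of SMILA.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.net.URL;

/** Hypothetical helper, not part of SMILA: applies HttpBase-like settings to a connection. */
final class ConnectionSettingsSketch {
    // Stand-ins for the protected HttpBase fields; all values are assumptions.
    boolean useProxy = false;
    String proxyHost = "proxy.example.org";
    int proxyPort = 8080;
    int connectTimeout = 5000;   // assumed to be milliseconds
    int readTimeout = 10000;     // assumed to be milliseconds
    String userAgent = "ExampleCrawler/0.1";
    String referrer = null;

    /** Prepares a connection without opening it; no network traffic happens here. */
    HttpURLConnection open(String urlString) throws IOException {
        URL url = URI.create(urlString).toURL();
        Proxy proxy = useProxy
            ? new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved(proxyHost, proxyPort))
            : Proxy.NO_PROXY;
        HttpURLConnection connection = (HttpURLConnection) url.openConnection(proxy);
        connection.setConnectTimeout(connectTimeout);
        connection.setReadTimeout(readTimeout);
        connection.setRequestProperty("User-Agent", userAgent);
        if (referrer != null) {
            // The HTTP header name is spelled "Referer".
            connection.setRequestProperty("Referer", referrer);
        }
        return connection;
    }
}
```

The request is only sent once the caller connects or reads from the returned connection, so the settings can be inspected or adjusted beforehand.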

Constructor Detail

HttpBase

public HttpBase()
Creates a new instance of HttpBase.

Method Detail

setConf

public void setConf(Configuration conf)
Loads configuration.

Parameters:
conf - Configuration

getConf

public Configuration getConf()
Returns the configuration used by this object.

Returns:
Configuration

getHttpOutput

public HttpOutput getHttpOutput(Outlink link,
                                FilterProcessor filterProcessor)
Returns retrieved page information in the HttpOutput format.

Parameters:
link - the outlink to retrieve
filterProcessor - FilterProcessor implementation
Returns:
HttpOutput

processGzipEncoded

public byte[] processGzipEncoded(byte[] compressed,
                                 java.lang.String url)
                          throws java.io.IOException
Decompresses GZIP-encoded content.

Parameters:
compressed - the GZIP-compressed byte array
url - the URL string
Returns:
the decompressed byte array
Throws:
java.io.IOException - if an error occurs during decompression
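
Example (sketch):
GZIP decompression of a byte array can be sketched with java.util.zip.GZIPInputStream as shown below. This is not the SMILA implementation. The class GzipSketch, the buffer size of 8 KB and the choice to stop reading once a maximum length is reached are all assumptions made for illustration; how processGzipEncoded uses BUFFER_SIZE, the maximum length or its url argument is not documented on this page.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper, not part of SMILA. */
final class GzipSketch {
    /** Assumed buffer size; the real BUFFER_SIZE value is not given here. */
    static final int BUFFER_SIZE = 8 * 1024;

    /** Decompresses GZIP data, keeping at most maxLengthBytes of output (assumed policy). */
    static byte[] gunzip(byte[] compressed, int maxLengthBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[BUFFER_SIZE];
            int remaining = maxLengthBytes;
            while (remaining > 0) {
                int n = in.read(buffer, 0, Math.min(buffer.length, remaining));
                if (n < 0) {
                    break; // end of the compressed stream
                }
                out.write(buffer, 0, n);
                remaining -= n;
            }
        }
        return out.toByteArray();
    }

    /** Compresses data; included only so the sketch can be exercised end to end. */
    static byte[] gzip(byte[] plain) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(plain);
        }
        return out.toByteArray();
    }
}
```

Malformed input causes GZIPInputStream to throw an IOException (typically a java.util.zip.ZipException), which matches the exception type declared by processGzipEncoded.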

getUserAgent

public java.lang.String getUserAgent()
Returns User-Agent value.

Returns:
String

getUseHttp11

public boolean getUseHttp11()
Returns whether HTTP version 1.1 will be used or not.

Returns:
true or false

getMaxLengthBytes

public int getMaxLengthBytes()
Returns the maximum document length, in bytes.

Returns:
maximum length

getHeaders

public java.util.List<Header> getHeaders()
Returns list of request headers.

Returns:
List of headers

setHeaders

public void setHeaders(java.util.List<Header> headersList)
Assigns list of request headers.

Parameters:
headersList - List of headers

getReferer

public java.lang.String getReferer()
Returns the Referrer header value.

Returns:
String

setReferer

public void setReferer(java.lang.String refererValue)
Assigns the Referrer header value.

Parameters:
refererValue - String

isCookiesEnabled

public boolean isCookiesEnabled()
Returns whether cookies are enabled.

Returns:
true if cookies are enabled, false otherwise

setCookiesEnabled

public void setCookiesEnabled(boolean enableCookies)
Enables or disables cookies.

Parameters:
enableCookies - true to enable cookies, false to disable them

getResponse

protected abstract Response getResponse(java.lang.String urlString)
                                 throws java.io.IOException
Returns the HTTP response for the given URL.

Parameters:
urlString - the URL string
Returns:
the HTTP response
Throws:
java.io.IOException - if there was an error retrieving the URL.

getResponse

protected abstract Response getResponse(java.lang.String urlString,
                                        FilterProcessor filterProcessor)
                                 throws java.io.IOException
Returns the HTTP response for the given URL and filter processor.

Parameters:
urlString - the URL string
filterProcessor - the FilterProcessor implementation
Returns:
the HTTP response
Throws:
java.io.IOException - if there was an error retrieving the URL.
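
Example (sketch):
Because getResponse is abstract, a concrete subclass such as Http supplies the actual transport while the base class holds the shared logic. The sketch below illustrates that split with hypothetical stand-in types (SketchResponse, SketchHttpBase, CannedHttp and the describe method); none of them are SMILA classes, and the real Response type and the callers of getResponse may look quite different.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for the Response type; not the SMILA class. */
final class SketchResponse {
    final int code;
    final byte[] content;

    SketchResponse(int code, byte[] content) {
        this.code = code;
        this.content = content;
    }
}

/** Hypothetical stand-in for HttpBase: shared logic plus an abstract fetch hook. */
abstract class SketchHttpBase {
    /** Subclasses supply the transport, mirroring getResponse(String). */
    protected abstract SketchResponse getResponse(String urlString) throws IOException;

    /** Shared code calls the hook and converts an IOException into a readable result. */
    public String describe(String urlString) {
        try {
            SketchResponse response = getResponse(urlString);
            return response.code + " (" + response.content.length + " bytes)";
        } catch (IOException e) {
            return "failed: " + e.getMessage();
        }
    }
}

/** Hypothetical subclass that returns canned data instead of using the network. */
final class CannedHttp extends SketchHttpBase {
    @Override
    protected SketchResponse getResponse(String urlString) throws IOException {
        if (!urlString.startsWith("http://") && !urlString.startsWith("https://")) {
            throw new IOException("unsupported URL: " + urlString);
        }
        return new SketchResponse(200, "hello".getBytes(StandardCharsets.UTF_8));
    }
}
```

A caller would then write, for example, new CannedHttp().describe("http://example.org/") and never touch the transport code directly.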
