org.eclipse.smila.connectivity.framework.crawler.web.parse.js
Class JavascriptParserImpl
java.lang.Object
org.eclipse.smila.connectivity.framework.crawler.web.configuration.Configured
org.eclipse.smila.connectivity.framework.crawler.web.parse.js.JavascriptParserImpl
- All Implemented Interfaces:
- Configurable, JavascriptParser, Parser
public class JavascriptParserImpl
- extends Configured
- implements Parser, JavascriptParser
Extracts links from the given JavaScript code. Extraction is based on regular expressions.
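The regular expressions this class uses are not documented on this page. As an illustration only, the following minimal sketch shows one way regex-based link extraction from script code can work: collect quoted string literals, then keep those that look like URLs. The class name `JsLinkSketch`, the method `extractCandidates`, and both patterns are assumptions made for this example and are not part of SMILA.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of SMILA: illustrates regex-based link extraction.
class JsLinkSketch {

    // A single- or double-quoted literal without whitespace; group 2 is the text
    // between the matching quotes. Escaped quotes are not handled in this sketch.
    private static final Pattern STRING_LITERAL =
            Pattern.compile("([\"'])([^\"'\\s]+)\\1");

    // Heuristic for "looks like a link": an absolute http(s) URL, a token that
    // contains a slash, or a file name with a common page extension.
    private static final Pattern URL_LIKE = Pattern.compile(
            "https?://\\S+|[\\w./-]*/[\\w./?=&%-]*|[\\w.-]+\\.(?:html?|php|jsp)");

    // Returns the URL-like string literals found in the script, in source order.
    static List<String> extractCandidates(String scriptCode) {
        List<String> candidates = new ArrayList<>();
        Matcher literal = STRING_LITERAL.matcher(scriptCode);
        while (literal.find()) {
            String text = literal.group(2);
            if (URL_LIKE.matcher(text).matches()) {
                candidates.add(text);
            }
        }
        return candidates;
    }
}
```

A heuristic of this kind trades precision for simplicity: it never executes the script, so links assembled by string concatenation at runtime are missed, and ordinary strings that contain a slash may be reported as links.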
| Fields inherited from class org.eclipse.smila.connectivity.framework.crawler.web.configuration.Configured |
| _configuration |
| Method Summary |
| java.lang.String[] | getContentTypes() Returns array of content-types that are supported by this parser. |
| Outlink[] | getOutlinks(java.lang.String scriptCode, java.lang.String anchor, java.lang.String base) Returns links found in given javascript code. |
| Parse | getParse(Content content) Creates the parse for some content. |
| Methods inherited from class org.eclipse.smila.connectivity.framework.crawler.web.configuration.Configured |
| getConf, setConf |
| Methods inherited from class java.lang.Object |
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
JavascriptParserImpl
public JavascriptParserImpl()
getContentTypes
public java.lang.String[] getContentTypes()
- Returns array of content-types that are supported by this parser.
- Specified by:
getContentTypes in interface Parser
- Returns:
- array of content-types.
getParse
public Parse getParse(Content content)
- Creates the parse for some content.
- Specified by:
getParse in interface Parser
- Parameters:
content - Content
- Returns:
- Parse
getOutlinks
public Outlink[] getOutlinks(java.lang.String scriptCode,
java.lang.String anchor,
java.lang.String base)
- Description copied from interface: JavascriptParser
- Returns links found in given javascript code.
- Specified by:
getOutlinks in interface JavascriptParser
- Parameters:
scriptCode - String containing javascript code that will be parsed.
anchor - Outlink anchor
base - URL of the page.
- Returns:
- Array of Outlinks
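The page does not say how the anchor and base parameters are applied to the extracted links. The sketch below assumes that base is used to resolve relative links and that anchor is attached unchanged to every resulting link; both are assumptions, not documented behavior. `OutlinkSketch`, its nested `Link` class (a stand-in for the real Outlink type), and `toLinks` are hypothetical names introduced only for this example.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of SMILA.
class OutlinkSketch {

    // Stand-in for the real Outlink type: a target URL plus its anchor text.
    static final class Link {
        final String url;
        final String anchor;

        Link(String url, String anchor) {
            this.url = url;
            this.anchor = anchor;
        }
    }

    // Resolves each raw link against the page URL and pairs it with the anchor.
    // Assumption: relative links are resolved against base.
    static List<Link> toLinks(List<String> rawLinks, String anchor, String base) {
        URI baseUri = URI.create(base);
        List<Link> result = new ArrayList<>();
        for (String raw : rawLinks) {
            try {
                result.add(new Link(baseUri.resolve(raw).toString(), anchor));
            } catch (IllegalArgumentException e) {
                // Skip strings that are not syntactically valid URI references.
            }
        }
        return result;
    }
}
```

Under these assumptions, a raw link such as news/index.html found on the page http://example.com/dir/page.html would resolve to http://example.com/dir/news/index.html, while an absolute link is kept as-is.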