SMILA 1.0 API documentation
#PROPERTY_ROOTCONTEXTPATH
property.
TaskGeneratorBase.PROPERTY_GENERATOR_NAME
in component description.
SecurityAttributes.AccessRightType
and
SecurityAttributes.EntityType
.
TaskKeepAliveListener
to this instance of TaskKeepAlive
.
TaskKeepAliveListener
to this instance of TaskKeepAlive
.
StreamOutput
and
RecordOutput
but uses the Append API to create the result object.
Any
record converted to BON to the data object.
Record
converted to BON to the data object.
PreparedStatementTypedParameter
incorporates to the passed PreparedStatement
-object.
AnyMap
object or throws an InvalidValueTypeException
.
AnyMap
object or throws an InvalidValueTypeException
.
AnyMap
object or throws an InvalidValueTypeException
.
AnySeq
object or throws an InvalidValueTypeException
.
AnySeq
object or throws an InvalidValueTypeException
.
AnySeq
object or throws an InvalidValueTypeException
.
Value
object or throws an InvalidValueTypeException
.
Value
object or throws an InvalidValueTypeException
.
Value
object or throws an InvalidValueTypeException
.
name
attribute of process elements.
operation
attribute of invoke elements.
partnerLink
attribute of invoke elements.
portType
attribute of invoke elements.
true
by the deltaCheck worker to mark records that are not new but have been changed.
ObjectStoreException
for errors that are caused by invalid arguments or other conditions that
make it impossible to perform an operation.
Record
objects from an input stream.
#hasChanged(Id, String)
and #touch(Id, String, boolean)
in one step.
JdbcCrawler
can release its
JDBC- and other resources.
PerformanceCounter#getNextSampleValue()
.
Configuration
.
Configuration
.
Task.PROPERTY_ORIGINAL_QUALIFIER
property.
SolrServer
instance for the given core.
DataSourceConnectionConfigPlugin
-interface of the Jdbc-Crawler-Bundle.
DefinitionBase
from an overriding definition.
DeltaService
on error in operations.
ODEServer
.
BinaryStorageService.storeRecordAttachment
) from binary
storage as byte array.
BinaryStorageService.storeRecordAttachment
) from binary
storage as byte array.
BinaryStorageService.storeRecordAttachment
) from binary
storage as InputStream.
BinaryStorageService.storeRecordAttachment
) from binary
storage as InputStream.
Fetcher
job.
SimpleLinkFilter
.
FilterFactory
instance with the given configuration.
FilterProcessor
.
<invoke>
element.
SecurityAttributes.ACCESS_RIGHTS
attribute.
SecurityAttributes.AccessRightType
.
SecurityAttributes.AccessRightType
and SecurityAttributes.EntityType
.
HttpServletRequest
.
Authentication
options for this crawl job.
ConfigUtils.getConfigFile(String, String)
as this may also return
non-directory entries.
CrawlScopeFilter
for this crawl job.
DefinitionPersistence
service.
JsonRequestHandler.getErrorStatus(String, String, Throwable)
.
AJobManagerHandler.getErrorStatus(String, String, Throwable)
.
AJobManagerHandler.getErrorStatus(String, String, Throwable)
.
AJobManagerHandler.getErrorStatus(String, String, Throwable)
.
ObjectStoreException
and subclasses.
JsonRequestHandler.getErrorStatus(String, String, Throwable)
.
JsonRequestHandler.getErrorStatus(String, String, Throwable)
.
JsonRequestHandler.getErrorStatus(String, String, Throwable)
.
JsonRequestHandler.getErrorStatus(String, String, Throwable)
.
JsonRequestHandler.getErrorStatus(String, String, Throwable)
.
Handler
instance, which in this case is a ResourceHandler
instance.
Handler
instance, which in this case is a
org.eclipse.jetty.server.handler.ResourceHandler
instance.
JobRunDataProvider
service.
JobRunEngine
service.
JobTaskProcessor
service.
Blackboard.getRecord(String)
+ Record.getMetadata()
.
robotsMeta
to appropriate values, based on any META tags found under the
given node
.
name
property, or null if no such property exists.
ObjectMapper
instance with disabled JsonGenerator.Feature#AUTO_CLOSE_TARGET
and default
SMILA date/time format.
node
, and creates appropriate Outlink
records for each (relative to the supplied base
URL), and adds them to the outlinks
ArrayList
.
Parse
result for the given Content
.
Blackboard.getRecord(String)
+ #filterRecord(Record, String))
: Gets the blackboard record and
applies the named filter.
SolrServer
s.
Outlink
s extracted from sitemap.xml loc tag.
SolrServer
for the given core.
UnsupportedOperationException
.
#getString(Any, String)
, but throws an exception if the field does not exist.
HttpStatus.CREATED
.
StringBuffer
and a DOM Node
, and will append all the content text found
beneath the DOM node to the StringBuffer
.
getText(sb, node, false)
.
StringBuffer
and a DOM Node
, and will append the content text found beneath
the first title
node to the StringBuffer
.
#getStringRequired(Any, String)
, but throws an exception if the value is not a valid name according to
NameValidator
.
#getValueExpression(Any, String)
, but throws an exception if the field does not exist.
PreparedStatementTypedParameter[]
thus incorporating all necessary
data for processing groupings as defined in Grouping
.
JsonHttpHandler.process(Record)
to do processing and get a result record and
writes JSON result.
#process(Record)
to do processing and get a result record and
writes JSON result.
TaskContext.getInputs()
this class provides access to the data
objects associated by a task to the input slots of a worker.
Inputs
and
Outputs
.
true
if the URL is allowed for fetching and false
otherwise.
false
if the robots.txt
file prohibits us from accessing the given
path
, or true
otherwise.
JdbcCrawler
.
smila/jobmanager/
GET
JobRunDataProvider
.
JobRunEngine
.
JobRun
.
JobTaskProcessor
.
LinkExtractorHtml
implementations using nekohtml.
LinkExtractorHtml
implementations using tagsoup.
ParameterAccessor
on access to missing required parameters.
DeltaService
for the jobmanager based importing framework.
ObjectStoreService
.
VisitedLinksService
for the jobmanager based importing framework.
ODEServer
and DeploymentManager
.
TaskContext.getOutputs()
this class provides access to the data
objects associated by a task to the output slots of a worker.
DataSizeParser.parse(String, long)
with defaultSize 0.
parse(bytes, new XMLUtilsConfig(validate,true)).
parse(file, new XMLUtilsConfig(validate,true)).
parse(is, new XMLUtilsConfig(validate,true)).
DataFactory.autoConvertValue(Object)
instead
DefaultDataFactoryImpl.autoConvertValue(Object)
instead
TaskContext
, such as a task for this worker, input and
(if configured) output slots.
TaskContext
, such as a task for this worker, input and
(if configured) output slots.
TaskContext
, such as a task for this worker, input and
(if configured) output slots.
PipeletHolder
for execution of the
invocation.
CrawlerPerformanceCounters
property logging the number of retrieved database rows.
PreparedStatement
along with the index of the
parameter within the statement and the SQL-Type of the parameter.
WorkflowProcessor
s or other processing components on errors while processing a record.
Record
s.
Record
bulks by writing one Record
(or Any
) at a
time.
SecurityAttributes.ACCESS_RIGHTS
attribute and all sub-attributes.
SecurityAttributes.AccessRightType
and all its sub-attributes.
SecurityAttributes.AccessRightType
and SecurityAttributes.EntityType
and all its sub-attributes.
SecurityAttributes.AccessRightType
and SecurityAttributes.EntityType
.
SolrServer
s.
TaskKeepAlive
gets to know that a Task
(that is currently being kept alive) has been
removed by the TaskManager.
TaskKeepAlive
gets to know that a Task
(that is currently being kept alive) has been
removed by the TaskManager.
TaskKeepAliveListener
from this instance of TaskKeepAlive
.
SolrServer
that is not needed anymore.
false
.
JettyHandlerService
interface to process requests to static web resources.
SearchResultConstants.RECORDS
.
IOException
.
ServiceUnavailableException
.
robots.txt
files.
SecurityAttributes.AccessRightType
.
ObjectStoreService
from performing an
operation (e.g.
AgentController
.
AgentController
.
AgentController
.
_baseHref
.
ClusterConfigService
.
SearchResultConstants.COUNT
.
CrawlerController
.
CrawlerController
.
CrawlerController
.
CrawlScopeFilter
for this crawl job.
SolrConstants.MORE_LIKE_THIS
.
_noCache
to true
.
_noFollow
to true
.
_noIndex
to true
.
name
property.
_refresh
to the supplied value.
_refreshHref
.
_refreshTime
.
SearchResultConstants.RUNTIME
.
Task.PROPERTY_CREATED_TIME
if not set already and it's not a finishing task.
Task.PROPERTY_START_TIME
and , if it's not a finishing task.
LinkExtractor
implementation using an HTML extractor.
SimpleObjectStoreService
.
ObjectStoreService
.
StoreOutputStream
implementation.
sitemap.xml
files.
StoredAttachment.size()
if size is not known.
SolrServer
s manager for embedded Solr servers.
SolrServer
s manager for non-embedded Solr servers.
SolrServer
instance per core.
JobRunMode.STANDARD
.
JobRunMode.STANDARD
.
DeltaService
on error in operations.
ObjectStoreService
.
OutputStream
to support aborting a created but not yet closed object.
stream(el, validate, "UTF-8")).
stream(el, validate, "UTF-8")).
InputStream
for reading from objectstore.
OutputStream
of a data object for writing to objectstore.
<invoke>
element.
<process>
element.
TaskGeneratorBase.PROPERTY_GENERATOR_NAME
.
TaskStorageZk._maxNoOfTasksPerHost
.
AnyMap
with the contents of this StoreObject
.
AnyMap
with the contents of this StoreObject
.
(<field>:<token>)
.
AgentController
.
AgentController
.
AgentController
.
ClusterConfigService
.
ClusterConfigService
.
CrawlerController
.
CrawlerController
.
CrawlerController
.
sizeLimit
bytes, if necessary.
operation
attribute for sub-pipeline invokes.
portType
attribute (local-name part) for sub-pipeline invokes.
VisitedLinksService
on error in operations.
JettyHandlerService
interface to process requests to static web resources.
org.eclipse.smila.ws
and tries to publish them as a JAX-WS webservice.
Worker.perform(TaskContext)
.
WorkflowProcessor.getWorkflowDefinition(String)
results for
predefined workflows.
WorkType
filters, delimited by Select and Unselect work types.
parse()
methods must be given a well-formed XML resource in which the encoding is
declared, since none of them receives such a parameter.