public class HtmlToTextPipelet extends ATransformationPipelet
| Modifier and Type | Class and Description |
|---|---|
| static class | HtmlToTextPipelet.CommentRemover: removes comments from HTML files. |
| class | HtmlToTextPipelet.MetadataExtractor: extracts metadata from META tags. |
| class | HtmlToTextPipelet.PlainTextWriter: appends plain text from the document to a string builder. |
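
The three nested classes above are described only by one-line summaries. As a rough illustration of those three tasks (removing comments, reading META tags, and appending plain text to a string builder), here is a minimal regex-based sketch. It is not the SMILA implementation: the class `HtmlToTextSketch` and all of its method names are hypothetical, and it deliberately ignores script and style content, entity decoding, and META attributes in any other order than `name` followed by `content`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative stand-in only; not the nested classes of HtmlToTextPipelet. */
class HtmlToTextSketch {
  // Matches an HTML comment, including comments that span several lines.
  private static final Pattern COMMENT = Pattern.compile("<!--.*?-->", Pattern.DOTALL);
  // Matches <meta name="..." content="..."> with the attributes in exactly this order.
  private static final Pattern META = Pattern.compile(
      "<meta\\s+name=\"([^\"]*)\"\\s+content=\"([^\"]*)\"[^>]*>", Pattern.CASE_INSENSITIVE);
  // Matches any remaining tag.
  private static final Pattern TAG = Pattern.compile("<[^>]+>");

  /** Returns the input with all HTML comments removed. */
  static String removeComments(String html) {
    return COMMENT.matcher(html).replaceAll("");
  }

  /** Collects name/content pairs from META tags, in document order. */
  static Map<String, String> extractMetadata(String html) {
    Map<String, String> result = new LinkedHashMap<>();
    Matcher matcher = META.matcher(html);
    while (matcher.find()) {
      result.put(matcher.group(1), matcher.group(2));
    }
    return result;
  }

  /** Strips comments and tags, collapses whitespace, and appends the text to out. */
  static void appendPlainText(String html, StringBuilder out) {
    String withoutTags = TAG.matcher(removeComments(html)).replaceAll(" ");
    out.append(withoutTags.replaceAll("\\s+", " ").trim());
  }
}
```

A real HTML-to-text conversion would normally rely on an HTML parser rather than regular expressions; the sketch only shows how the three responsibilities can be kept separate.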
Inherited fields: _config, ENCODING_ATTACHMENT, ENCODING_CHARSET, PROP_INPUT_NAME, PROP_INPUT_TYPE, PROP_OUTPUT_NAME, PROP_OUTPUT_TYPE, PROP_OUTPUT_VALUE_TYPE

| Constructor and Description |
|---|
| HtmlToTextPipelet() |
| Modifier and Type | Method and Description |
|---|---|
| void | configure(AnyMap configuration): sets the configuration of the pipelet. Called once after instantiation, before the pipelet is actually used in a workflow. |
| protected java.lang.String | getDefaultEncoding(ParameterAccessor paramAccessor) |
| java.lang.String[] | process(Blackboard blackboard, java.lang.String[] recordIds): processes the given records. |
Inherited methods: getInputName, getInputStream, getInputType, getOutputName, getOutputType, getOutputValueType, isReadFromAttribute, isStoreInAttribute, readInput, readStringInput, storeResult, storeResult, storeResult, storeResults

protected java.lang.String getDefaultEncoding(ParameterAccessor paramAccessor) throws MissingParameterException

Throws: MissingParameterException

public void configure(AnyMap configuration) throws ProcessingException

Specified by: configure in interface Pipelet
Overrides: configure in class ATransformationPipelet
Parameters: configuration - configuration of pipelet.
Throws: ProcessingException - configuration is not applicable for pipelet (missing properties, wrong datatypes)

public java.lang.String[] process(Blackboard blackboard, java.lang.String[] recordIds) throws ProcessingException

Parameters: blackboard - Blackboard holding and managing the records. recordIds - Ids of records to process.
Throws: ProcessingException - error during processing.
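
The contract described above is that configure is called once after instantiation, rejects an inapplicable configuration, and only then is process called with record ids. The following self-contained sketch illustrates that lifecycle with stand-in types. Everything in it is an assumption made for illustration: `SketchPipelet`, `PrefixSketchPipelet`, and the `outputPrefix` property are hypothetical, plain maps replace AnyMap and Blackboard, unchecked exceptions replace ProcessingException, and returning the incoming ids is a choice of this sketch rather than documented behavior.

```java
import java.util.Map;

/** Hypothetical stand-in for a pipelet contract; not the SMILA Pipelet interface. */
interface SketchPipelet {
  void configure(Map<String, Object> configuration);

  String[] process(Map<String, String> records, String[] recordIds);
}

/** Toy pipelet that trims each record's text and puts a configured prefix in front. */
class PrefixSketchPipelet implements SketchPipelet {
  private Map<String, Object> config;

  @Override
  public void configure(Map<String, Object> configuration) {
    // Reject a configuration with missing properties or wrong datatypes up front.
    if (!(configuration.get("outputPrefix") instanceof String)) {
      throw new IllegalArgumentException("outputPrefix must be configured as a string");
    }
    this.config = configuration;
  }

  @Override
  public String[] process(Map<String, String> records, String[] recordIds) {
    if (config == null) {
      throw new IllegalStateException("configure must be called before process");
    }
    String prefix = (String) config.get("outputPrefix");
    for (String id : recordIds) {
      String input = records.get(id);
      if (input != null) {
        records.put(id, prefix + input.trim());
      }
    }
    // This sketch simply hands back the ids it was given.
    return recordIds;
  }
}
```

Validating in configure rather than in process means a bad configuration is reported once, before any record is touched.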