In this section, we will focus on ISB's text processing functionalities. The capabilities provided are similar to those of EXP EMS (Electronic Media Storage). EMS brings electronic copies of files into the workflow, creating new work or modifying existing work.

A basic use of EMS is to store and disseminate reports from backend systems to end users. A more complex use of EMS is to control tasks in the workflow based on success/failure reports of the backend systems. EMS functionality currently exists in PowerImage and EXP AG.
A control file defines how input files are read and how the extracted information is used in the workflow; in that sense it is similar to a script. Below is an extract:
INP1FDBP,0,-1,FDAY,FDAY,FDAY,FDAY,,,,,,COMPLETE,TEXT,T,TEXT,FDAY.PTS.INVESTOR.INP1FDBP.*,-1,-1
INP1FDBP,-1,DATE_MMDDYY,2,98
INP1FDBP,1,POWIMAGE,RPTCOFND,REPORT#,4,s,n,,2,130
INP1FDBP,2,POWIMAGE,RPTCOFND,COMPANY#,3,s,n,,1,4
INP1FDBP,3,POWIMAGE,RPTCOFND,FUND#,3,s,n,,2,4
INP1FDBP,-4,GROUP,RPTCOFND,y
EMS uses a control file to specify the processing details. These files are picked up by a converter that generates a Spring configuration file. This XML file contains the definition of the routes that will be used to process the data files.
convertems -convert -controlfile <Control File Path> -targetfile <Target File Path> -validate <Model File Path> [-verbose]
convertems -validate <Model File Path>
convertems -help
As you can see from the example command below, the control file citi_ems.fdl will be processed by the converter and the generated Camel routes will be stored in the file citi-camel-context.xml.
converter.bat -convert -controlfile citi_ems.fdl -targetfile citi-camel-context.xml
| Parameter Name | Description |
|---|---|
| controlfile | Control file location |
| targetfile | Generated file location |
| validate | Enables validation of the generated XML file against an XPDL model. |
| modelfile | Model file location |
If validation is enabled, the converter checks the data present in the model and maps it to the data provided by the routes.
The dataExtractor filter extracts a specific piece of data from a text block. It either addresses the start point of the extraction directly, via a (row, column) position in the text block, or looks for a pattern targetString and starts the extraction at (offsetRow, offsetColumn) relative to the first occurrence of targetString. In both cases, up to maxCharacters characters are added to the extracted string. The extracted string is added to a map under the key dataId, to be passed as data to the started process. dataType indicates the data type of the extracted metadata. Routes may use a chain of dataExtractor filters to extract multiple metadata values.
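As a minimal illustration of the two addressing modes described above, the sketch below implements fixed-position and pattern-based extraction over a list of text lines, using the sample input shown later in this section. The class and method names (ExtractorSketch, extractAt, extractAfter) are hypothetical stand-ins, not the actual EMS API; positions are assumed 1-based, matching the row/column properties.

```java
import java.util.List;

public class ExtractorSketch {
    // Fixed-position mode: start at (row, column), both 1-based,
    // and take up to maxCharacters characters from that line.
    static String extractAt(List<String> lines, int row, int column, int maxCharacters) {
        String line = lines.get(row - 1);
        int start = column - 1;
        if (start >= line.length()) {
            return "";                       // start point beyond the line: nothing extracted
        }
        int end = Math.min(start + maxCharacters, line.length());
        return line.substring(start, end);
    }

    // Pattern mode: locate the first occurrence of targetString, then
    // start extraction (offsetRow, offsetColumn) away from that position.
    static String extractAfter(List<String> lines, String targetString,
                               int offsetRow, int offsetColumn, int maxCharacters) {
        for (int r = 0; r < lines.size(); r++) {
            int c = lines.get(r).indexOf(targetString);
            if (c >= 0) {
                return extractAt(lines, r + 1 + offsetRow, c + 1 + offsetColumn, maxCharacters);
            }
        }
        return "";
    }
}
```

With the sample text used later (`1STSABC123XYZ12` / `2BC123XYZ1200STST66`), `extractAt(lines, 2, 11, 3)` returns "200", matching the QTY example.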
The Souk project generates Camel routes for text processing. The generated routes include:
<route>
<from uri="file://c:/data?filter=#fileFilter1"/>
<split streaming="true">
<method bean="linesplitterexample1" method="splitBody"/>
<!--split file and create a group for each records-->
<aggregate strategyRef="pageAssemblerexample1" aggregationRepositoryRef="pagesrepoexample1">
<correlationExpression>
<constant>true</constant>
</correlationExpression>
<completionPredicate>
<method ref="pageAssemblerexample1" method="isCompleted"/>
</completionPredicate>
<!--Adding Extractors...-->
<bean ref="dataextractorexample1"/>
<!--Adding Workflow directive-->
<to uri="ipp:authenticate:setCurrent?user=motu&amp;password=motu"/>
<to uri="ipp:process:start?processId=DataExtraction&amp;dataMap=${body}"/>
</aggregate>
</split>
</route>
The file filter uses Camel's AntPathMatcherGenericFileFilter to specify files to be included and/or excluded. Excludes take precedence over includes: if a file matches both an exclude and an include pattern, it is excluded.
<bean id="fileFilter1" class="org.apache.camel.component.file.AntPathMatcherGenericFileFilter">
    <property name="includes" value="M05_CMPCN_PBK*.TXT"/>
</bean>
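The exclude-over-include precedence can be illustrated with a small sketch. This is not Camel's implementation: the glob matching is simplified (only '*' is supported, no '**' or '?'), and the names FilterSketch and accept are hypothetical.

```java
import java.util.regex.Pattern;

public class FilterSketch {
    // Translate a simple Ant-style pattern ('*' = any run of characters)
    // into a regex by quoting the literal parts. Simplified on purpose.
    static boolean matches(String pattern, String name) {
        String regex = Pattern.quote(pattern).replace("*", "\\E.*\\Q");
        return Pattern.matches(regex, name);
    }

    // Excludes take precedence: a file matching both lists is excluded.
    static boolean accept(String name, String[] includes, String[] excludes) {
        for (String e : excludes) {
            if (matches(e, name)) return false;   // exclude wins
        }
        for (String i : includes) {
            if (matches(i, name)) return true;
        }
        return false;                             // matched nothing: not included
    }
}
```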
This class extracts data from a text by specifying the number of characters to retrieve, their location in the text (row, column), and a search type.
| Property | Description | Type |
|---|---|---|
| maxCharacters | Maximum number of characters to retrieve | int |
| dataId | Data identifier under which the extracted value is stored | String |
| column | The column position in the text | int |
| row | The row (line) position in the text | int |
| searchType | The search type | char |
| defaultValue | Default value used when the data is not found | String |
<bean id="extractQTY" class="com.infinity.integration.ems.extractor.DataExtractor">
    <property name="maxCharacters" value="3"/>
    <property name="dataId" value="QTY"/>
    <property name="row" value="2"/>
    <property name="column" value="11"/>
    <property name="searchType" value="S"/>
    <property name="defaultValue" value=""/>
</bean>
Input
1STSABC123XYZ12
2BC123XYZ1200STST66
Output
QTY=200
The data extraction strategy defines a list of data extraction details plus other properties (such as a reason code and department in the following example).
| Property | Description | Type |
|---|---|---|
| status | Strategy status (enum?) | String |
| extractors | Data extractor object list | List<com.infinity.integration.ems.extractor.DataExtractor> |
| reasonCode | Reason code | String |
| Department | Department name | String |
In the following example, the Instrument and Qty are extracted.
<bean id="dataextractorexample2" class="com.infinity.integration.ems.extractor.DataExtractionStrategy">
    <property name="status" value="PROCESS"/>
    <property name="extractors">
        <list>
            <ref bean="extractorInst"/>
            <ref bean="extractorQty"/>
        </list>
    </property>
    <property name="reasonCode" value="REASON"/>
    <property name="departement" value="CMPCN"/>
</bean>
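Conceptually, a strategy applies its extractor chain and collects each result in a map under the extractor's dataId, as described for dataExtractor filters earlier. The sketch below is hypothetical: the Extractor interface and apply method stand in for the actual com.infinity classes.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StrategySketch {
    // Stand-in for a configured data extractor: it knows its dataId
    // and how to pull a value out of the text lines.
    interface Extractor {
        String dataId();
        String extract(List<String> lines);
    }

    // Run every extractor in order and collect (dataId -> value) pairs,
    // producing the data map passed to the started process.
    static Map<String, String> apply(List<Extractor> extractors, List<String> lines) {
        Map<String, String> data = new LinkedHashMap<>();
        for (Extractor e : extractors) {
            data.put(e.dataId(), e.extract(lines));
        }
        return data;
    }
}
```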
Splits a text into lines using the line break character (LF, '\n') as separator.
<bean id="lineSplitter" class="com.infinity.integration.ems.splitter.LineSplitter"/>
import java.io.File;
import java.util.List;
import org.apache.log4j.Logger;

public class LineSplitter implements ISplitter
{
    /**
     * Logger for this class
     */
    private static final Logger logger = Logger.getLogger(LineSplitter.class);

    private static final char LINE_DELIMITER = '\n';

    public List<Line> splitBody(File file) throws ServiceException
    {
        logger.info("splitBody --> Splitting file using LINE DELIMITER");
        List<Line> response = FileUtil.retrieveContent(file, LINE_DELIMITER);
        if (logger.isDebugEnabled())
        {
            logger.debug("\t file " + file.getAbsolutePath() + " using LINE DELIMITER");
            for (Line line : response)
                logger.debug("\t" + line);
        }
        logger.info("splitBody <-- File split found <" + response.size() + "> lines...");
        return response;
    }
}
Splits a text into pages using a page break as separator.
<bean id="pageBreakSplitter" class="com.infinity.integration.ems.splitter.PageBreakSplitter"/>
The page assembler groups records into pages based on a configured number of lines (pageSize).
<bean id="pageAssemble1" class="com.infinity.integration.ems.converter.assembler.PageAssembler">
    <property name="pageSize" value="1"/>
</bean>
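Assuming pageSize is the number of lines per page, the assembler's grouping can be sketched as follows. PageAssemblerSketch is a hypothetical illustration, not the actual PageAssembler class.

```java
import java.util.ArrayList;
import java.util.List;

public class PageAssemblerSketch {
    // Group a list of lines into pages of at most pageSize lines each,
    // mirroring what the pageSize property above configures.
    static List<List<String>> assemble(List<String> lines, int pageSize) {
        List<List<String>> pages = new ArrayList<>();
        for (int i = 0; i < lines.size(); i += pageSize) {
            // Copy the slice so each page is independent of the source list.
            pages.add(new ArrayList<>(lines.subList(i, Math.min(i + pageSize, lines.size()))));
        }
        return pages;
    }
}
```

With pageSize=1 (as in the bean above), every line becomes its own page.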
The workflow directives include authentication details and the process name to start with its required input data.
<to uri="ipp:authenticate:setCurrent?user=motu&amp;password=motu"/>
<to uri="ipp:process:start?processId=DataExtraction&amp;dataMap=${body}"/>
| Package | Classes |
|---|---|
| com.infinity.integration.ems.converter.utils | BeansUtils.java CamelContextUtils.java FileUtil.java KeyValue.java Line.java PageUtil.java StringUtil.java |
| com.infinity.integration.ems.converter | AttachProcessDirective.java Constant.java ConversionDirective.java Converter.java DataExtractionDirective.java DateDataExtractionDirective.java GroupingRecordDirective.java Indent.java PagesFilterDirective.java ProcessingDirective.java SplitPagesDirective.java StartProcessDirective.java StringDataExtractionDirective.java TimeDataExtractionDirective.java WorkflowDirective.java |
| com.infinity.integration.ems.converter.xml | ConversionDirectiveXmlGenerator.java ConversionDirectivesXmlGenerator.java FromBlockGenerator.java PagesFilterDirectiveXmlGenerator.java SpringBeanGenerator.java |