Text Processing In Camel

In this section, we focus on two text-processing components provided by Camel:

  • camel-csv component,
  • camel-bindy component.

    Apache Camel's CSV data format

    The camel-csv component transforms data to and from the CSV format using the Apache Commons CSV library.

    Camel uses two operations when transforming to and from CSV, FIX, and fixed-length formats:

  • Marshal: transforms a Java object into a well-known data format such as XML or CSV.
  • Unmarshal: performs the reverse operation, turning data in a well-known format back into a Java object.

    Message transformation
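
    The two operations can be pictured with a small plain-Java sketch. It does not use Camel or Commons CSV; the class name CsvRoundTripSketch and its methods are hypothetical, and the naive comma split ignores quoting and escaping, which a real CSV library handles.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CsvRoundTripSketch {

    // Unmarshal: CSV text -> Java objects (one List<String> per non-empty line).
    public static List<List<String>> unmarshal(String csv) {
        List<List<String>> rows = new ArrayList<>();
        for (String line : csv.split("\\R")) {
            if (!line.isEmpty()) {
                // -1 keeps trailing empty fields such as "a,b,"
                rows.add(Arrays.asList(line.split(",", -1)));
            }
        }
        return rows;
    }

    // Marshal: Java objects -> CSV text, one line per row.
    public static String marshal(List<List<String>> rows) {
        StringBuilder out = new StringBuilder();
        for (List<String> row : rows) {
            out.append(String.join(",", row)).append("\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample data, for illustration only.
        List<List<String>> rows = unmarshal("562,BE125,IBM\n563,BE126,SAP\n");
        System.out.println(rows);
        System.out.print(marshal(rows));
    }
}
```

    Unmarshalling followed by marshalling returns the original text here, which is the round-trip property the two operations are meant to have.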

    Example:

    The following Java DSL unmarshals data from CSV format:

    from("file://data?noop=true").unmarshal().csv().split(body()).to("direct:update");

    The same example using the Spring DSL:

    <camelContext id="camel" xmlns="http://camel.apache.org/schema/spring">
      <route>
        <from uri="file://rider/csvfiles"/>
        <unmarshal><csv/></unmarshal>
        <split>
          <simple>body</simple>
          <to uri="direct:update"/>
        </split>
      </route>
    </camelContext>

    Apache camel-bindy

    The camel-bindy component binds various data formats (CSV, FIX, fixed length) to existing model objects using annotations.

    You must first define your model in a package; you can then easily transform messages between each supported format and the Java model objects.

    In this document we focus on the unmarshal transformation.

    CSV Record

    The @CsvRecord annotation marks the root class of the model, which represents a record in CSV format.

    A record represents a line of a CSV file and can be linked to several child model classes.

    The annotation takes a separator parameter that specifies how the fields of the CSV record are delimited.

    Example:

    Suppose you have a model object that represents a purchase order.

    package com.sungard.isb.bindy.csv.model;
    
    import java.math.BigDecimal;
    import java.util.Date;
    
    import org.apache.camel.dataformat.bindy.annotation.CsvRecord;
    import org.apache.camel.dataformat.bindy.annotation.DataField;
    
    @CsvRecord(separator = ",")
    public class PurchaseOrder {
    
        @DataField(pos = 1)
        private int orderNr;
    
        @DataField(pos = 2)
        private String isinCode;
    
        @DataField(name = "Name", pos = 3)
        private String instrumentName;
    
        @DataField(pos = 4, precision = 2)
        private BigDecimal amount;
    
        @DataField(pos = 5)
        private String currency;
    
        // pattern used during parsing or when the date is created
        @DataField(pos = 6, pattern = "dd-MM-yyyy")
        private Date orderDate;
        // ...
    }
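
    What the annotations above describe can be sketched in plain Java: the separator splits the line, pos selects a field by its 1-based position, and pattern controls how the date text is parsed. This is only an illustration under those assumptions, not Bindy's implementation; the class CsvFieldMappingSketch, its methods, and the sample line are hypothetical.

```java
import java.math.BigDecimal;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.regex.Pattern;

public class CsvFieldMappingSketch {

    // Returns the raw text of the field at the given 1-based position,
    // mirroring the pos attribute of @DataField.
    public static String fieldAt(String line, String separator, int pos) {
        String[] fields = line.split(Pattern.quote(separator), -1);
        return fields[pos - 1];
    }

    // Parses a date field with the given pattern, for example "dd-MM-yyyy".
    public static Date parseDate(String value, String pattern) {
        SimpleDateFormat format = new SimpleDateFormat(pattern);
        format.setLenient(false); // reject impossible dates such as 32-01-2009
        try {
            return format.parse(value);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse date: " + value, e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample line; the values are illustrative only.
        String line = "562,BE125,IBM,150.00,EURO,14-01-2009";
        int orderNr = Integer.parseInt(fieldAt(line, ",", 1));
        String isinCode = fieldAt(line, ",", 2);
        BigDecimal amount = new BigDecimal(fieldAt(line, ",", 4));
        Date orderDate = parseDate(fieldAt(line, ",", 6), "dd-MM-yyyy");
        System.out.println(orderNr + " " + isinCode + " " + amount + " " + orderDate);
    }
}
```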

    We now explain how to use Bindy in Camel routes. Suppose you need to consume CSV files from the data directory, split out each row, and send it to the direct:update endpoint URI.

    from("file://data?noop=true").split(body().tokenize("\n"))
        .log("Line: ${body}")
        .unmarshal()
        .bindy(BindyType.Csv, "com.sungard.isb.bindy.csv.model")
        .to("direct:update");

    To let Bindy know how to map a CSV line to an order model object, you provide a package name that is scanned for classes annotated with Bindy annotations. Our model is defined in the com.sungard.isb.bindy.csv.model package.

    The same example in the Spring DSL is shown below:

    <?xml version="1.0" encoding="UTF-8"?>
    <beans xmlns="http://www.springframework.org/schema/beans"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="
    http://www.springframework.org/schema/beans
    http://www.springframework.org/schema/beans/spring-beans.xsd
    http://camel.apache.org/schema/spring
    http://camel.apache.org/schema/spring/camel-spring.xsd">
      <bean id="csvBindyDataformat" class="org.apache.camel.dataformat.bindy.csv.BindyCsvDataFormat">
        <constructor-arg value="com.sungard.isb.bindy.csv.model"/>
    
      </bean>
      <camelContext xmlns="http://camel.apache.org/schema/spring">
        <jmxAgent id="agent" disabled="false"/>
        <route>
          <from uri="file://data?noop=true"/>
          <unmarshal ref="csvBindyDataformat"/>
          <to uri="activemq:queue:in"/>
        </route>
      </camelContext>
    </beans>
    

    Bindy uses a Map to represent each CSV row; the collection of rows is carried in the body of the Camel Exchange. The processor given below passes that body to a logPurchaseOrder method.

    process(new Processor() {
        public void process(Exchange exchange) throws Exception {
            // The body is a list with one Map per CSV row
            logPurchaseOrder((List<Map<String, Object>>) exchange.getIn().getBody());
        }
    })

    The collection is a list of Maps. Each Map in the list contains the model's objects.

    public static void logPurchaseOrder(List<Map<String, Object>> orders) throws Exception {
        Logger logger = Logger.getLogger(BindyCSVGenerateObjectUnmarshall.class);
        // Each Map holds the model objects of one row, keyed by class name
        for (Map<String, Object> order : orders) {
            for (String key : order.keySet()) {
                logger.info("toString:key<" + key + ">, " + order.get(key));
            }
        }
    }
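
    The shape of that body can be illustrated without Camel. The sketch below assumes, as the unit test later in this section does, that each Map is keyed by the fully qualified class name of the model object. The class BindyBodySketch and its nested Order class are hypothetical stand-ins, not part of Bindy.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BindyBodySketch {

    // Hypothetical stand-in for the PurchaseOrder model; only one field is shown.
    public static class Order {
        final String isinCode;

        Order(String isinCode) {
            this.isinCode = isinCode;
        }
    }

    // Builds a body shaped like the one described above: one Map per CSV row,
    // keyed by the fully qualified class name of the model object.
    public static List<Map<String, Object>> bodyFor(String... isinCodes) {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (String code : isinCodes) {
            Map<String, Object> row = new HashMap<>();
            row.put(Order.class.getName(), new Order(code));
            rows.add(row);
        }
        return rows;
    }

    // Pulls the model objects back out of the list of maps.
    public static List<String> isinCodes(List<Map<String, Object>> rows) {
        List<String> codes = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            Order order = (Order) row.get(Order.class.getName());
            codes.add(order.isinCode);
        }
        return codes;
    }

    public static void main(String[] args) {
        System.out.println(isinCodes(bodyFor("BE125", "BE126")));
    }
}
```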

    Unit Test

    The example below shows how to test our example with camel-bindy:

    @ContextConfiguration (locations ="com.sungard.isb.bindy.csv.test.BindyConfig$ContextConfig")
    
    public class BindyCSVGenerateObjectUnmarshallDsl extends
    		AbstractJUnit4SpringContextTests {
    
    	@Produce(uri = "direct:start")
    	private ProducerTemplate template;
    
    	@EndpointInject(uri = "mock:result")
    	private MockEndpoint resultEndpoint;
    
    	@Test
    	public void testCsvMarshal() throws Exception {
    
    		// Set the expectation before sending the message
    		resultEndpoint.expectedMessageCount(1);
    		template.sendBody("direct:start", createBody());
    		resultEndpoint.assertIsSatisfied();
    
    		@SuppressWarnings("unchecked")
    		List<Map<String, Object>> rows = resultEndpoint.getReceivedExchanges().get(0).getIn().getBody(List.class);
    		PurchaseOrder order = (PurchaseOrder) rows.get(0).get(PurchaseOrder.class.getName());
    		Assert.assertEquals("BE125", order.getIsinCode());
    		Assert.assertEquals("IBM", order.getInstrumentName());
    		Assert.assertEquals("EURO", order.getCurrency());
    	}
    
    	private String createBody() {
    		String csvContent = "562,BE125,IBM,150,EURO";
    		return csvContent;
    	}
    
    	public static class ContextConfig extends RouteBuilder {
    
    		public void configure() {
    			BindyCsvDataFormat camelDataFormat = new BindyCsvDataFormat(
    					"com.sungard.isb.bindy.csv.model");
    			from("direct:start").unmarshal(camelDataFormat).to("mock:result");
    		}
    	}
    }

    The locations property of the ContextConfiguration annotation refers to the configuration file given below:

    <camelContext xmlns="http://camel.apache.org/schema/spring">
      <route>
        <from uri="direct:start"/>
        <unmarshal ref="csvBindyDataformat"/>
        <to uri="mock:result"/>
      </route>
    </camelContext>
    <bean id="csvBindyDataformat" class="org.apache.camel.dataformat.bindy.csv.BindyCsvDataFormat">
      <constructor-arg  value="com.sungard.isb.bindy.csv.model"/>
    </bean>
    

    Fixed Length Record

    Bindy is equally capable of working with the fixed-length data format. The @FixedLengthRecord annotation is used for files or messages whose data is formatted as fixed-length fields.

    Example

    We again use a purchase order model, as in the previous section, to parse/format a fixed-length message:

    package com.sungard.isb.bindy.fixedlength.model;
    
    import java.math.BigDecimal;
    import java.util.Date;
    import org.apache.camel.dataformat.bindy.annotation.DataField;
    import org.apache.camel.dataformat.bindy.annotation.FixedLengthRecord;
    
    @FixedLengthRecord(length=54, paddingChar=' ')
    public class PurchaseOrder {
    
        @DataField(pos = 1, length=2)
        private int orderNr;
    
        @DataField(pos = 3, length=2)
        private String clientNr;
    
        @DataField(pos = 5, length=7)
        private String firstName;
    
        @DataField(pos = 12, length=1, align="L")
        private String lastName;
    
        @DataField(pos = 13, length=4)
        private String instrumentCode;
    
        @DataField(pos = 17, length=10)
        private String instrumentNumber;
    
        @DataField(pos = 27, length=3)
        private String orderType;
    
        @DataField(pos = 30, length=5)
        private String instrumentType;
    
        @DataField(pos = 35, precision = 2, length=7)
        private BigDecimal amount;
    
        @DataField(pos = 42, length=3)
        private String currency;
    
        @DataField(pos = 45, length=10, pattern = "dd-MM-yyyy")
        private Date orderDate;
        // ...
    }
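
    The pos and length attributes above can be read as a 1-based start position and a character count. The following plain-Java sketch shows that slicing on a 54-character record; it is not Bindy's implementation, and the class FixedLengthSketch, its method, and the sample values are hypothetical. How real padding and alignment are handled depends on the annotation settings.

```java
public class FixedLengthSketch {

    // Extracts the field that starts at 1-based position pos and spans
    // length characters, then strips the padding character from both ends.
    public static String field(String record, int pos, int length, char paddingChar) {
        String raw = record.substring(pos - 1, pos - 1 + length);
        int start = 0;
        int end = raw.length();
        while (start < end && raw.charAt(start) == paddingChar) {
            start++;
        }
        while (end > start && raw.charAt(end - 1) == paddingChar) {
            end--;
        }
        return raw.substring(start, end);
    }

    public static void main(String[] args) {
        // Hypothetical record laid out according to the model above:
        // 2 + 2 + 7 + 1 + 4 + 10 + 3 + 5 + 7 + 3 + 10 = 54 characters.
        String record = "10" + "A9" + "Pauline" + "M" + "ISIN" + "XD12345678"
                + "BUY" + "Share" + "2500.45" + "USD" + "01-08-2009";
        System.out.println(record.length());
        System.out.println(field(record, 5, 7, ' '));   // firstName
        System.out.println(field(record, 35, 7, ' '));  // amount
        System.out.println(field(record, 45, 10, ' ')); // orderDate
    }
}
```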

    The Java DSL Camel route:

    BindyFixedLengthDataFormat camelDataFormat =
        new BindyFixedLengthDataFormat("com.sungard.isb.bindy.fixedlength.model");

    // Split the file into lines once, then unmarshal each line
    from("file://data?noop=true")
        .split(body(String.class).tokenize("\n"))
        .log("*** Before Transformer: ${body}")
        .unmarshal(camelDataFormat)
        .to("direct:update")
        .end();

    FIX messages

    To bind the FIX data format, the camel-bindy component mainly uses the following two annotations:

    1. Message: The Message annotation identifies the class of your model that contains key-value pair fields.
      This kind of format is used mainly in Financial Information eXchange (FIX) protocol messages.
      The key-value pairs are separated from each other by a separator, which can be a special character such as a tab delimiter (Unicode representation: \u0009).
    2. KeyValuePairField: The KeyValuePairField annotation defines the property of a key-value pair field.
      Each KeyValuePairField is identified by a tag (= key) and its associated value.
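
    The key-value structure described above can be sketched in plain Java: the message is cut at the pair separator, and each pair is cut at the key-value separator into a numeric tag and its value. This is a minimal illustration, not Bindy's parser; the class FixPairsSketch, its method, and the sample values are hypothetical, and \u0001 is used here as the pair separator.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class FixPairsSketch {

    // Splits a message into tag -> value entries, keeping the original order.
    public static Map<Integer, String> parse(String message,
                                             String pairSeparator,
                                             String keyValuePairSeparator) {
        Map<Integer, String> fields = new LinkedHashMap<>();
        for (String pair : message.split(Pattern.quote(pairSeparator))) {
            if (pair.isEmpty()) {
                continue; // tolerate a trailing separator
            }
            int idx = pair.indexOf(keyValuePairSeparator);
            if (idx < 0) {
                throw new IllegalArgumentException("Not a key-value pair: " + pair);
            }
            Integer tag = Integer.valueOf(pair.substring(0, idx));
            String value = pair.substring(idx + keyValuePairSeparator.length());
            fields.put(tag, value);
        }
        return fields;
    }

    public static void main(String[] args) {
        String soh = "\u0001"; // pair separator assumed for this sketch
        // Hypothetical message; tags and values are illustrative only.
        String message = "1=ACC01" + soh + "11=ORD42" + soh + "54=1" + soh + "58=free text";
        Map<Integer, String> fields = parse(message, soh, "=");
        System.out.println(fields.get(11));
        System.out.println(fields.get(58));
    }
}
```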

    Example:

    We again use a purchase order model, as in the CSV section, to parse/format a FIX message:

    package com.sungard.isb.bindy.fix.model;
    
    import org.apache.camel.dataformat.bindy.annotation.KeyValuePairField;
    import org.apache.camel.dataformat.bindy.annotation.Message;
    
    @Message(keyValuePairSeparator = "=", pairSeparator = "\u0001", type="FIX", version="4.1")
    public class PurchaseOrder {
    
        @KeyValuePairField(tag = 1) // Client reference
        private String Account;
    
        @KeyValuePairField(tag = 11) // Order reference
        private String ClOrdId;
        
        @KeyValuePairField(tag = 22) // Fund ID type (Sedol, ISIN, ...)
        private String IDSource;
        
        @KeyValuePairField(tag = 48) // Fund code
        private String SecurityId;
        
        @KeyValuePairField(tag = 54) // Movement type ( 1 = Buy, 2 = sell)
        private String Side;
        
        @KeyValuePairField(tag = 58) // Free text
        private String Text;
    //...
    }

    The Java DSL Camel route:

    BindyKeyValuePairDataFormat fixBindyDataFormat =
        new BindyKeyValuePairDataFormat("com.sungard.isb.bindy.fix.model");

    // Split the file into lines once, then unmarshal each line
    from("file://data?noop=true")
        .split(body(String.class).tokenize("\n"))
        .log("*** Before Transformer: ${body}")
        .unmarshal(fixBindyDataFormat)
        .to("direct:update");