I'd like to develop a route that polls a directory containing CSV files and, for every file, unmarshals each row using Bindy and queues it in ActiveMQ.
The problem is that the files can be pretty large (a million rows), so I'd prefer to queue one row at a time. Instead, Bindy gives me all the rows in a single java.util.ArrayList at the end, which causes memory problems.
So far I have a small test and unmarshalling works, so the annotation-based Bindy configuration is OK.
Here is the route:
from("file://data/inbox?noop=true&maxMessagesPerPoll=1&delay=5000")
.unmarshal()
.bindy(BindyType.Csv, "com.ess.myapp.core")
.to("jms:rawTraffic");
Environment is: Eclipse Indigo, Maven 3.0.3, Camel 2.8.0
Thank you
If you use the Splitter EIP in streaming mode, Camel processes the file row by row instead of loading the whole content into memory first. Each row is then unmarshalled by Bindy and sent to the queue on its own:
from("file://data/inbox?noop=true&maxMessagesPerPoll=1&delay=5000")
.split(body().tokenize("\n")).streaming()
.unmarshal().bindy(BindyType.Csv, "com.ess.myapp.core")
.to("jms:rawTraffic");
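
To make the idea behind streaming concrete, here is a minimal plain-Java sketch (no Camel involved) of row-by-row processing: each line is handed to a handler as soon as it is read, so only the current row is held in memory. The class name `StreamingRows` and the method `forEachRow` are made up for this illustration, and this is not how Camel implements the splitter internally.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

// Hypothetical helper, for illustration only: mimics the effect of a streaming split.
public class StreamingRows {

    /**
     * Reads the source one line at a time and passes each non-empty line to the handler.
     * No list of rows is ever built. Returns the number of rows handled.
     */
    public static long forEachRow(Reader source, Consumer<String> handler) {
        long count = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines, e.g. a trailing newline
                }
                handler.accept(line); // in the route this would be "unmarshal and send to JMS"
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        String csv = "1,alice\n2,bob\n3,carol\n";
        long rows = forEachRow(new StringReader(csv), row -> System.out.println("queue: " + row));
        System.out.println(rows + " rows processed");
    }
}
```

The non-streaming variant would collect every line into a list before handing anything on, which is the memory behaviour described in the question.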
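For the record, and for other users who may have searched for this as long as I did: newer Camel versions seem to offer an easier method, which also works well with useMaps. With lazyLoad enabled, the CSV data format reads rows on the fly instead of all at once, and the streaming splitter then consumes them one by one: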
CsvDataFormat csv = new CsvDataFormat()
.setLazyLoad(true)
.setUseMaps(true);
from("file://data/inbox?noop=true&maxMessagesPerPoll=1&delay=5000")
.unmarshal(csv)
.split(body()).streaming()
.to("log:mappedRow?multiline=true");
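
As a rough picture of what lazy loading combined with maps gives you, here is a plain-Java sketch of an iterator that reads one CSV row per `next()` call and returns it as a header-to-value map. The class `LazyCsvRows` is hypothetical and written only for this illustration. It assumes the first line is the header and splits naively on commas (no quoting support), so it is not a substitute for the Camel data format.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical illustration: rows are parsed only when the consumer asks for them.
public class LazyCsvRows implements Iterator<Map<String, String>> {
    private final BufferedReader reader;
    private final String[] header;
    private String nextLine; // the row that the next call to next() will return

    public LazyCsvRows(Reader source) {
        this.reader = new BufferedReader(source);
        String first = readLine();
        // Assumption: the first line holds the column names.
        this.header = (first == null) ? new String[0] : first.split(",", -1);
        this.nextLine = readLine();
    }

    private String readLine() {
        try {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public boolean hasNext() {
        return nextLine != null;
    }

    @Override
    public Map<String, String> next() {
        if (nextLine == null) {
            throw new NoSuchElementException();
        }
        String[] cells = nextLine.split(",", -1); // naive split, no quoted fields
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < header.length && i < cells.length; i++) {
            row.put(header[i], cells[i]);
        }
        nextLine = readLine(); // read ahead by exactly one row
        return row;
    }

    public static void main(String[] args) {
        Iterator<Map<String, String>> rows =
                new LazyCsvRows(new StringReader("id,name\n1,alice\n2,bob\n"));
        while (rows.hasNext()) {
            System.out.println(rows.next());
        }
    }
}
```

At any moment only one parsed row and one read-ahead line are in memory, which is why pairing a lazy source with a streaming consumer keeps memory use flat regardless of file size.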