
Apache Camel ftp consumer loads the same files again and again

I have the following Spring configuration:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="
       http://www.springframework.org/schema/beans 
       http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
       http://camel.apache.org/schema/spring 
       http://camel.apache.org/schema/spring/camel-spring.xsd">

    <bean id="downloadLogger" class="com.thomsonreuters.oa.sdi.camel.DownloadLogger" />

    <bean id="fileFilter" class="com.thomsonreuters.oa.sdi.camel.IgnoreReadyFilesFilter" />

    <camelContext xmlns="http://camel.apache.org/schema/spring">
        <route>
            <from uri="ftp://url_to_ftp?password=*******&amp;noop=true&amp;stepwise=false&amp;binary=true&amp;consumer.delay=10s&amp;recursive=true&amp;filter=#fileFilter" />
            <process ref="downloadLogger" />
            <to uri="file:data/outbox" />
        </route>
    </camelContext>

</beans>

On the FTP side I have 3 folders containing files that I want to download. I want to achieve the following scenario:

  1. The FTP server holds a fixed number of files (for instance 5). On the first data pull, the consumer loads these files into the destination folder.
  2. On the second pull the FTP state is unchanged (still 5 files), so the Camel FTP consumer does nothing except check for new files.
  3. Two new files arrive on the FTP server, and on the next pull the consumer downloads only these two new files.
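
To make the desired behaviour concrete, here is a minimal plain-Java sketch of the three steps above. It is not Camel code: the class `SeenFilesPoller`, its `poll` method and the file names are hypothetical, and the "FTP listing" is just a list of strings. It only shows the bookkeeping an idempotent store has to do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; this class is not part of Camel.
class SeenFilesPoller {
    // Names handed out by earlier polls (kept in memory only in this sketch).
    private final Set<String> seen = new HashSet<>();

    // Returns the names from the listing that no earlier poll has returned.
    public List<String> poll(List<String> remoteListing) {
        List<String> fresh = new ArrayList<>();
        for (String name : remoteListing) {
            // Set.add returns false when the name is already known.
            if (seen.add(name)) {
                fresh.add(name);
            }
        }
        return fresh;
    }

    public static void main(String[] args) {
        SeenFilesPoller poller = new SeenFilesPoller();
        List<String> five = Arrays.asList("f1.csv", "f2.csv", "f3.csv", "f4.csv", "f5.csv");
        List<String> seven = Arrays.asList("f1.csv", "f2.csv", "f3.csv", "f4.csv", "f5.csv",
                "f6.csv", "f7.csv");

        System.out.println(poller.poll(five));   // step 1: all five files are new
        System.out.println(poller.poll(five));   // step 2: nothing new
        System.out.println(poller.poll(seven));  // step 3: only the two new files
    }
}
```

Because the set lives in memory, this sketch forgets everything when the process stops, which is why the state has to be persisted somewhere if it should survive a restart.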

At the moment my solution downloads all files every time I run the data load process. How can I keep track of the files that have already been downloaded, so that files already copied from the FTP server are not downloaded again? I could write my own filter that filters out already downloaded files, but I believe there should be a built-in feature that gives me control over this (maybe idempotentRepository, but I am not sure).

endryha asked Apr 18 '11 20:04


3 Answers

You need to use a persistent idempotent repository if you want Camel to remember, between restarts, which files it has already downloaded.

You need to set this option on the ftp endpoint: idempotentRepository

See more details here: http://camel.apache.org/file2 (Note: The FTP component inherits the options from the file component.)

There are some examples on the wiki page showing how to use different stores. You can also build your own custom store.
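
As a rough idea of what a custom store involves, the sketch below keeps one key per line in a text file and reloads it on start-up. It is plain Java and deliberately does not implement Camel's `IdempotentRepository` interface, which declares further methods that are omitted here. The class name `FileBackedNameStore` is hypothetical, and keys are assumed not to contain line breaks.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch of a persistent store; not a Camel class.
class FileBackedNameStore {
    private final Path storeFile;
    private final Set<String> keys = new LinkedHashSet<>();

    // Loads previously stored keys, if the file already exists.
    public FileBackedNameStore(Path storeFile) throws IOException {
        this.storeFile = storeFile;
        if (Files.exists(storeFile)) {
            keys.addAll(Files.readAllLines(storeFile, StandardCharsets.UTF_8));
        }
    }

    // Returns true if the key was new; a new key is appended to the file at once.
    public synchronized boolean add(String key) throws IOException {
        if (!keys.add(key)) {
            return false;
        }
        Files.write(storeFile, Collections.singletonList(key), StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        return true;
    }

    // Returns true if the key was added now or in an earlier run.
    public synchronized boolean contains(String key) {
        return keys.contains(key);
    }
}
```

A second instance created on the same file sees the keys written by the first, which is the property that matters across restarts. A real store would also need to handle removal and concurrent access from several processes.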

Claus Ibsen answered Nov 19 '22 19:11


In the end I went with the following solution:

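import java.io.File;

import org.apache.camel.builder.RouteBuilder;
// Package as in Camel 2.x
import org.apache.camel.processor.idempotent.FileIdempotentRepository;

public class SdiRouteBuilder extends RouteBuilder {
    @Override
    public void configure() throws Exception {
        from("ftp://login@url_to_ftp/RootFolder?" +
                "password=******&noop=true&stepwise=false&binary=true&consumer.delay=10s&recursive=true&filter=#fileFilter")
                // Skip exchanges whose file name is already recorded in data/repo.dat
                .idempotentConsumer(header("CamelFileName"), FileIdempotentRepository.fileIdempotentRepository(new File("data", "repo.dat")))
                .process(new DownloadLogger())
                .to("file:data/outbox");
    }
}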

endryha answered Nov 19 '22 17:11


@endryha's answer may have worked well in 2011, but it does not with Camel 2.20.1.

In Camel 2.20.1, that code creates two idempotent repositories:

  1. the FTP endpoint's default in-memory idempotentRepository
  2. the idempotentConsumer's custom idempotentRepository (file-based in this case)

So the correct way to use idempotentRepository is the following (I removed most parameters for readability):

"ftp://login@url_to_ftp/RootFolder?idempotent=true&idempotentRepository=#myIdempotentRepo"

and a bean:

@Bean
public IdempotentRepository<String> myIdempotentRepo() {
    // File-based store, so the list of processed files survives restarts
    return FileIdempotentRepository.fileIdempotentRepository(new File("data", "repo.dat"));
}

user1686407 answered Nov 19 '22 17:11