
Running Spring batch on multiple databases in parallel

I've created a Spring Batch application using Spring Boot, and I have a Job with 9 steps. These steps use a DataSource whose bean I defined in a configuration class as follows:

@Configuration
public class DatabaseConfig {
    @ConfigurationProperties(prefix = "spring.datasource")
    @Bean
    @Primary
    public DataSource dataSource(){
        return DataSourceBuilder.create().build();
    }
}

This DataSource uses properties declared in the application.yml file:

spring:
  datasource:
    url: jdbc:mysql://localhost:3306/db_01?zeroDateTimeBehavior=convertToNull
    username: xxxx
    password: ****

So far, all works as expected.

What I want to do now: I have 4 databases whose connection settings are stored in a 5th database (db_settings), which I read using an SQL query. This query returns the 4 databases with their usernames and passwords as follows:

+--------+-----------------------------------+-----------------+-----------------+
| id     | url                               | username_db     | password_db     |
+--------+-----------------------------------+-----------------+-----------------+
|    243 | jdbc:mysql://localhost:3306/db_01 | xxxx            | ****            |
|    244 | jdbc:mysql://localhost:3306/db_02 | xxxx            | ****            |
|    245 | jdbc:mysql://localhost:3306/db_03 | xxxx            | ****            |
|    247 | jdbc:mysql://localhost:3306/db_04 | xxxx            | ****            |
+--------+-----------------------------------+-----------------+-----------------+

So instead of running the steps against the database declared in 'application.yml', I want to run them on all 4 databases. Given the volume of data processed, the batch processing must be able to run on these databases in parallel.

Does anyone know how to implement this?

Renaud is Not Bill Gates asked Jun 08 '17 11:06




1 Answer

Where is the bounty? :-)


Thanks KeatsPeeks. AbstractRoutingDataSource is a good starting point for the solution, and here is a good tutorial on this part.

The main parts are:

  1. Define the lookup code

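import org.springframework.context.i18n.LocaleContextHolder;
import org.springframework.jdbc.datasource.lookup.AbstractRoutingDataSource;

public class MyRoutingDataSource extends AbstractRoutingDataSource {
    @Override
    protected Object determineCurrentLookupKey() {
        String language = LocaleContextHolder.getLocale().getLanguage();
        System.out.println("Language obtained: " + language);
        return language;
    }
}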

  2. Register the multiple data sources

    <bean id="abstractDataSource" class="org.apache.commons.dbcp.BasicDataSource"
        destroy-method="close"
        p:driverClassName="${jdbc.driverClassName}"
        p:username="${jdbc.username}"
        p:password="${jdbc.password}" />
    
    <bean id="concreteDataSourceOne"
        parent="abstractDataSource"
        p:url="${jdbc.databaseurlOne}"/>
    
     <bean id="concreteDataSourceTwo"
        parent="abstractDataSource"
        p:url="${jdbc.databaseurlTwo}"/>
    

After that, the remaining problems are:

  1. How to load the data source configuration properties when Spring starts up, and how to configure the corresponding dataSource from the properties stored in the database.

  2. How to use multiple dataSources in Spring Batch

    When I googled it, this turned out to be a very common case: Google suggests the search words "spring batch multiple data sources", and there are a lot of articles, so I chose the answer in

  3. How to define the lookup code based on the Spring Batch jobs (steps)

    Typically this is a business decision. You need to define the lookup strategy, which can be injected into com.example.demo.datasource.CustomRoutingDataSource#determineCurrentLookupKey to route to the dedicated data source.
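
For item 3, one common way to supply the lookup key is a thread-bound holder: whatever launches the job for a given database sets the key on its own thread, and determineCurrentLookupKey simply returns it. The sketch below uses only the JDK. DbContextHolder, ParallelRoutingSketch and runOnAll are hypothetical names, the keys are the ids from the question's table, and the task body is a placeholder where the real application would launch the job. It is an illustration under those assumptions, not the code from the repository mentioned in this answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical holder for the lookup key of the database the current thread works on. */
final class DbContextHolder {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    private DbContextHolder() {}

    static void set(String key) { CURRENT.set(key); }
    static String get() { return CURRENT.get(); }
    static void clear() { CURRENT.remove(); }
}

class ParallelRoutingSketch {

    // In the real application this would be the body of
    // CustomRoutingDataSource#determineCurrentLookupKey.
    static Object determineCurrentLookupKey() {
        return DbContextHolder.get();
    }

    /** Runs one task per database key in parallel and records the key each task saw. */
    static Map<String, Object> runOnAll(List<String> keys) {
        ExecutorService pool = Executors.newFixedThreadPool(keys.size());
        Map<String, Object> seen = new ConcurrentHashMap<>();
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (String key : keys) {
                tasks.add(() -> {
                    DbContextHolder.set(key);      // bind this thread to one database
                    try {
                        // Placeholder: the job launch for this database would go here.
                        seen.put(key, determineCurrentLookupKey());
                    } finally {
                        DbContextHolder.clear();   // pooled threads are reused
                    }
                    return null;
                });
            }
            for (Future<Void> f : pool.invokeAll(tasks)) {
                f.get();                           // surfaces any failure from a task
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("parallel run failed", e);
        } finally {
            pool.shutdown();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(runOnAll(List.of("243", "244", "245", "247")));
    }
}
```

Because the key lives in a ThreadLocal, it is not visible on threads that a step starts itself (for example through a task executor), so in that case it would have to be propagated explicitly.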

Limitation

Interestingly, this approach does support multiple dataSources, but the DB settings cannot actually be stored in the DB. The reason is that doing so leads to a circular dependency:

The dependencies of some of the beans in the application context form a cycle:

   batchConfiguration (field private org.springframework.batch.core.configuration.annotation.JobBuilderFactory com.example.demo.batch.BatchConfiguration.jobs)
      ↓
   org.springframework.batch.core.configuration.annotation.SimpleBatchConfiguration (field private java.util.Collection org.springframework.batch.core.configuration.annotation.AbstractBatchConfiguration.dataSources)
┌─────┐
|  routingDataSource defined in class path resource [com/example/demo/datasource/DataSourceConfiguration.class]
↑     ↓
|  targetDataSources defined in class path resource [com/example/demo/datasource/DataSourceConfiguration.class]
↑     ↓
|  myBatchConfigurer (field private java.util.Collection org.springframework.batch.core.configuration.annotation.AbstractBatchConfiguration.dataSources)
└─────┘

So the obvious solution is to break the dependency between dataSource and routingDataSource:

  • Save the DB settings in properties
  • Or load them some other way, but not through the primary dataSource
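
A minimal sketch of the second option, assuming the settings are read over a plain JDBC connection opened outside the Spring context, so that no bean has to depend on the primary dataSource. The class name DbSettingsLoader, the table name db_config and the choice of the id column as lookup key are assumptions; only the column names come from the question.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DbSettingsLoader {

    /** One row of the settings table shown in the question. */
    static final class DbSettings {
        final long id;
        final String url;
        final String username;
        final String password;

        DbSettings(long id, String url, String username, String password) {
            this.id = id;
            this.url = url;
            this.username = username;
            this.password = password;
        }
    }

    // Assumption: the table is called db_config; the question only shows its columns.
    static final String QUERY =
            "SELECT id, url, username_db, password_db FROM db_config";

    /** Reads all rows over a connection that is not managed by Spring. */
    static List<DbSettings> load(String settingsUrl, String user, String password)
            throws SQLException {
        List<DbSettings> rows = new ArrayList<>();
        try (Connection c = DriverManager.getConnection(settingsUrl, user, password);
             Statement st = c.createStatement();
             ResultSet rs = st.executeQuery(QUERY)) {
            while (rs.next()) {
                rows.add(new DbSettings(rs.getLong("id"), rs.getString("url"),
                        rs.getString("username_db"), rs.getString("password_db")));
            }
        }
        return rows;
    }

    /** Keys each row by its id (as a string), preserving the query order. */
    static Map<Object, DbSettings> byLookupKey(List<DbSettings> rows) {
        Map<Object, DbSettings> map = new LinkedHashMap<>();
        for (DbSettings row : rows) {
            map.put(String.valueOf(row.id), row);
        }
        return map;
    }
}
```

Each DbSettings entry could then be turned into a DataSource (for instance with DataSourceBuilder) and handed to the routing data source via setTargetDataSources, with the connection details of db_settings itself kept in properties.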

See Also

https://scattercode.co.uk/2013/11/18/spring-data-multiple-databases/

https://numberformat.wordpress.com/2013/12/27/hello-world-with-spring-batch-3-0-x-with-pure-annotations/

http://spring.io/guides/gs/batch-processing/

How to java-configure separate datasources for spring batch data and business data? Should I even do it?

The code is available on GitHub.

Liping Huang answered Sep 27 '22 19:09
