Multiple Spring Batch jobs executing concurrently causing deadlocks in the Spring Batch metadata tables

We have multiple Spring Batch jobs, each running in its own Java instance via the CommandLineJobRunner. All of the jobs are started simultaneously, only read and write flat files, and update the same Spring Batch metadata tables hosted in SQL Server. The only database involved is the Spring Batch metadata database.

When the jobs are started simultaneously we get SQL deadlock exceptions; a more detailed stack trace can be found below. From the database's perspective, the deadlock victims were executing one of the following statements: Insert into BATCH_JOB_SEQ default values, or Delete from BATCH_JOB_SEQ where ID < some_number.

We are using the default MapJobRegistry, and either the default job repository or an explicitly configured JobRepositoryFactoryBean. For the data source used to interact with the Spring Batch database we have tried both a DriverManagerDataSource and a DBCP2 pooling BasicDataSource, each using the standard Microsoft SQL Server SQLServerDriver. I can upload more specific config files, but in my testing the issue occurs whenever I use SQL Server with the standard Spring configuration.

In my investigation I believe the issue comes from how the default incrementer class, org.springframework.jdbc.support.incrementer.SqlServerMaxValueIncrementer, increments the job and step instance ids, in conjunction with how the SQL Server metadata tables are constructed. The code in SqlServerMaxValueIncrementer is synchronized, so if we were running all the jobs in the same Java instance this would not be an issue.
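For reference, getNextKey() in that class follows roughly the pattern below (a simplified sketch paraphrased from the Spring source, not the exact code). Note that synchronized only serializes callers inside a single JVM; separate java processes still interleave their inserts and deletes on the BATCH_JOB_SEQ table, which matches the two statements we see as deadlock victims:

// Simplified sketch of the IDENTITY-table pattern behind
// SqlServerMaxValueIncrementer (paraphrased, not the actual Spring code).
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import javax.sql.DataSource;

public class IdentityTableIncrementerSketch {

    private final DataSource dataSource;
    private final String tableName;  // e.g. BATCH_JOB_SEQ
    private final String columnName; // e.g. ID

    public IdentityTableIncrementerSketch(DataSource dataSource, String tableName, String columnName) {
        this.dataSource = dataSource;
        this.tableName = tableName;
        this.columnName = columnName;
    }

    // synchronized protects against concurrent threads in THIS JVM only.
    public synchronized long getNextKey() throws SQLException {
        try (Connection con = dataSource.getConnection();
             Statement stmt = con.createStatement()) {
            // 1. Insert a row so the IDENTITY column generates the next value.
            stmt.executeUpdate("insert into " + tableName + " default values");
            // 2. Read the generated value back on the same connection.
            long nextId;
            try (ResultSet rs = stmt.executeQuery("select @@identity")) {
                rs.next();
                nextId = rs.getLong(1);
            }
            // 3. Trim older rows to keep the table small. This delete, racing
            // against inserts from other processes, is what the database
            // reports as the deadlock.
            stmt.executeUpdate("delete from " + tableName + " where " + columnName + " < " + nextId);
            return nextId;
        }
    }
}

Because the lock scope is the JVM while the contention is on a shared table, multiple java instances can still deadlock each other.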

If we host the Spring Batch metadata in a DB2 database we do not have the problem. The SQL Server implementation emulates sequences with actual tables (incremented via IDENTITY columns), while the DB2 implementation uses native sequence objects.

Has anyone run into this issue? Am I just missing something? It seems like whenever we have a problem like this, the fix is as simple as setting xxx in yyy. If not, does anyone know why Spring Batch doesn't use sequence objects in its SQL Server implementation?

Stack trace:

[org.springframework.batch.core.launch.support.CommandLineJobRunner] - <Job Terminated in error: Could not increment identity; nested exception is com.microsoft.sqlserver.jdbc.SQLServerException: Transaction (Process ID 74) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.>
org.springframework.dao.DataAccessResourceFailureException: Could not increment identity; 
nested exception is com.microsoft.sqlserver.jdbc.SQLServerException: 
Transaction (Process ID 74) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
        at org.springframework.jdbc.support.incrementer.SqlServerMaxValueIncrementer.getNextKey(SqlServerMaxValueIncrementer.java:124)
        at org.springframework.jdbc.support.incrementer.AbstractDataFieldMaxValueIncrementer.nextLongValue(AbstractDataFieldMaxValueIncrementer.java:128)
        at org.springframework.batch.core.repository.dao.JdbcJobInstanceDao.createJobInstance(JdbcJobInstanceDao.java:108)
        at org.springframework.batch.core.repository.support.SimpleJobRepository.createJobExecution(SimpleJobRepository.java:135)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)

Configuration:

<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:batch="http://www.springframework.org/schema/batch"
xmlns:context="http://www.springframework.org/schema/context"
xmlns:jdbc="http://www.springframework.org/schema/jdbc"
xsi:schemaLocation="
    http://www.springframework.org/schema/batch
    http://www.springframework.org/schema/batch/spring-batch.xsd
    http://www.springframework.org/schema/beans
    http://www.springframework.org/schema/beans/spring-beans.xsd
    http://www.springframework.org/schema/context
    http://www.springframework.org/schema/context/spring-context.xsd">

<bean id="transactionManager"
    class="org.springframework.jdbc.datasource.DataSourceTransactionManager"
    lazy-init="true">
    <property name="dataSource" ref="batchPoolingDataSource" />
</bean>

<bean id="jobRegistry"
    class="org.springframework.batch.core.configuration.support.MapJobRegistry" />

<bean id="jobRegistryBeanPostProcessor"
    class="org.springframework.batch.core.configuration.support.JobRegistryBeanPostProcessor">
    <property name="jobRegistry" ref="jobRegistry" />
</bean>

<bean id="jobRepository"
    class="org.springframework.batch.core.repository.support.JobRepositoryFactoryBean">
    <property name="databaseType" value="SQLSERVER" />
    <property name="dataSource" ref="batchPoolingDataSource" />
    <property name="transactionManager" ref="transactionManager" />
</bean>

<bean id="jobLauncher"
    class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
    <property name="jobRepository" ref="jobRepository" />
</bean>

<bean id="jobExplorer"
    class="org.springframework.batch.core.explore.support.JobExplorerFactoryBean">
    <property name="dataSource" ref="batchPoolingDataSource" />
</bean>

<bean id="jobOperator"
    class="org.springframework.batch.core.launch.support.SimpleJobOperator">
    <property name="jobExplorer" ref="jobExplorer" />
    <property name="jobLauncher" ref="jobLauncher" />
    <property name="jobRegistry" ref="jobRegistry" />
    <property name="jobRepository" ref="jobRepository" />
</bean>

<bean class="org.springframework.batch.core.scope.StepScope">
    <property name="proxyTargetClass" value="true" />
</bean>

<bean id="batchPoolingDataSource"  class="org.apache.commons.dbcp2.BasicDataSource" destroy-method="close">
    <property name="driverClassName" value="com.microsoft.sqlserver.jdbc.SQLServerDriver" />
    <property name="url" value="jdbc:sqlserver://server info" />
    <property name="username" value="${batch.jdbc.user}" />
    <property name="password" value="${batch.jdbc.password}" />
    <property name="initialSize" value="5" />
    <property name="maxTotal" value="15" />
    <property name="maxWaitMillis" value="5000" />
</bean>

<bean id="batchDataSource" class="org.springframework.jdbc.datasource.DriverManagerDataSource" >
    <property name="driverClassName" value="org.springframework.jdbc.datasource.DriverManagerDataSource" />
    <property name="url" value="jdbc:sqlserver://server info" />
    <property name="username" value="${batch.jdbc.user}" />
    <property name="password" value="${batch.jdbc.password}" />
</bean>

asked Oct 23 '14 by user3813035


1 Answer

After researching this further, and after partially heading down the path of writing versions of the DAOs behind the JobRepository that work with SQL Server IDENTITY columns instead of sequences, I stumbled upon a way to address this with nothing more than a bit of configuration.

The simple way to address this is to configure the databaseType and isolationLevelForCreate properties of the JobRepository. The factory bean defaults isolationLevelForCreate to ISOLATION_SERIALIZABLE, which appears to be what provokes the deadlocks when several processes create job executions at once. Here are the settings I am using with SQL Server 2008:

<bean id="jobRepository"
    class="org.springframework.batch.core.repository.support.JobRepositoryFactoryBean">
    <property name="dataSource" ref="dataSource" />
    <property name="transactionManager" ref="transactionManager" />
    <property name="databaseType" value="SQLSERVER" />
    <property name="isolationLevelForCreate" value="ISOLATION_REPEATABLE_READ" />
</bean>

I've tested this with 30 jobs (the same job with different parameters) being launched by a group of Quartz jobs, and so far I haven't seen any issues.

I've also kept the retry code (see the comment on the question) in place when launching the jobs, just to catch any remaining deadlocks and retry the launch. It may be moot at this point, but I can't risk having jobs fail to launch.
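The retry wrapper itself isn't shown in the thread; something along these lines is enough (a minimal sketch with hypothetical names and retry policy; per the stack trace above, the deadlock surfaces as a DataAccessResourceFailureException, and SQL Server rolls back the victim's transaction, so rerunning it is safe):

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.dao.DataAccessResourceFailureException;

public class RetryingLauncher {

    // Launch a job, retrying when SQL Server picks us as the deadlock victim.
    public JobExecution launchWithRetry(JobLauncher launcher, Job job, JobParameters params)
            throws Exception { // JobLauncher.run declares several checked exceptions
        final int maxAttempts = 3;
        for (int attempt = 1; ; attempt++) {
            try {
                return launcher.run(job, params);
            } catch (DataAccessResourceFailureException e) {
                // Deadlock victim: the transaction was rolled back, rerun it.
                if (attempt >= maxAttempts) {
                    throw e;
                }
                Thread.sleep(500L * attempt); // simple linear backoff
            }
        }
    }
}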

I think mentioning these settings in the Spring Batch documentation, for the case of launching multiple jobs at the same time with SQL Server as the dataSource, would be quite helpful to others. Then again, I guess not many people are stuck with SQL Server.

answered Oct 22 '22 by Joshua Moore