
Issues with Spring Batch

Hi, I have been working with Spring Batch recently and need some help.

1) I want to run my job using multiple threads, so I configured a TaskExecutor as below:

    @Bean
    public TaskExecutor taskExecutor() {
        SimpleAsyncTaskExecutor taskExecutor = new SimpleAsyncTaskExecutor();
        taskExecutor.setConcurrencyLimit(4);
        return taskExecutor;
    }

    @Bean
    public Step myStep() {
        return stepBuilderFactory.get("myStep")
                .<MyEntity, AnotherEntity>chunk(1)
                .reader(reader())
                .processor(processor())
                .writer(writer())
                .taskExecutor(taskExecutor())
                .throttleLimit(4)
                .build();
    }

However, while the job is executing, I can see the line below in the console.

o.s.b.c.l.support.SimpleJobLauncher : No TaskExecutor has been set, defaulting to synchronous executor.

What does this mean? While debugging, I can nevertheless see four SimpleAsyncTaskExecutor threads running. Can someone shed some light on this?

2) I don't want to run my batch application with the metadata tables that Spring Batch creates. I tried adding spring.batch.initialize-schema=never, but it didn't work. I also saw an approach that uses ResourcelessTransactionManager and MapJobRepositoryFactoryBean, but my job has to perform some database transactions. Is it all right to use that approach in this case? I was also able to get rid of the tables by extending DefaultBatchConfigurer and overriding setDataSource:

    @Override
    public void setDataSource(DataSource dataSource) {
        // Intentionally do not set the DataSource, even if one exists.
        // initialize will then use a Map-based JobRepository (instead of a database).
    }

Please guide me further. Thanks.

Update:

My full configuration class is below.

@EnableBatchProcessing
@EnableScheduling
@Configuration
public class MyBatchConfiguration{

    @Autowired
    public JobBuilderFactory jobBuilderFactory;

    @Autowired
    public StepBuilderFactory stepBuilderFactory;

    @Autowired
    public DataSource dataSource;


    /* @Override
        public void setDataSource(DataSource dataSource) {
            // override to do not set datasource even if a datasource exist.
            // initialize will use a Map based JobRepository (instead of database)
        }*/
    @Bean
    public Step myStep() {

        return stepBuilderFactory.get("myStep")
                .<MyEntity,AnotherEntity> chunk(1)
                .reader(reader())
                .processor(processor())
                .writer(writer())
                .taskExecutor(taskExecutor())
                .throttleLimit(4)
                .build();
    }

    @Bean
    public Job myJob() {

        return jobBuilderFactory.get("myJob")
                .incrementer(new RunIdIncrementer())
                .listener(myJobListener())
                .flow(myStep())
                .end()
                .build();
    }

    @Bean
    public MyJobListener myJobListener()
    {
        return new MyJobListener();
    }
    @Bean
    public ItemReader<MyEntity> reader()
    {

        return new MyReader();
    }

    @Bean
    public ItemWriter<? super AnotherEntity> writer()
    {
        return new MyWriter();
    }

    @Bean
    public ItemProcessor<MyEntity,AnotherEntity> processor()
    {
        return new MyProcessor();
    }

    @Bean
    public TaskExecutor taskExecutor() {
        SimpleAsyncTaskExecutor taskExecutor = new SimpleAsyncTaskExecutor();
        taskExecutor.setConcurrencyLimit(4);
        return taskExecutor;
    }
}

Praveen Kumar asked Mar 06 '23 06:03


1 Answer

In the future, please break this up into two independent questions. That being said, let me shed some light on both questions.

SimpleJobLauncher : No TaskExecutor has been set, defaulting to synchronous executor.

Your configuration sets your TaskExecutor on myStep, which causes Spring Batch to execute each chunk in its own thread (subject to the TaskExecutor's settings, here a concurrency limit of 4). The log message you are seeing has nothing to do with that behavior; it concerns how the job itself is launched. By default, the SimpleJobLauncher launches the job on the thread that calls it, blocking that thread until the job finishes. You can inject a TaskExecutor into the SimpleJobLauncher, which causes the job to be executed on a different thread from the one that called the JobLauncher. These are two separate uses of multiple threads by the framework.
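The difference between launching on the caller's thread and launching on a separate thread can be reproduced with plain JDK executors. The following is only a minimal sketch: LaunchThreadDemo, its methods, and the thread name job-launcher-1 are hypothetical names made up for illustration, and none of this is Spring Batch code.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for a job launcher; NOT a Spring Batch class.
final class LaunchThreadDemo {

    // Runs a pretend "job" on the given executor and returns the name
    // of the thread that actually executed it.
    static String launch(Executor executor) {
        CompletableFuture<String> ranOn = new CompletableFuture<>();
        executor.execute(() -> ranOn.complete(Thread.currentThread().getName()));
        // Waiting here only lets the demo report the thread name.
        return ranOn.join();
    }

    // Synchronous executor: the task runs on the caller's thread,
    // so the caller is blocked until the task has finished.
    static String launchSynchronously() {
        return launch(Runnable::run);
    }

    // Asynchronous executor: the task runs on a separate worker thread.
    static String launchAsynchronously() {
        ExecutorService pool =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "job-launcher-1"));
        try {
            return launch(pool);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("caller thread:       " + Thread.currentThread().getName());
        System.out.println("sync launch ran on:  " + launchSynchronously());
        System.out.println("async launch ran on: " + launchAsynchronously());
    }
}
```

In Spring Batch itself, the corresponding step would be to call setTaskExecutor on the SimpleJobLauncher with an asynchronous TaskExecutor (for example a SimpleAsyncTaskExecutor). The sketch only mirrors the threading behavior, not the launcher's API.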

I don't want to run my Batch application with the metadata tables that spring batch creates

The short answer here is to use an in-memory database such as HSQLDB or H2 for your metadata tables. This gives you a production-grade data store (so that concurrency is handled correctly) without actually persisting the data. If you use the ResourcelessTransactionManager, you are effectively turning transactions off (a bad idea if you're using a database in any capacity), because that TransactionManager is a no-op implementation that doesn't actually do anything.
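As a rough illustration of the in-memory option, an embedded H2 database can be declared with Spring's EmbeddedDatabaseBuilder and initialized with the schema scripts that ship inside the spring-batch-core jar. This is a sketch under assumptions, not a drop-in configuration: it assumes H2 and spring-jdbc are on the classpath, and the class name and bean name are hypothetical.

```java
import javax.sql.DataSource;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.datasource.embedded.EmbeddedDatabaseBuilder;
import org.springframework.jdbc.datasource.embedded.EmbeddedDatabaseType;

// Hypothetical configuration class; the name is made up for this example.
@Configuration
public class BatchMetadataDataSourceConfiguration {

    // In-memory H2 database that holds only the Spring Batch metadata tables.
    // Its contents disappear when the application shuts down.
    @Bean
    public DataSource batchMetadataDataSource() {
        return new EmbeddedDatabaseBuilder()
                .setType(EmbeddedDatabaseType.H2)
                // Schema scripts bundled with spring-batch-core.
                .addScript("classpath:org/springframework/batch/core/schema-drop-h2.sql")
                .addScript("classpath:org/springframework/batch/core/schema-h2.sql")
                .build();
    }
}
```

If the application also defines a DataSource for its business data, Spring Batch has to be told which of the two to use for its metadata. That wiring depends on the rest of the setup and is not shown here.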

Michael Minella answered Mar 28 '23 14:03