
Spring batch + hibernate write during select

A note up front: I have only included the code snippets I think are relevant. They live in different files, so don't be surprised if they look a bit disjointed.

I'm reading from a flat file in the reader of my Spring Batch job. I wrote a ProductValueMapper, called from the FieldSetMapper, that maps the columns to the Hibernate model. This mapper also checks whether the product already exists in the database; if so, it uses the entity from the database, otherwise it creates a new one.

@Component
@StepScope
public class ProductValueMapper {

    @Autowired
    private IProductDao productDao;

    @Autowired
    private IFactory<Product> productFactory;

    private Product fetch(String[] criteria) {
       //... try to fetch product using different criteria, or create a new one using the factory ...
      return product;
    }

    Product map(String[] criteria) {
          Product product = fetch(criteria);
          //... map some stuff ...    
          return product;
    }

}

The DAOs get the entity manager injected via

@PersistenceContext
private EntityManager manager;

and are marked as @Transactional.

Afterwards I have a processor that does nothing except logging.

Then I write with the default jpaItemWriter, which is created like this:

@Configuration
@Import(DatabaseConfiguration.class)
public class HibernateConfiguration extends DefaultBatchConfigurer {

    @Autowired
    @Qualifier("oracleDataSource")
    private DataSource dataSource;

    @Bean(name = "jpaEntitiyManager")
    public LocalContainerEntityManagerFactoryBean entityManagerFactory() {
        LocalContainerEntityManagerFactoryBean em = new LocalContainerEntityManagerFactoryBean();
        em.setPersistenceUnitName("hibernatePersistenceUnit");
        em.setPackagesToScan("com.somepackage");
        em.setDataSource(dataSource);
        em.setJpaProperties(hibernateProperties());

        HibernateJpaVendorAdapter vendor = new HibernateJpaVendorAdapter();
        vendor.setGenerateDdl(false);
        vendor.setShowSql(true);
        em.setJpaVendorAdapter(vendor);
        return em;
    }

    @Bean
    public Properties hibernateProperties() {
        Properties prop =  new Properties();
        prop.setProperty("hibernate.hbm2ddl.auto", "validate");
        prop.setProperty("hibernate.dialect", "org.hibernate.dialect.Oracle10gDialect");
        prop.setProperty("hibernate.globally_quoted_identifiers", "false");
        prop.setProperty("hibernate.show_sql", "true");
        return prop;
    }

    @Override
    public PlatformTransactionManager getTransactionManager()  {
        final JpaTransactionManager transactionManager = new JpaTransactionManager();
        transactionManager.setEntityManagerFactory(entityManagerFactory().getObject());
        return transactionManager;
    }
}

@Configuration
@EnableBatchProcessing(modular = true)
@ComponentScan({"com.somepackage"})
@Import({HibernateConfiguration.class, DatabaseConfiguration.class})
public class BatchConfiguration {

    @Autowired
    public EntityManagerFactory emf;

    @Bean
    public JpaItemWriter<ProductEntity> jpaItemWriter() {
        JpaItemWriter<ProductEntity> itemWriter = new JpaItemWriter<>();
        itemWriter.setEntityManagerFactory(emf);
        return itemWriter;
    }

    //... rest of the setup for the job

}

The program works as expected, except that with a chunk size > 1, when an item is changed during a chunk, Hibernate executes an update statement during the select for the following item.

I know I can solve this either by calling flush and save in the processor or by reducing the chunk size to 1, but both solutions feel wrong to me. Shouldn't a transaction be kept open per item, and shouldn't these transactions be committed one by one when the writer is called? Or am I misunderstanding how transaction handling works in Spring Batch?

* EDIT 1 *

With the chunk size set to 1, the program behaves as expected: the update happens during the writing phase.

2016-09-05 11:20:40.828  INFO 11084 --- [           main] n.e.p.i.logging.LogItemReadListener      : ItemReadListener - beforeRead
2016-09-05 11:20:40.828  INFO 11084 --- [           main] n.e.p.i.r.map.GenericProductMapper    : Processing product: Prduct1
Hibernate: select productent0_.PRODUCTSN as PRODUCTSN1_25_, .....
2016-09-05 11:20:40.832  INFO 11084 --- [           main] n.e.p.i.logging.LogItemReadListener      : ItemReadListener - afterRead: com.somepackage.ProductEntity@8e654f7
2016-09-05 11:20:40.832  INFO 11084 --- [           main] n.e.p.i.logging.LogItemWriterListener    : ItemWriteListener - beforeWrite
Hibernate: update PIME.PRODUCT set AVAILABILITYDATE=?, ....
2016-09-05 11:20:40.836  INFO 11084 --- [           main] n.e.p.i.logging.LogItemWriterListener    : ItemWriteListener - afterWrite
2016-09-05 11:20:40.887  INFO 11084 --- [           main] n.e.p.i.logging.LogItemReadListener      : ItemReadListener - beforeRead
2016-09-05 11:20:40.887  INFO 11084 --- [           main] n.e.p.i.r.map.GenericProductMapper    : Processing product: Product2
Hibernate: select productent0_.PRODUCTSN as PRODUCTSN1_25_, ....
2016-09-05 11:20:40.891  INFO 11084 --- [           main] n.e.p.i.logging.LogItemReadListener      : ItemReadListener - afterRead: com.somepackage.ProductEntity@2c7fb24c
2016-09-05 11:20:40.891  INFO 11084 --- [           main] n.e.p.i.logging.LogItemWriterListener    : ItemWriteListener - beforeWrite
2016-09-05 11:20:40.891  INFO 11084 --- [           main] n.e.p.i.logging.LogItemWriterListener    : ItemWriteListener - afterWrite

But when the chunk size is increased, the update is executed right before the select statement of the next item, since the write no longer happens after each product but once per chunk:

2016-09-05 11:09:36.240  INFO 12408 --- [           main] n.e.p.i.logging.LogItemReadListener      : ItemReadListener - beforeRead
2016-09-05 11:09:36.240  INFO 12408 --- [           main] n.e.p.i.r.map.GenericProductMapper    : Processing product: Product1
Hibernate: select productent0_.PRODUCTSN as PRODUCTSN1_25_, ....
2016-09-05 11:09:36.244  INFO 12408 --- [           main] n.e.p.i.logging.LogItemReadListener      : ItemReadListener - afterRead: com.somemodule.ProductEntity@6f28a07e
2016-09-05 11:09:36.244  INFO 12408 --- [           main] n.e.p.i.logging.LogItemReadListener      : ItemReadListener - beforeRead
2016-09-05 11:09:36.244  INFO 12408 --- [           main] n.e.p.i.r.map.GenericProductMapper    : Processing product: Product2
Hibernate: update PIME.PRODUCT set AVAILABILITYDATE=?, ....
Hibernate: select productent0_.PRODUCTSN as PRODUCTSN1_25_, ....
2016-09-05 11:09:36.250  INFO 12408 --- [           main] n.e.p.i.logging.LogItemReadListener      : ItemReadListener - afterRead: com.somemodule.ProductEntity@71852f76
2016-09-05 11:09:36.250  INFO 12408 --- [           main] n.e.p.i.logging.LogItemReadListener      : ItemReadListener - beforeRead
2016-09-05 11:09:36.250  INFO 12408 --- [           main] n.e.p.i.r.map.GenericProductMapper    : Processing product: Product3
Hibernate: select productent0_.PRODUCTSN as PRODUCTSN1_25_, ....
2016-09-05 11:09:36.253  INFO 12408 --- [           main] n.e.p.i.logging.LogItemReadListener      : ItemReadListener - afterRead: com.somemodule.ProductEntity@76ac8c3d
2016-09-05 11:09:36.253  INFO 12408 --- [           main] n.e.p.i.logging.LogItemReadListener      : ItemReadListener - beforeRead
2016-09-05 11:09:36.253  INFO 12408 --- [           main] n.e.p.i.r.map.GenericProductMapper    : Processing product: Product4
Hibernate: select productent0_.PRODUCTSN as PRODUCTSN1_25_, ....
2016-09-05 11:09:36.256  INFO 12408 --- [           main] n.e.p.i.logging.LogItemReadListener      : ItemReadListener - afterRead: com.somemodule.ProductEntity@6a0d47e8
2016-09-05 11:09:36.256  INFO 12408 --- [           main] n.e.p.i.logging.LogItemWriterListener    : ItemWriteListener - beforeWrite
2016-09-05 11:09:36.257  INFO 12408 --- [           main] n.e.p.i.logging.LogItemWriterListener    : ItemWriteListener - afterWrite
asked Jun 16 '26 06:06 by Xtroce


1 Answer

Use an Entry (a plain POJO) instead of the Entity. In this case, the best practice is:

  1. In the Reader, query the database and store the result as an Entry (POJO), not as an Entity.
  2. In the Processor, apply your changes to the Entry.
  3. In the Writer, update the database by id from the Entry. (You can use Dozer to map from Entity to POJO.)
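
The three steps above could be sketched roughly like this. It is a dependency-free illustration, not Spring Batch code: ProductEntity, ProductEntry and EntryPipeline are hypothetical names, the name field is made up, and a plain Map stands in for the database lookup by id.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in for the managed JPA entity (hypothetical fields).
class ProductEntity {
    final long id;
    String name;

    ProductEntity(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Detached value object ("Entry") that travels through reader and processor.
class ProductEntry {
    final long id;
    String name;

    ProductEntry(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

class EntryPipeline {
    // Reader side: copy the entity's state instead of passing the entity on.
    static ProductEntry toEntry(ProductEntity entity) {
        return new ProductEntry(entity.id, entity.name);
    }

    // Writer side: look each entity up by id and apply the collected changes.
    static void write(List<ProductEntry> chunk, Map<Long, ProductEntity> database) {
        for (ProductEntry entry : chunk) {
            ProductEntity entity = database.get(entry.id);
            if (entity != null) {
                entity.name = entry.name;
            }
        }
    }

    public static void main(String[] args) {
        // The map plays the role of the database in this sketch.
        Map<Long, ProductEntity> database = new HashMap<>();
        database.put(1L, new ProductEntity(1L, "old name"));

        // Read + process: only the entry changes, the entity stays untouched.
        ProductEntry entry = toEntry(database.get(1L));
        entry.name = "new name";
        System.out.println(database.get(1L).name); // old name

        // Write: the change reaches the entity only now.
        List<ProductEntry> chunk = new ArrayList<>();
        chunk.add(entry);
        write(chunk, database);
        System.out.println(database.get(1L).name); // new name
    }
}
```

Because the entity is never modified while reading or processing, there is nothing dirty to flush before the next select. Creating new products is left out of this sketch.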

Otherwise, the following happens:

  1. The Reader fetches A and keeps it as a managed Entity A.
  2. The Processor changes the A entity directly.
  3. On the next read, B is fetched, and Hibernate first updates A because it detects the change made directly to the managed A entity (by default, pending changes are flushed before a query is executed).
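
To make that sequence concrete, here is a deliberately simplified toy model in plain Java. It is not how Hibernate is implemented; ToyEntity and ToyPersistenceContext are invented for illustration, and the model flushes before every query, whereas a real persistence provider is more selective.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: NOT Hibernate's implementation.
class ToyEntity {
    final long id;
    String value;

    ToyEntity(long id, String value) {
        this.id = id;
        this.value = value;
    }
}

class ToyPersistenceContext {
    private final Map<Long, String> table = new LinkedHashMap<>();      // "database" rows
    private final Map<Long, ToyEntity> managed = new LinkedHashMap<>(); // loaded entities
    final List<String> sqlLog = new ArrayList<>();                      // statements "executed"

    ToyPersistenceContext(Map<Long, String> rows) {
        table.putAll(rows);
    }

    // Every query first flushes pending changes so its result cannot be stale.
    ToyEntity query(long id) {
        flush();
        sqlLog.add("select " + id);
        return managed.computeIfAbsent(id, key -> new ToyEntity(key, table.get(key)));
    }

    // Dirty check: write back every managed entity whose state differs from its row.
    void flush() {
        for (ToyEntity entity : managed.values()) {
            if (!entity.value.equals(table.get(entity.id))) {
                table.put(entity.id, entity.value);
                sqlLog.add("update " + entity.id);
            }
        }
    }
}

class AutoFlushDemo {
    public static void main(String[] args) {
        Map<Long, String> rows = new LinkedHashMap<>();
        rows.put(1L, "A");
        rows.put(2L, "B");
        ToyPersistenceContext context = new ToyPersistenceContext(rows);

        ToyEntity a = context.query(1L); // the reader fetches A
        a.value = "A changed";           // A is modified while still managed
        context.query(2L);               // next read: the update for A runs before the select
        System.out.println(context.sqlLog); // [select 1, update 1, select 2]
    }
}
```

The printed order matches the log in the question for a chunk size > 1: the update for the first product shows up immediately before the select for the second one.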

Note: if you don't want this to happen, you can mark the transaction as read-only, i.e. @Transactional(readOnly = true).

Thanks, Nghia

answered Jun 18 '26 21:06 by Nghia Do


