I have a Spring Boot batch job that reads from a MongoDB database to feed a MySQL database. Roughly half of my collection gets processed by the program, yet there are only about 200 errors in my logs.
The BATCH_STEP_EXECUTION table tells me the step went well (status COMPLETED) and shows a READ_COUNT of 5692, although there are 11800 documents in the collection.
Did I forget something in the configuration that prevents it from going through the entire collection?
Here is my configuration class:
@Configuration
@EnableBatchProcessing
@Import(PersistenceConfig.class)
public class BatchConfiguration {

    @Autowired
    MongoTemplate mongoTemplate;

    @Autowired
    SessionFactory sessionFactory;

    @Bean
    @StepScope
    public ItemReader<CourseData> reader() {
        MongoItemReader<CourseData> mongoItemReader = new MongoItemReader<>();
        mongoItemReader.setTemplate(mongoTemplate);
        mongoItemReader.setCollection("foo");
        mongoItemReader.setQuery("{}");
        mongoItemReader.setTargetType(CourseData.class);
        Map<String, Sort.Direction> sort = new HashMap<>();
        sort.put("_id", Sort.Direction.ASC);
        mongoItemReader.setSort(sort);
        return mongoItemReader;
    }

    @Bean
    public ItemProcessor<CourseData, MatrixOne> processor() {
        return new CourseDataMatrixOneProcessor();
    }

    @Bean
    public ItemWriter<MatrixOne> writer() {
        HibernateItemWriter writer = new HibernateItemWriter();
        writer.setSessionFactory(sessionFactory);
        System.out.println("writing stuff");
        return writer;
    }

    @Bean
    public Job importUserJob(JobBuilderFactory jobs, Step s1) {
        return jobs.get("importRawCourseJob")
                .incrementer(new RunIdIncrementer())
                .flow(s1)
                .end()
                .build();
    }

    @Bean
    @Transactional
    public Step step1(StepBuilderFactory stepBuilderFactory, ItemReader<CourseData> reader, ItemWriter<MatrixOne> writer, ItemProcessor<CourseData, MatrixOne> processor) {
        return stepBuilderFactory.get("step1")
                .<CourseData, MatrixOne>chunk(10)
                .reader(reader)
                .processor(processor)
                .writer(writer)
                .build();
    }
}
OK, so I solved it today by returning an empty POJO instead of null from my converter when something is wrong with the data, and then simply skipping that item in the processor. Since a null coming out of the reader is how Spring Batch signals the end of input, a bad document that got converted to null could end the step early with a COMPLETED status, which would explain the READ_COUNT being lower than the collection size.
It is still a bit strange that it didn't stop on the first null encountered, though. Maybe some parallelisation of the chunk elements made me misread the logs.
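For reference, here is a minimal sketch of the processor-side skip. It assumes the converter hands back an "empty" CourseData when a document is malformed, detected here by a hypothetical getId() == null check (the real marker depends on your POJO); the actual field mapping is stubbed out. Unlike a null from the reader, returning null from an ItemProcessor just filters the item out of the chunk.

import org.springframework.batch.item.ItemProcessor;

public class CourseDataMatrixOneProcessor implements ItemProcessor<CourseData, MatrixOne> {

    @Override
    public MatrixOne process(CourseData item) throws Exception {
        // The converter returns an empty CourseData instead of null for bad
        // documents, so the reader never emits null and keeps reading the
        // whole collection. Detect that placeholder here (hypothetical check).
        if (item.getId() == null) {
            // Returning null from an ItemProcessor filters the item: it is
            // not written, and it does not terminate the step.
            return null;
        }
        return convertToMatrixOne(item);
    }

    // Hypothetical mapping logic, standing in for the real conversion code.
    private MatrixOne convertToMatrixOne(CourseData item) {
        MatrixOne matrixOne = new MatrixOne();
        // ... map fields from item to matrixOne ...
        return matrixOne;
    }
}

An alternative would be to make the step fault tolerant with .faultTolerant().skip(...).skipLimit(...) so bad records are skipped at the framework level, but that only covers exceptions thrown during read/process/write; it does not help when the reader silently returns null.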