My application loads a list of entities that should be processed. This happens in a class that uses a scheduler
@Component
class TaskScheduler {
@Autowired
private TaskRepository taskRepository;
@Autowired
private HandlingService handlingService;
@Scheduled(fixedRate = 15000)
@Transactional
public void triggerTransactionStatusChangeHandling() {
taskRepository.findByStatus(Status.OPEN).stream()
.forEach(handlingService::handle);
}
}
In my HandlingService
processes each task in issolation using REQUIRES_NEW
for propagation level.
@Component
class HandlingService {
@Transactional(propagation = Propagation.REQUIRES_NEW)
public void handle(Task task) {
try {
processTask(task); // here the actual processing would take place
task.setStatus(Status.PROCCESED);
} catch (RuntimeException e) {
task.setStatus(Status.ERROR);
}
}
}
The code works only because i started the parent transaction on TaskScheduler
class. If i remove the @Transactional
annotation the entities are not managed anymore and the update to the task entity is not propagated to the db.I don't find it natural to make the scheduled method transactional.
From what i see i have two options:
1. Keep code as it is today.
2. Remove the @Transactional
annotation from the Scheduler, pass the id of the task and reload the task entity in the HandlingService.
@Component
class HandlingService {
@Autowired
private TaskRepository taskRepository;
@Transactional(propagation = Propagation.REQUIRES_NEW)
public void handle(Long taskId) {
Task task = taskRepository.findOne(taskId);
try {
processTask(task); // here the actual processing would take place
task.setStatus(Status.PROCCESED);
} catch (RuntimeException e) {
task.setStatus(Status.ERROR);
}
}
}
@Async
Can you please offer your opinion on which is the correct way of tackling this kind of problems, maybe with another method that i didn't know about?
In this model, Spring uses AOP over the transactional methods to provide data integrity. This is the preferred approach and works in most of the cases. Support for most of the transaction APIs such as JDBC, Hibernate, JPA, JDO, JTA etc. All we need to do is use proper transaction manager implementation class.
If you call a method with a @Transactional annotation from a method with @Transactional within the same instance, then the called methods transactional behavior will not have any impact on the transaction.
Transactions indeed put locks on the database — good database engines handle concurrent locks in a sensible way — and are useful with read-only use to ensure that no other transaction adds data that makes your view inconsistent.
Explanation: Spring's transaction manager provides a technology-independent API that allows you to start a new transaction (or obtain the currently active transaction) by calling the getTransaction() method.
If your intention is to process each task in a separate transaction, then your first approach actually does not work because everything is committed at the end of the scheduler transaction.
The reason for that is that in the nested transactions Task
instances are basically detached entities (Session
s started in the nested transactions are not aware of those instances). At the end of the scheduler transaction Hibernate performs dirty check on the managed instances and synchronizes changes with the database.
This approach is also very risky, because there may be troubles if you try to access an uninitialized proxy on a Task
instance in the nested transaction. And there may be troubles if you change the Task
object graph in the nested transaction by adding to it some other entity instance loaded in the nested transaction (because that instance will now be detached when the control returns to the scheduler transaction).
On the other hand, your second approach is correct and straightforward and helps avoid all of the above pitfalls. Only, I would read the ids and commit the transaction (there is no need to keep it suspended while the tasks are being processed). The easiest way to achieve it is to remove the Transactional
annotation from the scheduler and make the repository method transactional (if it isn't transactional already).
If (and only if) the performance of the second approach is an issue, as you already mentioned you could go with asynchronous processing or even parallelize the processing to some degree. Also, you may want to take a look at extended sessions (conversations), maybe you could find it suitable for your use case.
The current code processes the task in the nested transaction, but updates the status of the task in the outer transaction (because the Task object is managed by the outer transaction). Because these are different transactions, it is possible that one succeeds while the other fails, leaving the database in an inconsistent state. In particular, with this code, completed tasks remain in status open if processing another task throws an exception, or the server is restarted before all tasks have been processed.
As your example shows, passing managed entities to another transaction makes it ambiguous which transaction should update these entities, and is therefore best avoided. Instead, you should be passing ids (or detached entities), and avoid unnecessary nesting of transactions.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With