Should I pass a managed entity to a method that requires a new transaction?

My application loads a list of entities that should be processed. This happens in a class that uses a scheduler:

@Component
class TaskScheduler {

    @Autowired
    private TaskRepository taskRepository;

    @Autowired
    private HandlingService handlingService;

    @Scheduled(fixedRate = 15000)
    @Transactional
    public void triggerTransactionStatusChangeHandling() {
        taskRepository.findByStatus(Status.OPEN).stream()
                               .forEach(handlingService::handle);
    }
}

My HandlingService processes each task in isolation, using REQUIRES_NEW as the propagation level.

@Component
class HandlingService {

    @Transactional(propagation = Propagation.REQUIRES_NEW)
    public void handle(Task task) {
        try {
            processTask(task); // here the actual processing would take place
            task.setStatus(Status.PROCCESED);
        } catch (RuntimeException e) {
            task.setStatus(Status.ERROR);
        }
    }
}

The code works only because I started the parent transaction in the TaskScheduler class. If I remove the @Transactional annotation, the entities are no longer managed and the update to the task entity is not propagated to the database. I don't find it natural to make the scheduled method transactional.

From what I see, I have two options:

1. Keep code as it is today.

  • Maybe it's just me and this is a correct approach.
  • This variant has the fewest trips to the database.

2. Remove the @Transactional annotation from the Scheduler, pass the id of the task and reload the task entity in the HandlingService.

@Component
class HandlingService {

    @Autowired
    private TaskRepository taskRepository;

    @Transactional(propagation = Propagation.REQUIRES_NEW)
    public void handle(Long taskId) {
        Task task = taskRepository.findOne(taskId);
        try {
            processTask(task); // here the actual processing would take place
            task.setStatus(Status.PROCCESED);
        } catch (RuntimeException e) {
            task.setStatus(Status.ERROR);
        }
    }
}

  • Has more trips to the database (one extra query per task).
  • Can be executed using @Async.

Which is the correct way of tackling this kind of problem? Is there perhaps another approach that I don't know about?

mvlupan asked Apr 06 '16 13:04


2 Answers

If your intention is to process each task in a separate transaction, then your first approach actually does not work because everything is committed at the end of the scheduler transaction.

The reason is that, inside the nested transactions, the Task instances are effectively detached entities (the Sessions started in the nested transactions are not aware of those instances). At the end of the scheduler transaction, Hibernate performs a dirty check on the managed instances and synchronizes the changes with the database.

This approach is also very risky. You may run into trouble if you try to access an uninitialized proxy on a Task instance in the nested transaction. You may also run into trouble if you change the Task object graph in the nested transaction by adding to it another entity instance loaded in the nested transaction, because that instance will be detached once control returns to the scheduler transaction.

On the other hand, your second approach is correct and straightforward, and it avoids all of the above pitfalls. I would change only one thing: read just the ids and commit that transaction before processing starts (there is no need to keep it suspended while the tasks are being processed). The easiest way to achieve this is to remove the @Transactional annotation from the scheduler and make the repository method transactional (if it isn't transactional already).
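The control flow described above can be sketched without any framework. This is only an illustration under stated assumptions: InMemoryTaskStore, IdBasedHandler and IdBasedScheduler are hypothetical stand-ins, statuses are plain strings, and in the real application handle(...) would carry @Transactional(propagation = Propagation.REQUIRES_NEW) while the store would be the Spring Data repository.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical stand-in for TaskRepository: maps task id -> status.
class InMemoryTaskStore {
    private final Map<Long, String> statusById = new LinkedHashMap<>();

    void save(long id, String status) { statusById.put(id, status); }

    String statusOf(long id) { return statusById.get(id); }

    // Mirrors the "read only the ids" step: no entity leaves this method.
    List<Long> findIdsByStatus(String status) {
        List<Long> ids = new ArrayList<>();
        for (Map.Entry<Long, String> entry : statusById.entrySet()) {
            if (entry.getValue().equals(status)) {
                ids.add(entry.getKey());
            }
        }
        return ids;
    }
}

// Each handle(id) call is one self-contained unit of work.
class IdBasedHandler {
    private final InMemoryTaskStore store;
    private final Consumer<Long> processing; // placeholder for processTask

    IdBasedHandler(InMemoryTaskStore store, Consumer<Long> processing) {
        this.store = store;
        this.processing = processing;
    }

    void handle(long taskId) {
        try {
            processing.accept(taskId);
            store.save(taskId, "PROCESSED");
        } catch (RuntimeException e) {
            store.save(taskId, "ERROR");
        }
    }
}

// The scheduler holds no transaction of its own; it only hands ids over.
class IdBasedScheduler {
    private final InMemoryTaskStore store;
    private final IdBasedHandler handler;

    IdBasedScheduler(InMemoryTaskStore store, IdBasedHandler handler) {
        this.store = store;
        this.handler = handler;
    }

    void trigger() {
        for (long id : store.findIdsByStatus("OPEN")) {
            handler.handle(id);
        }
    }
}
```

Because the handler receives only an id, a failure while processing one task affects that task's status alone; the other tasks are recorded independently.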

If (and only if) the performance of the second approach is an issue, as you already mentioned you could go with asynchronous processing or even parallelize the processing to some degree. Also, you may want to take a look at extended sessions (conversations), maybe you could find it suitable for your use case.
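If throughput does become a concern, the id-based variant lends itself to parallel execution, since no managed entity has to cross a thread or transaction boundary. A minimal sketch using a plain ExecutorService rather than Spring's @Async; ParallelTaskRunner, the pool size, the timeout and the string outcomes are illustrative assumptions, and each handler call stands in for one handle(taskId) invocation:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

class ParallelTaskRunner {

    // Runs the handler once per id on a fixed-size pool and waits for all of them.
    static Map<Long, String> runAll(List<Long> ids, Consumer<Long> handler, int threads) {
        Map<Long, String> outcome = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            for (Long id : ids) {
                pool.execute(() -> {
                    try {
                        handler.accept(id);          // one independent unit of work
                        outcome.put(id, "PROCESSED");
                    } catch (RuntimeException e) {
                        outcome.put(id, "ERROR");    // a failure stays local to this id
                    }
                });
            }
        } finally {
            pool.shutdown(); // accept no new work; already submitted tasks still run
        }
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return outcome;
    }
}
```

Whether this pays off depends on how expensive processTask is compared with the extra query per task, so it is worth measuring before adopting it.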

Dragan Bozanovic answered Sep 29 '22 09:09


The current code processes the task in the nested transaction, but updates the status of the task in the outer transaction (because the Task object is managed by the outer transaction). Because these are different transactions, it is possible that one succeeds while the other fails, leaving the database in an inconsistent state. In particular, with this code, completed tasks remain in status open if processing another task throws an exception, or the server is restarted before all tasks have been processed.
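The failure mode can be made concrete with a toy model. It is not Hibernate or Spring: SplitTransactionModel and its fields are hypothetical, and the model simply assumes that work done in the inner transaction is committed at once, while status changes are buffered until the outer transaction commits.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: committed "database" state plus a buffer of outer-transaction changes.
class SplitTransactionModel {
    final Map<Long, String> committedStatus = new LinkedHashMap<>();
    final List<Long> committedWork = new ArrayList<>(); // effects committed by inner transactions

    // Processes every task in the store (all assumed OPEN).
    // failingId simulates an error that escapes the handler for that task.
    void runScheduler(long failingId) {
        Map<Long, String> pendingOuter = new HashMap<>(); // dirty state held by the outer transaction
        try {
            for (Long id : new ArrayList<>(committedStatus.keySet())) {
                if (id == failingId) {
                    throw new IllegalStateException("failure escapes the handler");
                }
                committedWork.add(id);             // inner transaction commits immediately
                pendingOuter.put(id, "PROCESSED"); // status change waits for the outer commit
            }
            committedStatus.putAll(pendingOuter);  // outer commit flushes the status changes
        } catch (IllegalStateException e) {
            // Outer rollback: pendingOuter is discarded, committedWork stays as it is.
        }
    }
}
```

In this model, if the second of two tasks fails, the first task's work is already committed but its status is still OPEN, so the next scheduler run would process it again.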

As your example shows, passing managed entities to another transaction makes it ambiguous which transaction should update these entities, and is therefore best avoided. Instead, you should be passing ids (or detached entities), and avoid unnecessary nesting of transactions.

meriton answered Sep 29 '22 07:09