(N)Hibernate "session-per-application" considered evil for specific use case?

Ok, everyone knows that a global session-per-application with (N)Hibernate is discouraged. BUT I have a very specific, apparently non-standard use case for which it seems to be the ideal solution.

To sum it up, my (server) application basically has all of its persistent data constantly in-memory and never queries the database for normal operation. The only reason for a database in the first place is so that the data survives the lifetime of the process. I only want to query the database on application startup to fetch everything into memory. The database is only about 5-10 MB realistically.

Now the problem is that if I follow the advice that sessions must be short lived, I have to merge() all my data for every business transaction or somehow manually track all changes, instead of taking advantage of NHibernate's automatic change tracking. This makes persistence very difficult to implement without causing a lot of performance overhead.
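For illustration, this is roughly the per-transaction pattern I'm trying to avoid (Order, order and sessionFactory are placeholder names):

```csharp
// using NHibernate;
// With short-lived sessions, every business transaction has to re-attach my
// detached in-memory objects, e.g. via Merge(). Merge() first SELECTs the
// current row to copy state onto a managed instance, so I pay a round-trip
// per entity instead of getting change tracking for free.
using (ISession session = sessionFactory.OpenSession())
using (ITransaction tx = session.BeginTransaction())
{
    order.Status = "Shipped";          // 'order' is detached from any session
    order = session.Merge(order);      // re-attach + copy state (extra SELECT)
    tx.Commit();
}
```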

So my question is whether there are any reasons why I shouldn't use a global session for this particular use case?

Common arguments against global sessions that I know of:

  1. First level cache will be filled with entire database over time => I don't mind that, since I actually want to have all data in memory!

  2. Stale data and concurrency problems => My application is designed so that all code that can access or modify persistent data must be single threaded (a deliberate design choice), and it is the only application that can write to the database. So this shouldn't be a problem.

  3. Session gets corrupted if it throws an exception (e.g. a DB timeout) => That's the only real problem I can see, but it can be solved by discarding the session, creating a new one and refreshing all data (see the sketch after this list). Expensive, but exceptions should be very rare and can only be caused by a major bug or major infrastructure problems, both of which should be solved ASAP.
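A rough sketch of that recovery path (globalSession, sessionFactory and allOrders are placeholder names for fields of my single-threaded persistence component):

```csharp
// using System.Linq; using NHibernate; using NHibernate.Linq;
void RecoverFromSessionFailure()
{
    try { globalSession.Dispose(); } catch { /* session is already broken */ }

    globalSession = sessionFactory.OpenSession();
    // Re-fetch everything into memory; at 5-10 MB this is expensive but
    // tolerable for an event that should be very rare.
    allOrders = globalSession.Query<Order>().ToList();
}
```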

So I believe there is no reason why I shouldn't use a global session for my particular use case. Or is there something important that I'm missing?

Update 1: It's a server application

Update 2: This doesn't imply long-lived global transactions. Transactions would still be short-lived - one long-lived session, many short-lived transactions.
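As a sketch (placeholder names):

```csharp
// using System; using NHibernate;
// One long-lived session, many short-lived transactions: the session keeps
// its change tracking across transactions; each transaction only scopes the
// actual database writes.
ISession session = sessionFactory.OpenSession();  // lives as long as the app

void RunBusinessTransaction(Action mutateInMemoryModel)
{
    using ITransaction tx = session.BeginTransaction();
    mutateInMemoryModel();  // mutate entities the session already tracks
    tx.Commit();            // flush: only dirty entities are written
}
```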

asked Aug 18 '15 by chris


3 Answers

If you fan in all transactions coming from multiple threads to a single dedicated back-end executor thread, then you can indeed use a single Session per application.
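A sketch of what such a fan-in could look like on the NHibernate side (SessionDispatcher and everything in it is made up for illustration):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using NHibernate;

sealed class SessionDispatcher : IDisposable
{
    private readonly BlockingCollection<Action<ISession>> _queue = new();
    private readonly Thread _worker;

    public SessionDispatcher(ISessionFactory factory)
    {
        _worker = new Thread(() =>
        {
            // Only this thread ever creates or touches the Session.
            using ISession session = factory.OpenSession();
            foreach (var work in _queue.GetConsumingEnumerable())
                work(session);
        });
        _worker.Start();
    }

    public void Post(Action<ISession> work) => _queue.Add(work);

    public void Dispose()
    {
        _queue.CompleteAdding();
        _worker.Join();
    }
}
```

Client threads would then call dispatcher.Post(s => { /* open a short transaction, do work, commit */ }) and never touch the Session directly.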

Exceptions can be triggered by lock timeouts, server crashes or constraint violations, and recreating the backing Session would mean discarding all the first-level cache entries, which is bad for your use case. In that case you would have to re-fetch everything from the DB, and because you use a single back-end thread, all the other client threads would be blocked while that happens, which is inconvenient.

I would advise using the second-level cache instead. You can configure the 2LC provider to store everything in memory instead of overflowing to disk. You can load all the data into the second-level cache when the application starts and use the NONSTRICT_READ_WRITE cache concurrency strategy to speed up writes (concurrency issues aren't a problem for you anyway).

You need to make sure you use second-level caching for collections too.
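If I remember the NHibernate Configuration API correctly, wiring this up could look roughly like this ("nonstrict-read-write" is NHibernate's spelling of NONSTRICT_READ_WRITE; MyApp.Order and its Lines collection are placeholder names, and HashtableCacheProvider is just the simple built-in in-memory provider, so you may prefer a production-grade provider):

```csharp
using NHibernate.Cfg;

var cfg = new Configuration().Configure();  // reads hibernate.cfg.xml

// Turn on the second-level cache with an in-memory provider.
cfg.SetProperty(Environment.UseSecondLevelCache, "true");
cfg.SetProperty(Environment.CacheProvider,
    typeof(NHibernate.Cache.HashtableCacheProvider).AssemblyQualifiedName);

// Cache the entity AND its collections; collections need their own region.
cfg.SetCacheConcurrencyStrategy("MyApp.Order", "nonstrict-read-write");
cfg.SetCollectionCacheConcurrencyStrategy("MyApp.Order.Lines",
    "nonstrict-read-write");

var sessionFactory = cfg.BuildSessionFactory();
```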

The simplest design is to use session-per-request, as the Session is fairly lightweight and will fetch data from the in-memory 2LC anyway.

You need to run some performance tests to see if it's worth reusing a Session instead of creating a new one for every single transaction. You might find out that this process is not your bottleneck anyway, and you shouldn't do any optimization without real proof.
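For example, a request handler on top of the warmed-up 2LC could look like this (Order and sessionFactory are placeholders):

```csharp
// using System; using NHibernate;
public void HandleRequest(Guid orderId)
{
    using ISession session = sessionFactory.OpenSession();  // cheap to open
    using ITransaction tx = session.BeginTransaction();

    // Get() checks the first-level cache, then the second-level cache,
    // and only hits the database on a cache miss.
    var order = session.Get<Order>(orderId);
    order.Status = "Shipped";

    tx.Commit();  // flushes the UPDATE; the cached entry is invalidated
                  // per the nonstrict-read-write strategy
}
```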

Another reason for discarding the session is that most database-related exceptions are not recoverable anyway. If the server goes down or the current request throws a constraint violation, retrying won't fix anything.

answered Sep 28 '22 by Vlad Mihalcea


One of the potential disadvantages I can see is that dirty checking may take a long time to execute; you would have to use the bytecode instrumentation mode to solve this issue.

Also, the serialized access to the server may affect performance much more than recreating the objects from the second-level cache (object creation is very fast in modern JVMs). This is true even in single-user applications (the user may trigger a long-running operation on one screen and want to do something else on another, or the server may trigger a scheduled operation, thus blocking the user from accessing the server until the operation is done).

Thirdly, re-architecting your approach later may be hard if the need arises to execute concurrent requests after all.

Next, you will not avoid going to the database anyway when executing queries: unlike lookups by id, queries bypass the first-level cache and hit the database (unless the query cache is enabled).

Finally, the extended session is a Hibernate feature that is not as commonly used as the classic session-per-request pattern. Although Hibernate is a good piece of software, it is also complex and thus has many bugs. I would expect more bugs and weaker community support/documentation for less-used features (as in every other framework).

So my suggestion is to use the second-level cache and to handle the concurrency issues with optimistic/pessimistic locks, depending on the use case. You can enable caching of all entities by default by using the DISABLE_SELECTIVE or ALL shared cache mode, as described in the docs.
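To illustrate the locking part on the NHibernate side (a sketch only, with a made-up entity):

```csharp
// using System; using NHibernate;
// Optimistic locking: an entity with a mapped version column. NHibernate
// increments it on every UPDATE and throws StaleObjectStateException when
// a concurrent change is detected.
public class Order
{
    public virtual Guid Id { get; set; }
    public virtual string Status { get; set; }
    public virtual int Version { get; set; }  // map with <version name="Version"/>
}

// Pessimistic variant: acquire a row lock while reading.
// var order = session.Get<Order>(id, LockMode.Upgrade);  // SELECT ... FOR UPDATE
```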

answered Sep 28 '22 by Dragan Bozanovic


The reasons for not using a global session can be summarized as follows:

* First level cache: You have to understand that the first level cache is not just about memory. A consequence of the first level cache is that whenever an object is saved, deleted or queried, (n)hibernate has to ensure that, prior to this event, the database is in a consistent state with your memory. That is flushing. However, (n)hibernate has one unique feature called transparent persistence: unlike other ORMs such as Entity Framework, you don't have to track what is dirty; NHibernate does that for you. But the way it works is somewhat costly. It compares all previous object states with the new ones and tries to detect what has changed. So if your first level cache is full of entities, performance will degrade. This problem can be circumvented in two ways:

1-) using session.Flush(); session.Clear();

2-) using a stateless session.

In the first case, all your pending changes go to the database. After that you can safely clear the session. However, NHibernate keeps the data in memory until the transaction is disposed, even if you clear the session. That is due to the possibility that you can still vote for the transaction to be rolled back at the end. (Can give more info on this if requested.)

In the second case, with a stateless session, (n)hibernate behaves like a micro-ORM: there is no tracking. It is lightweight but gives you more responsibility. Both workarounds are sketched below.
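(newOrder and existingOrder are placeholder objects:)

```csharp
// using NHibernate;
// (1) Periodic flush + clear so the first-level cache doesn't grow unbounded:
session.Flush();   // write all pending changes to the database
session.Clear();   // evict every entity from the first-level cache

// (2) A stateless session: no first-level cache, no dirty tracking; every
// operation is explicit.
using (IStatelessSession stateless = sessionFactory.OpenStatelessSession())
using (ITransaction tx = stateless.BeginTransaction())
{
    stateless.Insert(newOrder);
    stateless.Update(existingOrder);
    tx.Commit();
}
```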

* Session related exceptions: Another valid reason for not using an application-wide session is that whenever an exception occurs related to the session or the database (such as a unique constraint violation), your session is doomed. You cannot work with it any more because it is in an inconsistent state with the database. So your global session has to be renewed, and this brings more complications.

* Thread safety: Neither the session nor any ADO.NET construct is thread-safe. If you use a global session object along with multiple threads, you have to ensure some sort of thread safety yourself. That can be very difficult.

answered Sep 28 '22 by Onur Gumus