 

too much information in HttpSession

Hi, what do you think about this problem?

We have too much information in the HttpSession: a lot of it is computed per request, and in the end a few large object graphs need to be stored between requests.

Is it appropriate to use a cache like memcached? Or is that the same as just increasing the memory for the JVM?

We're reluctant to store it in the DB between requests. What would you use if we are getting OutOfMemory errors?

Thank you.

asked Jan 23 '23 by blefesd


2 Answers

I think the real point is the lifespan of your data.


Think about these two characteristics of the HttpSession:

  • When in a cluster, the container is responsible for replicating the HttpSession. This is good (you don't have to manage it yourself), but it can be dangerous for performance if it leads to too many exchanges... If your application is not clustered, forget about this point.
  • The lifespan of the HttpSession can be a few minutes or a few hours, that is, as long as the user stays active. This is perfect for information with that lifespan (connection information, preferences, authorizations...). But it is not appropriate for data that is only useful from one screen to the next; let's call it "transient+" data.

Storing in the database

If you have clustering needs, the database takes care of them. But beware: you can't cache anything in memory then.

It also gives an even longer lifespan (persistent between sessions, and even between reboots!), so the problem would be even worse (except that you trade a memory problem for a performance problem).

I think this is the wrong approach for data whose lifespan is not expected to be persistent ...


Transient data

If data is useful for only one request, it is typically stored in the HttpRequest, which is fine.

But if it is used across a few requests (interactions within one screen, or within a screen sequence such as a wizard...), the HttpRequest is too short-lived to hold it, while the HttpSession is too long-lived: the data needs to be cleaned up regularly.

And many memory problems in the HttpSession are related to such transient data that was never cleaned up: forgotten entirely, not cleaned when an exception occurs, or not cleaned when the user doesn't follow the regular flow (hits Back, uses an old bookmark, clicks a different menu, and so on).
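To make that failure mode concrete, here is a minimal, hypothetical sketch of the usual manual-cleanup pattern (servlet and attribute names are made up). The cleanup in the final step simply never runs if the user abandons the flow:

    import java.io.IOException;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Step 1 of a hypothetical wizard: a large object graph is parked in the session.
    class WizardStepOneServlet extends HttpServlet {
        protected void doPost(HttpServletRequest req, HttpServletResponse resp)
                throws ServletException, IOException {
            Object largeGraph = computeLargeObjectGraph(req); // expensive computation
            req.getSession().setAttribute("wizard.workingSet", largeGraph);
            resp.sendRedirect("wizardStepTwo");
        }

        private Object computeLargeObjectGraph(HttpServletRequest req) {
            return new Object(); // placeholder
        }
    }

    // Last step of the wizard: the cleanup below runs ONLY if the user reaches it.
    // If they hit Back, follow a bookmark, or an exception interrupts the flow,
    // "wizard.workingSet" stays in the session until the session expires.
    class WizardFinishServlet extends HttpServlet {
        protected void doPost(HttpServletRequest req, HttpServletResponse resp)
                throws ServletException, IOException {
            // ... persist the result ...
            req.getSession().removeAttribute("wizard.workingSet");
            resp.sendRedirect("done");
        }
    }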

Caching library to have the correct lifespan

To avoid this cleaning effort altogether (and the risk of OutOfMemory when things go wrong), you can store the information in a data structure that has the right lifespan. Since the container doesn't provide this (it is application-related anyway), you need to implement it yourself using a caching library (like the ones mentioned; we use EhCache).
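As a rough illustration only (the cache name, size limit and time-to-live below are arbitrary example values, not recommendations), a bounded, self-expiring cache with the Ehcache 2.x API could look like this:

    import net.sf.ehcache.Cache;
    import net.sf.ehcache.CacheManager;
    import net.sf.ehcache.Element;
    import net.sf.ehcache.config.CacheConfiguration;

    // A bounded, self-expiring cache for expensive results.
    class ExpensiveResultCache {

        private final Cache cache;

        ExpensiveResultCache(CacheManager cacheManager) {
            // At most 100 entries in memory (LRU eviction is the default),
            // each entry discarded 10 minutes after it was put.
            CacheConfiguration config =
                    new CacheConfiguration("expensiveResults", 100)
                            .timeToLiveSeconds(600);
            this.cache = new Cache(config);
            cacheManager.addCache(cache);
        }

        Object get(String key) {
            Element element = cache.get(key);
            return element == null ? null : element.getObjectValue();
        }

        void put(String key, Object value) {
            cache.put(new Element(key, value));
        }
    }

Because the limits live in the cache configuration, no page-level code has to remember to clean anything up.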

The idea is that you have technical code (not tied to one functional page, but implemented globally, for example in a ServletFilter...) that ensures cleanup always happens once the objects are no longer needed.

You can design this cache using one (or several, as needed) of the following policies for cleaning it. Each policy corresponds to a functional lifespan:

  • for data only related to one screen (but several requests: reloading the screen, Ajax requests...), the cache can store data for only one screen at a time (for each session); call it the "currentScreenCache". That guarantees that, if the user goes to another screen (even in an unmanaged way), the new screen will override the "currentScreenCache" information, and the previous information can be garbage-collected.

Implementation idea: each request must carry its screenId, and the technical code responsible for clearing the cache detects when, for the current HttpSession id, the current screenId doesn't match the one in the cache. It then cleans or resets that item in the cache (see the sketch after this list).

  • for data only used in a series of connected screens (call it a functional module), the same applies at the level of the module.

Implementation: same as before, every request has to carry the module id...

  • for data that is expensive to recompute, the cache library can be configured to keep only the last X computed values (older ones are considered less likely to be useful in the near future). In typical usage the same values are requested regularly, so you get many cache hits. Under intensive use, the X limit is reached and memory doesn't grow, preventing OutOfMemory errors (at the expense of recomputation the next time).

Implementation: cache libraries natively support this limiting factor (as in the EhCache sketch above), and several more...

  • for data that is only valid for a few minutes, the cache library can natively be configured to discard it after that delay...

  • ... many more, see the caching library configuration for other ideas.

Note: Each cache can be application-wide, or specific to a user, a HttpSession id, a Company id or other functional value...
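As a minimal sketch of the first policy's implementation idea: the "screenId" request parameter and the attribute names below are assumptions, and the per-screen data is kept as a session attribute here for simplicity; with a caching library, the filter would instead remove the entry keyed by the session id.

    import java.io.IOException;
    import javax.servlet.Filter;
    import javax.servlet.FilterChain;
    import javax.servlet.FilterConfig;
    import javax.servlet.ServletException;
    import javax.servlet.ServletRequest;
    import javax.servlet.ServletResponse;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpSession;

    // Global technical code: compares the screenId carried by each request with
    // the one remembered for the session, and drops the previous screen's data
    // on a mismatch. Parameter and attribute names are illustrative only.
    class CurrentScreenCacheFilter implements Filter {

        private static final String SCREEN_ID_ATTR = "currentScreenId";
        private static final String SCREEN_CACHE_ATTR = "currentScreenCache";

        public void init(FilterConfig filterConfig) {
        }

        public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
                throws IOException, ServletException {
            HttpServletRequest httpRequest = (HttpServletRequest) request;
            String screenId = httpRequest.getParameter("screenId");
            HttpSession session = httpRequest.getSession(false);

            if (session != null && screenId != null) {
                String previousScreenId = (String) session.getAttribute(SCREEN_ID_ATTR);
                if (!screenId.equals(previousScreenId)) {
                    // The user moved to another screen, possibly in an unmanaged way:
                    // the previous screen's data can now be garbage-collected.
                    session.removeAttribute(SCREEN_CACHE_ATTR);
                    session.setAttribute(SCREEN_ID_ATTR, screenId);
                }
            }
            chain.doFilter(request, response);
        }

        public void destroy() {
        }
    }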

answered Feb 06 '23 by KLE


It's true that HttpSession doesn't scale well, but that's mainly in relation to clustering. It's a convenience, but at some point, yes, you are better off using something like memcached, Terracotta, or EhCache to persist data between requests (or between users).

answered Feb 06 '23 by cletus