
Scalable HTTP session management (Java, Linux)

Is there a best practice for scalable HTTP session management?

Problem space:

  • Shopping cart kind of use case. User shops around the site, eventually checking out; session must be preserved.
  • Multiple data centers
  • Multiple web servers in each data center
  • Java, Linux

I know there are tons of ways of doing that, and I can always come up with my own specific solution, but I was wondering whether Stack Overflow's wisdom of the crowd can help me focus on best practices.

In general there seem to be a few approaches:

  • Don't keep sessions; always run stateless, religiously [doesn't work for me...]
  • Use J2EE, EJB and the rest of that gang
  • Use a database to store sessions. I suppose there are tools that make this easier so I don't have to build it all myself
  • Use memcached for storing sessions (or another kind of intermediate, semi-persistent storage)
  • Use a key-value DB, which is "more persistent" than memcached
  • Use "client-side sessions", meaning all session info lives in hidden form fields and is passed back and forth between client and server. Nothing is stored on the server.

Any suggestions? Thanks

Ran asked Jul 29 '09 07:07


3 Answers

I would go with a standard distributed cache solution. It could be the one your application server provides, it could be memcached, it could be Terracotta. It probably doesn't matter too much which one you choose, as long as you use something sufficiently popular (so you know most of the bugs have already been hunted down).
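To illustrate why the choice of cache matters less than it might seem, here is a minimal sketch of hiding the session backend behind a small interface. The names `SessionStore` and `InMemorySessionStore` are hypothetical and not taken from any library; the in-memory map is only a stand-in for a real client such as memcached or Terracotta, whose APIs are not shown here.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical abstraction: the web tier only talks to this interface,
// so the backing cache can be swapped without touching application code.
interface SessionStore {
    Optional<byte[]> load(String sessionId);
    void save(String sessionId, byte[] state);
    void delete(String sessionId);
}

// Stand-in implementation for local testing only. A real deployment would
// delegate these three calls to the chosen distributed cache.
class InMemorySessionStore implements SessionStore {
    private final Map<String, byte[]> sessions = new ConcurrentHashMap<>();

    @Override
    public Optional<byte[]> load(String sessionId) {
        byte[] state = sessions.get(sessionId);
        // Copy on the way out so callers cannot mutate stored state.
        return state == null ? Optional.empty() : Optional.of(state.clone());
    }

    @Override
    public void save(String sessionId, byte[] state) {
        sessions.put(sessionId, state.clone());
    }

    @Override
    public void delete(String sessionId) {
        sessions.remove(sessionId);
    }
}
```

With this shape, moving from one cache product to another would mean writing one new `SessionStore` implementation rather than changing every place that reads or writes session state.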

As for your other ideas:

  • Don't keep sessions - as you said, not possible
  • Client-side sessions - too insecure. Suppose someone tampers with the cookie to put discounted prices in the shopping cart
  • Use a database - databases are usually the hardest bottleneck to solve, so don't put any more there than you absolutely have to.
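The tampering concern above can be made concrete. One common mitigation, which the answer itself does not propose, is to sign any value the client holds so the server can detect changes. The sketch below uses the JDK's `javax.crypto.Mac` with HMAC-SHA256; the class name `SignedValue`, the `payload.signature` layout, and the sample payload are assumptions made for illustration. Signing only detects tampering. It does not hide the data, and the server should still look up real prices itself.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper: appends an HMAC to a client-held value and checks it later.
class SignedValue {
    private static final String ALGORITHM = "HmacSHA256";
    private final byte[] key;

    SignedValue(byte[] key) {
        this.key = key.clone();
    }

    // Produces "<payload>.<base64url signature>".
    String sign(String payload) {
        String signature = Base64.getUrlEncoder().withoutPadding().encodeToString(mac(payload));
        return payload + "." + signature;
    }

    // Returns the payload if the signature matches, otherwise null.
    String verify(String signed) {
        int dot = signed.lastIndexOf('.');
        if (dot < 0) {
            return null;
        }
        String payload = signed.substring(0, dot);
        byte[] given;
        try {
            given = Base64.getUrlDecoder().decode(signed.substring(dot + 1));
        } catch (IllegalArgumentException e) {
            return null; // signature part is not valid base64url
        }
        // MessageDigest.isEqual compares the two byte arrays.
        return MessageDigest.isEqual(given, mac(payload)) ? payload : null;
    }

    private byte[] mac(String payload) {
        try {
            Mac mac = Mac.getInstance(ALGORITHM);
            mac.init(new SecretKeySpec(key, ALGORITHM));
            return mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC-SHA256 unavailable", e);
        }
    }
}
```

A server would call `sign` before writing the cookie or hidden field and `verify` on every request, rejecting the request when `verify` returns null. The key must stay on the server and be shared by all web servers that need to validate the value.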

Those are my 2 cents :)

Regarding multiple data centers - you will want each session to have affinity to the data center it started in. I don't think there are any distributed cache solutions that work well across different data centers.
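One way to get that kind of affinity, sketched here under assumptions rather than taken from the answer, is to embed the originating data center in the session ID so a router or load balancer can send later requests back to it. The class name `AffinitySessionId`, the `~` separator, and the data center names are all hypothetical.

```java
import java.util.UUID;

// Hypothetical scheme: "<dataCenter>~<random id>".
// Data center names must not contain '~' for the parsing below to work.
class AffinitySessionId {
    private static final char SEPARATOR = '~';

    // Creates a new session ID tagged with the data center that issued it.
    static String create(String dataCenter) {
        return dataCenter + SEPARATOR + UUID.randomUUID();
    }

    // Returns the data center that issued the ID, or the fallback
    // when the ID carries no data center prefix.
    static String homeOf(String sessionId, String fallback) {
        int separator = sessionId.indexOf(SEPARATOR);
        return separator > 0 ? sessionId.substring(0, separator) : fallback;
    }
}
```

A request arriving at the wrong data center could then be proxied or redirected to `homeOf(...)`, so session state never has to be replicated between data centers.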

Gregory Mostizky answered Nov 07 '22 22:11


You seem to have missed out vanilla replicated http sessions from your list. Any servlet container worth its salt supports replication of sessions across the cluster. As long as the items you put into the session aren't huge, and are serializable, then it's very easy to make it work.

http://tomcat.apache.org/tomcat-6.0-doc/cluster-howto.html
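The serializability requirement is easy to check before deploying to a cluster. The sketch below defines a hypothetical `Cart` session attribute and a helper that serializes it to bytes and back with standard Java serialization, which is roughly what a container must do to copy session state to another node. The class and method names are illustrative and not part of any container API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical session attribute. Every field must itself be serializable.
class Cart implements Serializable {
    private static final long serialVersionUID = 1L;

    // SKU -> quantity. LinkedHashMap, String and Integer are all Serializable.
    private final Map<String, Integer> items = new LinkedHashMap<>();

    void add(String sku, int quantity) {
        items.merge(sku, quantity, Integer::sum);
    }

    int quantityOf(String sku) {
        return items.getOrDefault(sku, 0);
    }
}

class SessionCopy {
    // Serializes the value to a byte array and reads it back as a new object.
    // Fails with an exception if any reachable field is not serializable.
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Running every session attribute through a round trip like this in a unit test catches non-serializable fields early, and the size of the byte array gives a rough idea of how much data each session would push across the cluster.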

edit: It seems, however, that Tomcat session replication doesn't scale well to large clusters. For that, I would suggest using JBoss+Tomcat, which offers "buddy replication":

http://www.jboss.org/community/wiki/BuddyReplicationandSessionData

skaffman answered Nov 07 '22 23:11


I haven't personally managed such clusters, but when I took a J2EE course at university, the lecturer said to store sessions in a database and not to try caching them. (You can't meaningfully cache dynamic pages anyway.) HTTP sessions depend on the client by definition, since the session ID is normally carried in a cookie. If the client refuses to store cookies (e.g. because they're wary of tracking), they can't have a cookie-based session, although servlet containers can fall back to URL rewriting. You can get this ID by calling HttpSession.getId().

Of course the database is a bottleneck, so you'll end up with two clusters: an application server cluster and a database cluster.

As far as I know, both stateful session beans and regular servlet HTTP sessions exist only in memory, with no load balancing built in.

By the way, I wouldn't store e-mail addresses or usernames in a hidden field, but maybe the contents of the cart aren't that sensitive.

zslevi answered Nov 07 '22 22:11