
Best practice for integrating CherryPy web-framework, SQLAlchemy sessions and lighttpd to serve a high-load webservice

I'm developing a CherryPy FastCGI server behind lighttpd, with the setup below, so that I can use SQLAlchemy ORM sessions inside CherryPy controllers. However, when I run stress tests with 14 concurrent requests for about 500 loops, after a while it starts to raise errors such as AttributeError: '_ThreadData' object has no attribute 'scoped_session_class' in open_dbsession() or AttributeError: 'Request' object has no attribute 'scoped_session_class' in close_dbsession(). The overall error rate is around 50%.

This happens only when I run the server behind lighttpd, not when it is run directly through cherrypy.engine.start(). I have confirmed that connect() isn't raising exceptions.

I also tried assigning the return value of scoped_session to GlobalSession (like it does here), but that produced errors such as UnboundExecutionError and other SQLAlchemy-level errors. (Concurrency: 10, loops: 1000, error rate: 16%. This occurs even when the server is run directly.)

There are several possible causes, but I lack the knowledge to pick one.
1. Are start_thread subscriptions unreliable in a FastCGI environment? It seems that open_dbsession() is called before connect().
2. Does cherrypy.thread_data get cleared for some reason?
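
The first AttributeError is consistent with a thread that never ran its initializer: cherrypy.thread_data is thread-local storage, so an attribute set in one thread is not visible in another. The following is a minimal sketch of that behaviour using only the standard library; thread_data here is a plain threading.local standing in for cherrypy.thread_data, and connect(), worker() and the placeholder value are hypothetical names for illustration, not the code from the question.

```python
import threading

# Stand-in for cherrypy.thread_data (assumption: plain threading.local).
thread_data = threading.local()


def connect():
    # Hypothetical per-thread initializer, playing the role of a
    # start_thread listener. The string is only a placeholder value.
    thread_data.scoped_session_class = "session-factory-placeholder"


def worker(run_connect, results, key):
    # Optionally initialize this thread, then record whether the
    # attribute is visible from inside this thread.
    if run_connect:
        connect()
    results[key] = hasattr(thread_data, "scoped_session_class")


results = {}
threads = [
    threading.Thread(target=worker, args=(True, results, "initialized")),
    threading.Thread(target=worker, args=(False, results, "uninitialized")),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)
```

A thread that skipped connect() sees no scoped_session_class attribute at all, which is the same shape as the error in open_dbsession(). This does not show why a thread would skip the initializer under lighttpd, only what the symptom looks like when it does.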

server code

import cherrypy
import sqlalchemy as sa
from sqlalchemy.orm import sessionmaker, scoped_session

# dburi is defined elsewhere in the application.
engine = sa.create_engine(dburi, strategy="threadlocal")
GlobalSession = sessionmaker(bind=engine, transactional=False)

def connect(thread_index):
    # Runs once per worker thread: create a thread-local session registry.
    cherrypy.thread_data.scoped_session_class = scoped_session(GlobalSession)

def open_dbsession():
    # Expose the thread's session registry on the current request.
    cherrypy.request.scoped_session_class = cherrypy.thread_data.scoped_session_class

def close_dbsession():
    # Discard the session at the end of the request.
    cherrypy.request.scoped_session_class.remove()


cherrypy.tools.dbsession_open = cherrypy.Tool('on_start_resource', open_dbsession)
cherrypy.tools.dbsession_close = cherrypy.Tool('on_end_resource', close_dbsession)
cherrypy.engine.subscribe('start_thread', connect)

lighttpd fastcgi config

...
var.server_name = "test"
var.server_root = "/path/to/root"
var.svc_env = "test"
fastcgi.server = (
  "/" => (
    "cherry.fcgi" => (
      "bin-path" => server_root + "/fcgi_" + server_name + ".fcgi",
      "bin-environment" => (
        "SVC_ENV" => svc_env
      ),
      "bin-copy-environment" => ("PATH", "LC_CTYPE"),
      "socket" => "/tmp/cherry_" + server_name + "." + svc_env + ".sock",
      "check-local" => "disable",
      "disable-time"    => 1,
      "min-procs"       => 1,
      "max-procs"       => 4,
    ),
  ),
)

edits

  • Restored the missing thread_index argument in the code example from the original source code (thanks to the comment)
  • Clarified that errors do not occur immediately
  • Narrowed down the conditions to lighttpd

asked Mar 09 '09 07:03 by ento


1 Answer

If you look at plugins.ThreadManager.acquire_thread, you'll see the line self.bus.publish('start_thread', i), where i is the index of the newly seen thread. Any listener subscribed to the start_thread channel needs to accept that i value as a positional argument, so rewrite your connect function to read: def connect(i):

My guess is that it's failing silently somehow; I'll see if I can track that down, test and fix it.
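
To make the signature requirement concrete, here is a small sketch of a publish/subscribe bus that passes a positional index to its listeners. MiniBus is a simplified stand-in written for this example, not CherryPy's actual bus, and connect_without_arg is a hypothetical listener showing the mismatched signature.

```python
class MiniBus:
    """Simplified stand-in for a publish/subscribe bus (not CherryPy's code)."""

    def __init__(self):
        self.listeners = {}

    def subscribe(self, channel, callback):
        # Register a callback on the named channel.
        self.listeners.setdefault(channel, []).append(callback)

    def publish(self, channel, *args):
        # Call every listener with the published positional arguments.
        return [callback(*args) for callback in self.listeners.get(channel, [])]


def connect(i):
    # Accepts the thread index, matching publish('start_thread', i).
    return "connected thread %d" % i


def connect_without_arg():
    # Hypothetical listener that forgets the positional argument.
    return "connected"


bus = MiniBus()
bus.subscribe("start_thread", connect)
ok = bus.publish("start_thread", 0)

bad_bus = MiniBus()
bad_bus.subscribe("start_thread", connect_without_arg)
try:
    bad_bus.publish("start_thread", 0)
    error = None
except TypeError as exc:
    # The listener was called with one argument it does not accept.
    error = exc
```

In this sketch the mismatch surfaces as a TypeError at publish time. A bus that caught and only logged listener errors would leave the thread uninitialized without an obvious failure, which would fit the "failing silently" guess above.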

answered Sep 28 '22 20:09 by fumanchu