Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to improve SQLAlchemy performance?

I made a client application that uses HTTP to communicate with a Python 2 server using a simple API. The server uses SQLAlchemy's ORM quite extensively to serve the data for those HTTP requests. The problem is that my CPU usage is quite high even with only few active clients. This server should be able to serve a few hundred clients at the same time at around 1 request a second per client so it should still be manageable (or so I hope).

How can I improve the performance? I know the problem is the ORM as cProfile shows this quite clearly. A single query apparently executes around 10000 Python instructions which seems quite odd. I tried plugging in different database engines/backends and changed the interpreter to Pypy just for fun but it obviously didn't help the original problem and also didn't improve performance.

What am I doing wrong here? I really hope this is a "well, duh!" problem.

Should my relationships be of a different type? eager, lazy, dynamic, etc? Right now, I don't specify anything in particular.

Help greatly appreciated.

like image 869
svenstaro Avatar asked Apr 03 '12 18:04

svenstaro


1 Answers

This is a very broad question, but there are certainly some principles to adhere to when using an ORM. ORMs are expensive and can be resource intensive; just think of all the stuff going on under the covers to assemble and maintain object relationships as data is ingested into the model! How does the ORM know when to fetch data? Should it eagerly load and build all relationships when you touch the head of the object, or should it only eager load the thing you always need in that critical tight loop in your initial queries? This is where zzzeek's answer comes in; go look at the query logging and see what the ORM did. Then you'll say, oh crap!, why is it doing all that when I only need this one thing, and you'll go study eager vs. lazy loading, update your model and voila!, you'll speed up your ORM by 100x. You can also go the other way and tell it to lazy load everything, but then you may end up observing a bunch of very specific queries instead of a single bulk query, so something else is going to slow down. The general idea is to carve a path so that you don't grab bulk data until you need it, but do everything in your power to do bulk queries when you can to avoid zillions of single queries.
https://dev.to/tinazhouhui/introduction-to-object-relational-mapping-the-what-why-when-and-how-of-orm-nb2

like image 198
Kevin Fredericksen Avatar answered Oct 29 '22 01:10

Kevin Fredericksen