Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to reuse / clone an sqlalchemy query

It seems to me like going through the whole process of creating the expression tree and then creating a query from it again is a wasted time when using sqlalchemy. Apart from an occasional dynamic query, almost everything will be exactly the same during the whole life of an application (apart from the parameters of course).

Is there any way to just save a query once it's created and reuse it later on with different parameters? Or maybe there's some internal mechanism which already does something similar?

like image 944
viraptor Avatar asked Nov 30 '22 01:11

viraptor


2 Answers

It seems to me like going through the whole process of creating the expression tree and then creating a query from it again is a wasted time when using sqlalchemy.

Do you have any estimates on how much time is wasted, compared to the rest of the application? Profiling here is extremely important before making your program more complex. As I will often note, Reddit serves well over one billion page views a day, they use the SQLAlchemy Core to query their database, and the last time I looked at their code they make no attempt to optimize this process - they build expression trees on the fly and compile each time. We have had users that have determined that their specific system actually benefits from optimiztions in these areas, however.

I've written up some background on profiling here: How can I profile a SQLAlchemy powered application?

Is there any way to just save a query once it's created and reuse it later on with different parameters? Or maybe there's some internal mechanism which already does something similar?

There are several methods, depending on what APIs you're using and what areas you'd like to optimize.

There's two main portions to rendering SQL - there's the construction of the expression tree, so to speak, and then the compilation of the string from the expression tree.

The tree itself, which can either be a select() construct if using Core or a Query() if using ORM, can be reused. A select() especially has nothing associated with it that prevents it from being reused as often as you like (same for insert(), delete(), update(), etc.).

In the ORM, a Query also can be used with different sessions using the with_session() method. The win here is not as much, as Query() still produces an entire select() each time it is invoked. However as we'll see below there is a recipe that can allow this to be cached.

The next level of optimization involves the caching of the actual SQL text rendered. This is an area where a little more care is needed, as the SQL we generate is specific to the target backend; there are also edge cases where various parameterizations change the SQL itself (such as using "TOP N ROWS" with SQL Server, we can't use a bound parameter there). Caching of the SQL strings is provided using the execution_options() method of Connection, which is also available in a few other places, setting the compiled_cache feature by sending it a dictionary or other dict-like object which will cache the string format of statements, keyed to the dialect, the identity of the construct, and the parameters sent. This feature is normally used by the ORM for insert/update/delete statements.

There's a recipe I've posted which integrates the compiled_cache feature with the Query, at BakedQuery. By taking any Query and saying query.bake(), you can now run that query with any Session and as long as you don't call any more chained methods on it, it should use a cached form of the SQL string each time:

q = s.query(Foo).filter(Foo.data==bindparam('foo')).bake()

for i in range(10):
    result = q.from_session(s).params(foo='data 12').all()

It's experimental, and not used very often, but it's exactly what you're asking for here. I'd suggest you tailor it to your needs, keep an eye on memory usage as you use it and make sure you follow how it works.

like image 197
zzzeek Avatar answered Dec 01 '22 14:12

zzzeek


SQLAlchemy 1.0 introduced the baked extension which is a cache designed specifically for caching Query objects. It caches the object's construction and string-compilation steps to minimize the overhead of the Python interpreter on repetitive queries.

Official docs here: http://docs.sqlalchemy.org/en/latest/orm/extensions/baked.html

Note that it does NOT in cache the result set returned by the database. For that, you'll want to check out the dogpile.cache:

http://docs.sqlalchemy.org/en/latest/orm/examples.html#module-examples.dogpile_caching

like image 35
Jeff Widman Avatar answered Dec 01 '22 14:12

Jeff Widman