Using session.query to read uncommitted data in SQLAlchemy

Summary

I'm trying to write integration tests against a series of database operations, and I want to be able to use a SQLAlchemy session as a staging environment in which to validate and roll back a transaction.

Is it possible to retrieve uncommitted data using session.query(Foo) instead of session.execute(text('select * from foo'))?

Background and Research

These results were observed using SQLAlchemy 1.2.10, Python 2.7.13, and Postgres 9.6.11.

I've looked at related StackOverflow posts but haven't found an explanation as to why the two operations below should behave differently.

  • SQLalchemy: changes not committing to db

    • Tried with and without session.flush() before every session.query. No success.
  • sqlalchemy update not commiting changes to database. Using single connection in an app

    • Checked to make sure I am using the same session object throughout
  • Sqlalchemy returns different results of the SELECT command (query.all)

    • N/A: My target workflow is to assess a series of CRUD operations within the staging tables of a single session.
  • Querying objects added to a non committed session in SQLAlchemy

    • Seems to be the most related issue, but my motivation for avoiding session.commit() is different, and I didn't quite find the explanation I'm looking for.

Reproducible Example

1) I establish a connection to the database and define a model object; no issues so far:

from sqlalchemy import text
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, Integer, String, ForeignKey

#####
# Prior DB setup:
# CREATE TABLE foo (id int PRIMARY KEY, label text);
#####

# from https://docs.sqlalchemy.org/en/13/orm/mapping_styles.html#declarative-mapping
Base = declarative_base()

class Foo(Base):
    __tablename__ = 'foo'
    id = Column(Integer, primary_key=True)
    label = Column(String)

# from https://docs.sqlalchemy.org/en/13/orm/session_basics.html#getting-a-session
some_engine = create_engine('postgresql://username:password@endpoint/database')
Session = sessionmaker(bind=some_engine)

2) I perform some updates without committing the result, and I can see the staged data by executing a select statement within the session:

session = Session()
sql_insert = text("INSERT INTO foo (id, label) VALUES (1, 'original')")
session.execute(sql_insert)
sql_read = text("SELECT * FROM foo WHERE id = 1")
res = session.execute(sql_read).first()
print res.label

sql_update = text("UPDATE foo SET label = 'updated' WHERE id = 1")
session.execute(sql_update)
res2 = session.execute(sql_read).first()
print res2.label

sql_update2 = text("""
INSERT INTO foo (id, label) VALUES (1, 'second_update')
ON CONFLICT (id) DO UPDATE
    SET (label) = (EXCLUDED.label)
""")
session.execute(sql_update2)
res3 = session.execute(sql_read).first()
print res3.label
session.rollback()

# prints expected values: 'original', 'updated', 'second_update'

3) I attempt to replace select statements with session.query, but I can't see the new data:

session = Session()
sql_insert = text("INSERT INTO foo (id, label) VALUES (1, 'original')")
session.execute(sql_insert)
res = session.query(Foo).filter_by(id=1).first()
print res.label

sql_update = text("UPDATE foo SET label = 'updated' WHERE id = 1")
session.execute(sql_update)
res2 = session.query(Foo).filter_by(id=1).first()
print res2.label

sql_update2 = text("""
INSERT INTO foo (id, label) VALUES (1, 'second_update')
ON CONFLICT (id) DO UPDATE
    SET (label) = (EXCLUDED.label)
""")
session.execute(sql_update2)
res3 = session.query(Foo).filter_by(id=1).first()
print res3.label
session.rollback()
# prints: 'original', 'original', 'original'

I expect the printed output of Step 3 to be 'original', 'updated', 'second_update'.

David Brakman asked Jun 18 '19 00:06

1 Answer

The root cause is that raw SQL statements and the ORM do not mix automatically in this case. The Session is not a cache in the sense of caching query results, but it does store objects by primary key in its identity map. When a Query returns a row for a mapped object that is already in the identity map, the existing object is returned and its already loaded attributes are not overwritten. This is why you do not see the changes you made in step 3. This might seem like a rather poor way to handle the situation, but SQLAlchemy operates on certain assumptions about transaction isolation, as described in "When to Expire or Refresh":

Transaction Isolation

...[So] as a best guess, it assumes that within the scope of a transaction, unless it is known that a SQL expression has been emitted to modify a particular row, there’s no need to refresh a row unless explicitly told to do so.
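The identity-map behaviour described above can be illustrated with a toy model. This is only a sketch using the standard-library sqlite3 module, not SQLAlchemy's implementation: ToySession, get_foo and expire are hypothetical names, and real SQLAlchemy reloads expired attributes on attribute access rather than on the next query.

```python
import sqlite3


class Foo(object):
    def __init__(self, id, label):
        self.id = id
        self.label = label


class ToySession(object):
    """Hypothetical stand-in for a Session: a connection plus an identity map."""

    def __init__(self, conn):
        self.conn = conn
        self.identity_map = {}  # primary key -> Foo instance
        self.expired = set()    # primary keys whose attributes must be reloaded

    def get_foo(self, id):
        row = self.conn.execute(
            "SELECT id, label FROM foo WHERE id = ?", (id,)
        ).fetchone()
        if row is None:
            return None
        existing = self.identity_map.get(row[0])
        if existing is not None:
            # The row was fetched, but loaded attributes are kept as they are
            # unless the object was explicitly expired.
            if row[0] in self.expired:
                existing.label = row[1]
                self.expired.discard(row[0])
            return existing
        obj = Foo(*row)
        self.identity_map[obj.id] = obj
        return obj

    def expire(self, obj):
        self.expired.add(obj.id)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, label TEXT)")
session = ToySession(conn)

conn.execute("INSERT INTO foo (id, label) VALUES (1, 'original')")
first = session.get_foo(1)

# A raw UPDATE that the toy session knows nothing about.
conn.execute("UPDATE foo SET label = 'updated' WHERE id = 1")
second = session.get_foo(1)
stale_label = second.label   # 'original': same object, attributes untouched

session.expire(first)
third = session.get_foo(1)
fresh_label = third.label    # 'updated': reloaded after the explicit expire
```

All three lookups return the same Python object; only the explicit expire makes the new column value visible.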

The whole note about transaction isolation is worth reading in full. To make such changes known to SQLAlchemy, perform updates through the Query API where possible, and manually expire the changed objects if all else fails. With this in mind, your step 3 could look like:

session = Session()
sql_insert = text("INSERT INTO foo (id, label) VALUES (1, 'original')")
session.execute(sql_insert)
res = session.query(Foo).filter_by(id=1).first()
print(res.label)

session.query(Foo).filter_by(id=1).update({Foo.label: 'updated'},
                                          synchronize_session='fetch')
# This query is actually redundant, `res` and `res2` are the same object
res2 = session.query(Foo).filter_by(id=1).first()
print(res2.label)

sql_update2 = text("""
INSERT INTO foo (id, label) VALUES (1, 'second_update')
ON CONFLICT (id) DO UPDATE
    SET label = EXCLUDED.label
""")
session.execute(sql_update2)
session.expire(res)
# Again, this query is redundant and fetches the same object that needs
# refreshing anyway
res3 = session.query(Foo).filter_by(id=1).first()
print(res3.label)
session.rollback()
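
Besides session.expire(), Session.refresh() and Query.populate_existing() also force already loaded objects to be reloaded. The following is a minimal sketch rather than a drop-in replacement for the code above. It assumes an in-memory SQLite database in place of the Postgres instance from the question so that it runs standalone, and it uses a plain UPDATE instead of the ON CONFLICT upsert for portability.

```python
from sqlalchemy import Column, Integer, String, create_engine, text
from sqlalchemy.orm import sessionmaker

try:
    from sqlalchemy.orm import declarative_base  # SQLAlchemy 1.4+
except ImportError:
    from sqlalchemy.ext.declarative import declarative_base  # older releases

Base = declarative_base()


class Foo(Base):
    __tablename__ = 'foo'
    id = Column(Integer, primary_key=True)
    label = Column(String)


# Assumption: in-memory SQLite stands in for the question's Postgres database.
engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

session = Session()
session.execute(text("INSERT INTO foo (id, label) VALUES (1, 'original')"))
foo = session.query(Foo).filter_by(id=1).first()
labels = [foo.label]

# Raw SQL update: the session is not told that this row changed.
session.execute(text("UPDATE foo SET label = 'updated' WHERE id = 1"))
stale = session.query(Foo).filter_by(id=1).first()
labels.append(stale.label)  # still 'original', `stale` is the same object as `foo`

# Option 1: reload one object explicitly.
session.refresh(foo)
labels.append(foo.label)

# Option 2: have the query overwrite objects already in the identity map.
session.execute(text("UPDATE foo SET label = 'second_update' WHERE id = 1"))
fresh = session.query(Foo).populate_existing().filter_by(id=1).first()
labels.append(fresh.label)

session.rollback()
remaining = session.query(Foo).count()  # the uncommitted row is gone

print(labels)
```

For a test suite that mixes many raw statements with ORM reads, calling session.expire_all() after each raw statement is a blunter alternative that avoids tracking individual objects.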
Ilja Everilä answered Oct 16 '22 19:10
