I have a function that runs multiple queries on the same dataset, and I want to ensure that all of the queries see exactly the same data.
In terms of SQL, this means the REPEATABLE READ isolation level for databases that support it. I don't mind a higher level, or even a complete lockdown, if the database isn't capable of that.
As far as I can see, that isn't what happens by default. For example, if I run something like this code in one Python shell:
import time
from django.db import transaction

with transaction.atomic():
    for t in range(60):
        print("{0}: {1}".format(t, MyModel.objects.count()))
        time.sleep(1)
As soon as I do MyModel.objects.create(...) in another shell, the value seen by the running loop increases immediately, which is exactly what I want to avoid. Further tests show the behavior matches the READ COMMITTED level, which is too lax for my tastes.
I'd also like to stress that I want the stricter isolation level only for a single function, not for the whole project.
What are my best options to achieve this?
In my particular case, the only database I care about is PostgreSQL 9.3+, but I also want some compatibility with SQLite3, in which case even locking the whole database completely is okay with me. Still, the more general the solution, the better.
Some background on what REPEATABLE READ means. It is a stricter isolation level than READ COMMITTED: in addition to the guarantees of READ COMMITTED, it guarantees that data a transaction has already read cannot change underneath it; if the transaction reads the same data again, it finds the previously read data in place, unchanged, and available to read. A nonrepeatable read is the anomaly this prevents: it occurs when a transaction reads the same row twice but gets different data each time, for example when transaction 1 reads a row, then transaction 2 updates or deletes that row and commits. Some engines implement REPEATABLE READ with the same visibility checks as READ COMMITTED plus a read lock on every record the transaction reads; PostgreSQL instead serves the whole transaction from a snapshot taken at its first query. In PostgreSQL, REPEATABLE READ provides a rigorous guarantee that each transaction sees a completely stable view of the database, although that view is not necessarily consistent with some serial (one at a time) execution of concurrent transactions of the same level; for that you would need SERIALIZABLE.
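To see the difference concretely, here is a minimal sketch using two raw psycopg2 connections (the DSN and the mymodel table are placeholders, not anything from the question): the REPEATABLE READ transaction keeps seeing its original snapshot even after another session commits an insert.

import psycopg2
import psycopg2.extensions

conn1 = psycopg2.connect(dbname='mydb')  # placeholder DSN
conn2 = psycopg2.connect(dbname='mydb')
conn1.set_session(
    isolation_level=psycopg2.extensions.ISOLATION_LEVEL_REPEATABLE_READ)

cur1 = conn1.cursor()
cur1.execute('SELECT count(*) FROM mymodel')  # first query fixes the snapshot
before = cur1.fetchone()[0]

cur2 = conn2.cursor()
cur2.execute('INSERT INTO mymodel DEFAULT VALUES')  # another session writes
conn2.commit()

cur1.execute('SELECT count(*) FROM mymodel')  # still the old snapshot
assert cur1.fetchone()[0] == before
conn1.commit()  # only a new transaction would see the insert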
You're right, the default transaction isolation level in PostgreSQL is READ COMMITTED. You can easily change it in settings to test whether a stricter level fits your needs: https://docs.djangoproject.com/en/1.8/ref/databases/#isolation-level
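For reference, a minimal sketch of that project-wide option (the database name is a placeholder); note that it changes the default for every connection, which is why it is mostly useful for testing here:

# settings.py
import psycopg2.extensions

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'mydb',  # placeholder
        'OPTIONS': {
            'isolation_level':
                psycopg2.extensions.ISOLATION_LEVEL_REPEATABLE_READ,
        },
    },
}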
I also doubt you will run into performance issues: PostgreSQL handles transactions very efficiently, even in SERIALIZABLE mode. And MySQL defaults to REPEATABLE READ without it hurting performance either.
In any case, you can set the isolation level manually whenever you need it: http://initd.org/psycopg/docs/extensions.html#isolation-level-constants
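If you go that route from Django, you can reach the underlying psycopg2 connection through django.db.connection. A sketch, with the caveat that set_session() only works while no transaction is open, and that with persistent connections you should restore the previous level afterwards:

import psycopg2.extensions
from django.db import connection

connection.ensure_connection()   # make sure the psycopg2 connection exists
pg_conn = connection.connection  # the raw psycopg2 connection
pg_conn.set_session(
    isolation_level=psycopg2.extensions.ISOLATION_LEVEL_REPEATABLE_READ)
try:
    pass  # ... run the function that needs the stable view ...
finally:
    # restore the default so a reused connection doesn't keep the level
    pg_conn.set_session(
        isolation_level=psycopg2.extensions.ISOLATION_LEVEL_READ_COMMITTED)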
To set a custom transaction isolation level for a single block, you can do something like this:
from django.db import connection, transaction

with transaction.atomic():
    cursor = connection.cursor()
    # Must run before the transaction's first query; PostgreSQL rejects
    # a change of isolation level after the first query has executed.
    cursor.execute('SET TRANSACTION ISOLATION LEVEL REPEATABLE READ')
    # ... queries that need the stable snapshot ...
I would also suggest changing the default mode in settings first (if you can). Then, if it fits your needs, you can drop the global setting and adjust the code only in the places that need it.
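Putting it together, here is a sketch of a small helper (the name is mine, not a Django API) that requests REPEATABLE READ on PostgreSQL for a single block and simply relies on SQLite's transaction behavior elsewhere, since an open SQLite transaction already keeps a stable view of the database:

from contextlib import contextmanager
from django.db import connection, transaction

@contextmanager
def stable_reads():
    with transaction.atomic():
        if connection.vendor == 'postgresql':
            cursor = connection.cursor()
            cursor.execute('SET TRANSACTION ISOLATION LEVEL REPEATABLE READ')
        # On SQLite, an open transaction already holds a stable view,
        # so nothing extra is needed here.
        yield

Usage, with the loop from the question:

with stable_reads():
    for t in range(60):
        print("{0}: {1}".format(t, MyModel.objects.count()))
        time.sleep(1)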