Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do I force Django to ignore any caches and reload data?

I'm using the Django database models from a process that's not called from an HTTP request. The process is supposed to poll for new data every few seconds and do some processing on it. I have a loop that sleeps for a few seconds and then gets all unhandled data from the database.

What I'm seeing is that after the first fetch, the process never sees any new data. I ran a few tests and it looks like Django is caching results, even though I'm building new QuerySets every time. To verify this, I did this from a Python shell:

>>> MyModel.objects.count() 885 # (Here I added some more data from another process.) >>> MyModel.objects.count() 885 >>> MyModel.objects.update() 0 >>> MyModel.objects.count() 1025 

As you can see, adding new data doesn't change the result count. However, calling the manager's update() method seems to fix the problem.

I can't find any documentation on that update() method and have no idea what other bad things it might do.

My question is, why am I seeing this caching behavior, which contradicts what Django docs say? And how do I prevent it from happening?

like image 811
scippy Avatar asked Jul 27 '10 17:07

scippy


People also ask

Does Django cache by default?

Django has a default caching system in the form of local-memory caching. It is very powerful and robust. This system can handle multi-threaded processes and is efficient. It is best for those projects which cannot use Memcached framework.

How does Django handle cache and memory management?

To use cache in Django, first thing to do is to set up where the cache will stay. The cache framework offers different possibilities - cache can be saved in database, on file system or directly in memory. Setting is done in the settings.py file of your project.

Why Django Querysets are lazy?

This is because a Django QuerySet is a lazy object. It contains all of the information it needs to populate itself from the database, but will not actually do so until the information is needed.


2 Answers

Having had this problem and found two definitive solutions for it I thought it worth posting another answer.

This is a problem with MySQL's default transaction mode. Django opens a transaction at the start, which means that by default you won't see changes made in the database.

Demonstrate like this

Run a django shell in terminal 1

>>> MyModel.objects.get(id=1).my_field u'old' 

And another in terminal 2

>>> MyModel.objects.get(id=1).my_field u'old' >>> a = MyModel.objects.get(id=1) >>> a.my_field = "NEW" >>> a.save() >>> MyModel.objects.get(id=1).my_field u'NEW' >>>  

Back to terminal 1 to demonstrate the problem - we still read the old value from the database.

>>> MyModel.objects.get(id=1).my_field u'old' 

Now in terminal 1 demonstrate the solution

>>> from django.db import transaction >>>  >>> @transaction.commit_manually ... def flush_transaction(): ...     transaction.commit() ...  >>> MyModel.objects.get(id=1).my_field u'old' >>> flush_transaction() >>> MyModel.objects.get(id=1).my_field u'NEW' >>>  

The new data is now read

Here is that code in an easy to paste block with docstring

from django.db import transaction  @transaction.commit_manually def flush_transaction():     """     Flush the current transaction so we don't read stale data      Use in long running processes to make sure fresh data is read from     the database.  This is a problem with MySQL and the default     transaction mode.  You can fix it by setting     "transaction-isolation = READ-COMMITTED" in my.cnf or by calling     this function at the appropriate moment     """     transaction.commit() 

The alternative solution is to change my.cnf for MySQL to change the default transaction mode

transaction-isolation = READ-COMMITTED 

Note that that is a relatively new feature for Mysql and has some consequences for binary logging / slaving. You could also put this in the django connection preamble if you wanted.

Update 3 years later

Now that Django 1.6 has turned on autocommit in MySQL this is no longer a problem. The example above now works fine without the flush_transaction() code whether your MySQL is in REPEATABLE-READ (the default) or READ-COMMITTED transaction isolation mode.

What was happening in previous versions of Django which ran in non autocommit mode was that the first select statement opened a transaction. Since MySQL's default mode is REPEATABLE-READ this means that no updates to the database will be read by subsequent select statements - hence the need for the flush_transaction() code above which stops the transaction and starts a new one.

There are still reasons why you might want to use READ-COMMITTED transaction isolation though. If you were to put terminal 1 in a transaction and you wanted to see the writes from the terminal 2 you would need READ-COMMITTED.

The flush_transaction() code now produces a deprecation warning in Django 1.6 so I recommend you remove it.

like image 147
Nick Craig-Wood Avatar answered Oct 08 '22 09:10

Nick Craig-Wood


We've struggled a fair bit with forcing django to refresh the "cache" - which it turns out wasn't really a cache at all but an artifact due to transactions. This might not apply to your example, but certainly in django views, by default, there's an implicit call to a transaction, which mysql then isolates from any changes that happen from other processes ater you start.

we used the @transaction.commit_manually decorator and calls to transaction.commit() just before every occasion where you need up-to-date info.

As I say, this definitely applies to views, not sure whether it would apply to django code not being run inside a view.

detailed info here:

http://devblog.resolversystems.com/?p=439

like image 36
hwjp Avatar answered Oct 08 '22 10:10

hwjp