Django model count() with caching

I have a Django application with Prometheus monitoring and a model called Sample.

I want to export Sample.objects.count() as a metric and cache this value for a fixed time interval, to avoid costly COUNT(*) queries against the database.

From this tutorial https://github.com/prometheus/client_python#custom-collectors I read that I need to write a custom collector.

What is the best approach to achieve this? Is there any way in Django to get a cached value of Sample.objects.count() and refresh it every K seconds? I also use Redis in my application. Should I store this value there? Should I create a separate thread to update the cached Sample.objects.count() value?

asked Dec 30 '16 by domandinho



1 Answer

First thing to note is that you don't really need to cache the result of a count(*) query yourself.

Though different RDBMSs handle count operations differently, they are slow across the board for large tables. One thing they have in common, though, is that each provides an alternative to SELECT COUNT(*) which is, in effect, a cached result. Well, sort of.

You haven't mentioned what your RDBMS is, so let's see how it goes in the popular ones used with Django.

mysql

Provided you have a primary key on your table and you are using MyISAM, SELECT COUNT(*) is really fast on mysql and scales well, because MyISAM stores the exact row count. But chances are that you are using Innodb, and that's the right storage engine for various reasons: Innodb is transaction aware and can't handle COUNT(*) as well as MyISAM, so the query slows down as the table grows.

On a table with 2M records the count query took 0.2317 seconds, while the following query took 0.0015 seconds:

SELECT table_rows FROM information_schema.tables 
WHERE table_name='for_count';

It reported 1997289 rows instead of 2 million, because for Innodb this figure is an estimate, but that's close enough!

So you don't need your own caching system.

Sqlite

Sqlite COUNT(*) queries aren't really slow, but they don't scale either: as the table grows, the count query slows down with it. Using a table similar to the one used for mysql, SELECT COUNT(*) FROM for_count took 0.042 seconds to complete.

There isn't a shortcut. The sqlite_master table does not provide row counts, and neither does PRAGMA table_info.

You need your own system to cache the result of SELECT COUNT(*)
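Since sqlite is the odd one out, here is a minimal sketch of the kind of time-bounded cache the question asks about. It is plain Python (in a real Django app you would likely reach for django.core.cache or your existing Redis instead); the class name, ttl value, and count_fn are my own placeholders, with count_fn standing in for Sample.objects.count:

```python
import time


class TTLCountCache:
    """Cache the result of an expensive count() for `ttl` seconds.

    Pure-Python sketch of the pattern; in Django you would normally
    use the cache framework (e.g. cache.get_or_set) instead.
    """

    def __init__(self, count_fn, ttl=60):
        self.count_fn = count_fn   # the expensive COUNT(*) callable
        self.ttl = ttl             # refresh interval in seconds
        self._value = None
        self._expires = 0.0        # monotonic deadline; 0 forces a first run

    def get(self):
        now = time.monotonic()
        if now >= self._expires:
            # stale (or never computed): run the real count and remember it
            self._value = self.count_fn()
            self._expires = now + self.ttl
        return self._value
```

Calling get() twice within the TTL runs the underlying count only once; after ttl seconds the next get() refreshes it.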

Postgresql

Despite being the most feature-rich open source RDBMS, postgresql isn't good at handling count(*): it's slow and doesn't scale very well. In other words, no different from the poor relations!

The count query took 0.194 seconds on postgresql. On the other hand, the following query took 0.003 seconds:

SELECT reltuples FROM pg_class WHERE relname = 'for_count'

Note that reltuples is an estimate refreshed by VACUUM and ANALYZE, so it can lag behind the true count.

You don't need your own caching system.

SQL Server

The COUNT query on SQL server took 0.160 seconds on average, but it fluctuated rather wildly. For all the databases discussed here, the first count(*) query was rather slow, but subsequent queries were faster because the file had been cached by the operating system.

I am not an expert on SQL server, so before answering this question I didn't know how to look up the row count using schema info. I found this Q&A helpful. One of the queries I tried produced the result in 0.004 seconds:

SELECT t.name, s.row_count from sys.tables t
JOIN sys.dm_db_partition_stats s
ON t.object_id = s.object_id
AND t.type_desc = 'USER_TABLE'
AND t.name ='for_count'
AND s.index_id = 1

You don't need your own caching system.

Integrate into Django

As can be seen, all the databases considered except sqlite provide a built-in 'cached query count'. There isn't a need for us to create one of our own; it's a simple matter of creating a custom manager to make use of this functionality.

from django.db import models, connection


class CustomManager(models.Manager):

    def quick_count(self):
        # mysql: read the (approximate) row count from information_schema
        # instead of running a full COUNT(*)
        with connection.cursor() as cursor:
            cursor.execute(
                """SELECT table_rows FROM information_schema.tables
                   WHERE table_name = %s""",
                [self.model._meta.db_table],
            )
            row = cursor.fetchone()
        return row[0]


class Sample(models.Model):
    ....
    objects = CustomManager()

The above example uses the mysql query, but the same manager works for postgresql or sql server by simply swapping in the corresponding query from above.
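If you need one manager that supports several backends, a small lookup keyed on Django's connection.vendor string ('mysql', 'postgresql', 'sqlite', ...) keeps the dispatch tidy. A sketch, with the table of queries taken from the sections above (the helper name is my own):

```python
# Fast-count SQL per backend, keyed by Django's connection.vendor value.
# sqlite is absent on purpose: it has no shortcut, so the caller should
# fall back to a real COUNT(*) (ideally behind a cache).
FAST_COUNT_SQL = {
    "mysql": (
        "SELECT table_rows FROM information_schema.tables "
        "WHERE table_name = %s"
    ),
    "postgresql": "SELECT reltuples FROM pg_class WHERE relname = %s",
}


def fast_count_sql(vendor):
    """Return the cheap row-count query for `vendor`, or None when the
    backend offers no shortcut (e.g. sqlite)."""
    return FAST_COUNT_SQL.get(vendor)
```

Inside quick_count() you would check fast_count_sql(connection.vendor) and fall back to self.model.objects.count() when it returns None.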

Prometheus

How to plug this into django prometheus? I leave that as an exercise.
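To sketch the exercise anyway: following the custom-collector pattern from the client_python README linked in the question, a collector can yield a gauge built from quick_count(). This assumes prometheus_client is installed; the class name, metric name, and count_fn parameter are my own placeholders:

```python
from prometheus_client.core import GaugeMetricFamily


class SampleCountCollector:
    """Custom collector: collect() runs on every scrape, so pair it
    with a cheap query (or a cached value) from the sections above."""

    def __init__(self, count_fn):
        self.count_fn = count_fn  # e.g. Sample.objects.quick_count

    def collect(self):
        # emit one gauge sample holding the current row count
        yield GaugeMetricFamily(
            "django_sample_rows",
            "Approximate number of rows in the Sample table",
            value=self.count_fn(),
        )


# In a Django app you would register it once at startup, e.g.:
# from prometheus_client.core import REGISTRY
# REGISTRY.register(SampleCountCollector(Sample.objects.quick_count))
```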

answered Sep 22 '22 by e4c5