
Django .order_by() with .distinct() using postgres

I have a Read model that is related to an Article model. What I would like to do is make a queryset where articles are unique and ordered by date_added. Since I'm using postgres, I'd prefer to use the .distinct() method and specify the article field. Like so:

articles = Read.objects.order_by('article', 'date_added').distinct('article')

However, this doesn't give the desired effect: the queryset comes back in the order the records were created. I am aware of the note about .distinct() and .order_by() in Django's documentation, but I don't think it applies here, since the side effect it describes is duplicate rows and I'm not seeing any.

# To actually sort by date added I end up doing this
articles = sorted(articles, key=lambda x: x.date_added, reverse=True)

This executes the entire query before I actually need it and could potentially get very slow if there are lots of records. I've already optimized using select_related().

Is there a better, more efficient, way to create a query with uniqueness of a related model and order_by date?

UPDATE: The output would ideally be a queryset of Read instances in which each related article appears only once, built using only the Django ORM (i.e. without sorting in Python).

asked May 06 '26 04:05 by ender


1 Answer

Is there a better, more efficient, way to create a query with uniqueness of a related model and order_by date?

Possibly. It's hard to say without the full picture, but my assumption is that you are using Read to track which articles have and have not been read, and probably tying this to a User instance to determine whether a particular user has read an article. If that's the case, your approach is flawed. Instead, you should do something like:

class Article(models.Model):
    ...
    read_by = models.ManyToManyField(User, related_name='read_articles')

Then, to get a particular user's read articles, you can just do:

user_instance.read_articles.order_by('date_added')

That takes the need to use distinct out of the equation, since there will not be any duplicates now.

UPDATE

To get all articles that are read by at least one user:

# distinct() collapses the one-row-per-reader duplicates produced by the join
Article.objects.filter(read_by__isnull=False).distinct()

Or, if you want to set a threshold for popularity, you can use annotations:

from django.db.models import Count

Article.objects.annotate(read_count=Count('read_by')).filter(read_count__gte=10)

Which would give you only articles that have been read by at least 10 users.
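
If the Read model has to stay as it is, one possible way to get what the question asks for is to split the work into two queries. This is only a sketch, not part of the approach above. PostgreSQL's DISTINCT ON requires the ORDER BY to start with the distinct columns, so the inner query picks the newest Read per article and the outer query is then free to order by date_added. The helper name latest_read_per_article is hypothetical, and the sketch assumes the model has the article and date_added fields from the question. On older Django versions the ordering inside an __in subquery may be dropped, so it is worth checking the generated SQL with str(queryset.query).

```python
def latest_read_per_article(read_model):
    """Hypothetical helper: one row per article (its newest read), newest first.

    Assumes a PostgreSQL backend (DISTINCT ON) and a model that has
    `article` and `date_added` fields, as in the question.
    """
    # Inner query: DISTINCT ON (article_id) keeps the first row of each
    # article group, so each group is ordered by date_added descending
    # to make that first row the newest read.
    newest_ids = (
        read_model.objects
        .order_by('article_id', '-date_added')
        .distinct('article_id')
        .values('pk')
    )
    # Outer query: no DISTINCT ON here, so any ordering is allowed.
    # The result is still a lazy queryset; nothing is sorted in Python.
    return read_model.objects.filter(pk__in=newest_ids).order_by('-date_added')
```

It could then be used as latest_read_per_article(Read).select_related('article'). Using 'article_id' rather than 'article' in order_by() avoids pulling in any default ordering defined on the Article model, which would otherwise no longer match the distinct() field.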

answered May 09 '26 01:05 by Chris Pratt


