Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Query all rows and return most recent of each duplicate

I have a model that has an id that isn't unique. Each model also has a date. I would like to return all results but only the most recent of each row that shares ids. The model looks something like this:

class MyModel(models.Model):
    my_id = models.PositiveIntegerField()
    date  = models.DateTimeField()
    title = models.CharField(max_length=36)


## Add some entries
m1 = MyModel(my_id=1, date=yesterday, title='stop')
m1.save()

m2 = MyModel(my_id=1, date=today, title='go')
m2.save()

m3 = MyModel(my_id=2, date=today, title='hello')
m3.save()

Now try to retrieve these results:

MyModel.objects.all()... # then limit duplicate my_id's by most recent

Results should be only m2 and m3

like image 849
Scott Avatar asked Jan 22 '23 05:01

Scott


1 Answers

You won't be able to do this with just the ORM, you'll need to get all the records, and then discard the duplicates in Python.

For example:

objs = MyModel.objects.all().order_by("-date")
seen = set()
keep = []
for o in objs:
    if o.id not in seen:
        keep.append(o)
        seen.add(o.id)

Here's some custom SQL that can get what you want from the database:

select * from mymodel where (id, date) in (select id, max(date) from mymodel group by id)

You should be able to adapt this to use in the ORM.

like image 112
Ned Batchelder Avatar answered Feb 06 '23 22:02

Ned Batchelder