I've got this model:
class Visit(models.Model): timestamp = models.DateTimeField(editable=False) ip_address = models.IPAddressField(editable=False)
If a user visits multiple times in one day, how can I filter for unique rows based on the ip field? (I want the unique visits for today)
today = datetime.datetime.today() yesterday = datetime.datetime.today() - datetime.timedelta(days=1) visits = Visit.objects.filter(timestamp__range=(yesterday, today)) #.something?
EDIT:
I see that I can use:
Visit.objects.filter(timestamp__range=(yesterday, today)).values('ip_address')
to get a ValuesQuerySet of just the ip fields. Now my QuerySet looks like this:
[{'ip_address': u'127.0.0.1'}, {'ip_address': u'127.0.0.1'}, {'ip_address': u'127.0.0.1'}, {'ip_address': u'127.0.0.1'}, {'ip_address': u'127.0.0.1'}]
How do I filter this for uniqueness without evaluating the QuerySet and taking the db hit?
# Hope it's something like this... values.distinct().count()
If you want to get distinct objects, instead of values, then remove flat=True from the above query, and use values() instead of values_list(). In the above code, we add distinct() at the end of queryset to get distinct values.
Yes, you can reuse existing querysets. This is not really making anything faster though, in fact, this code block won't even execute a query against the database because Django QuerySets are lazily evaluated. What I means is that it won't send the query to the database until you actually need the values.
A QuerySet is a collection of data from a database. A QuerySet is built up as a list of objects. QuerySets makes it easier to get the data you actually need, by allowing you to filter and order the data.
The SQL SELECT DISTINCT Statement The SELECT DISTINCT statement is used to return only distinct (different) values. Inside a table, a column often contains many duplicate values; and sometimes you only want to list the different (distinct) values.
What you want is:
Visit.objects.filter(stuff).values("ip_address").annotate(n=models.Count("pk"))
What this does is get all ip_addresses and then it gets the count of primary keys (aka number of rows) for each ip address.
With Alex Answer I also have the n:1 for each item. Even with a distinct() clause.
It's weird because this is returning the good numbers of items :
Visit.objects.filter(stuff).values("ip_address").distinct().count()
But when I iterate over "Visit.objects.filter(stuff).values("ip_address").distinct()" I got much more items and some duplicates...
EDIT :
The filter clause was causing me troubles. I was filtering with another table field and a SQL JOIN was made that was breaking the distinct stuff. I used this hint to see the query that was really used :
q=Visit.objects.filter(myothertable__field=x).values("ip_address").distinct().count() print q.query
I then reverted the class on witch I was making the query and the filter to have a join that doesn't rely on any "Visit" id.
hope this helps
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With