Assuming I have two models:
class Profile(models.Model):
#some fields here
class Ratings(models.Model):
profile = models.ForeignKey(profile)
category = models.IntegerField()
points = models.IntegerField()
Assuming following examle of MySQL table "ratings":
profile | category | points
1 1 10
1 1 4
1 2 10
1 3 0
1 4 10
1 4 10
1 4 10
1 5 0
I have following values in my POST data and also other fields values:
category_1_avg_val = 7
category_2_avg_val = 5
category_3_avg_val = 5
category_4_avg_val = 7
category_5_avg_val = 9
I want to filter profiles that have the average ratings calculated for categories higher or equal to required values.
Some filters are applied initially as:
q1 = [('associated_with', search_for),
('profile_type__slug__exact', profile_type),
('gender__in', gender),
('rank__in', rank),
('styles__style__in', styles),
('age__gte', age_from),
('age__lte', age_to)]
q1_list = [Q(x) for x in q1 if x[1]]
q2 = [('user__first_name__icontains', search_term),
('user__last_name__icontains', search_term),
('profile_type__name__icontains', search_term),
('styles__style__icontains', search_term),
('rank__icontains', search_term)]
q2_list = [Q(x) for x in q2 if x[1]]
if q1_list:
objects = Profile.objects.filter(
reduce(operator.and_, q1_list))
if q2_list:
if objects:
objects = objects.filter(
reduce(operator.or_, q2_list))
else:
objects = Profile.objects.filter(
reduce(operator.or_, q2_list))
if order_by_ranking_level == 'desc':
objects = objects.order_by('-ranking_level').distinct()
else:
objects = objects.order_by('ranking_level').distinct()
Now i want to filter profiles whose (average of points) (group by category) >= (avg values of category coming in post)
I tried to do this one by one as
objects = objects.filter(
ratings__category=1) \
.annotate(avg_points=Avg('ratings__points'))\
.filter(avg_points__gte=category_1_avg_val)
objects = objects.filter(
ratings__category=2) \
.annotate(avg_points=Avg('ratings__points'))\
.filter(avg_points__gte=category_2_avg_val)
But this is wrong I think. Please help me out. If return is a queryset that would be great.
Edited
Using the answer posted by hynekcer
I came up with slightly different solution as I have already queryset of profiles which needs to be filtered more based on rating.
def check_ratings_avg(pr, rtd):
ok = True
qr = Ratings.objects.filter(profile__id=pr.id) \
.values('category')\
.annotate(points_avg=Avg('points'))
qr = {i['category']:i['points_avg'] for i in qr}
for cat in rtd:
val = rtd[cat]
if qr[cat] >= val:
pass
else:
ok = False
break
return ok
rtd = {1: category_1_avg_val, 2: category_2_avg_val, 3: category_3_avg_val,
4: category_4_avg_val, 5: category_5_avg_val}
objects = [i for i in objects if check_ratings_avg(i, rtd)]
Your complex query require a subquery in the principle. Possible solutions are:
A) Subquery emulated by Python
from django.db.models import Q, Avg
from itertools import groupby
from myapp.models import Profile, Ratings
def iterator_filtered_by_average(dictionary):
qr = Ratings.objects.values('profile', 'category', 'points').order_by(
'profile', 'category').annotate(points_avg=Avg('points'))
f = Q()
for k, v in dictionary.iteritems():
f |= Q(category=k, points_avg__gte=v)
for profile, grp in groupby(qr.filter(f).values('profile')):
if len(list(grp)) == len(dictionary):
yield profile
#example
FILTER_DATA = {1:category_1_avg_val, 2:category_2_avg_val, 3:category_3_avg_val,
4:category_4_avg_val, 5:category_5_avg_val}
for row in iterator_filtered_by_average(FILTER_DATA):
print row
This is a simple solution for the original question without later additional requirements.
B) Solution with subqueries:
It is necessary for the more detailed version of question because if the initial filters are based on some field of type ManyToManyField
and also because it contains a distinct
clause:
# objects: QuerySet that you get from your initial filters. Not yet executed.
if rtd:
# Method `as_nested_sql` removes the `order_by` clase, unlike `as_sql`
subquery3 = objects.values('id').query \
.get_compiler(connection=connection).as_nested_sql()
subquery2 = ("""SELECT profile_id, category, avg(points) AS points_avg
FROM myapp_ratings
WHERE profile_id in
( %s
) GROUP BY profile_id, category
""" % subquery3[0], subquery3[1]
)
where_sql = ' OR '.join(
'category = %d AND points_avg >= %%s' % cat for cat in rtd.keys()
)
subquery = (
"""SELECT profile_id
FROM
( %s
) subquery2
WHERE %s
GROUP BY profile_id
HAVING count(*) = %s
""" % (subquery2[0], where_sql, len(rtd)),
subquery2[1] + tuple(rtd.values())
)
assert order_by_ranking_level in ('asc', 'desc')
mainquery = ("""SELECT myapp_profile.* FROM myapp_profile
INNER JOIN
( %s
) subquery ON subquery.profile_id=myapp_profile.id
ORDER BY ranking_level %s"""
% (subquery[0], order_by_ranking_level), subquery[1]
)
objects = Profile.objects.raw(mainquery[0], params=mainquery[1])
return objects
Replace please all strings myapp
by name_of_your_application
.
Example of SQL generated by this code
SELECT myapp_profile.* FROM myapp_profile
INNER JOIN
( SELECT profile_id
FROM
( SELECT profile_id, category, avg(points) AS points_avg
FROM myapp_ratings
WHERE profile_id IN
( SELECT U0.`id` FROM `myapp_profile` U0 WHERE U0.`ranking_level` >= 4
) GROUP BY profile_id, category
) subquery2
WHERE category = 1 AND points_avg >= 7 OR category = 2 AND points_avg >= 5
OR category = 3 AND points_avg >= 5 OR category = 4 AND points_avg >= 7
OR category = 5 AND points_avg >= 9
GROUP BY profile_id
HAVING count(*) = 5
) subquery ON subquery.profile_id=myapp_profile.id
ORDER BY ranking_level asc
(This SQL is for better readability parsed manually with strings %s
replaced by parameters, however the database engine receive parameters unparsed for security reasons.)
Your problem is due to little support of subqueries generated by Django. Only examples from documentation of more complicated queries create a subquery. (e.g. aggregate
after annotate
or count
after annotate
or aggregate
after distinct
, but no annotate
after distinct
or after annotate
) Complicated nested aggregations are simplified to one query which is unexpected.
All other solutions that execute a new individual SQL query for every object filtered by the first query are discouraged for production although they can be very useful for testing results of any better solution.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With