Django REST framework Group by fields and add extra contents

I have a Ticket booking model

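from django.contrib.auth.models import User  # assuming the stock auth User model
from django.db import models
from django.utils import timezone

# Day and CHOICE_TIME are defined elsewhere (not shown).
# on_delete is required since Django 2.0; CASCADE was the implicit default before.

class Movie(models.Model):
    name = models.CharField(max_length=254, unique=True)

class Show(models.Model):
    day = models.ForeignKey(Day, on_delete=models.CASCADE)
    time = models.TimeField(choices=CHOICE_TIME)
    movie = models.ForeignKey(Movie, on_delete=models.CASCADE)

class MovieTicket(models.Model):
    show = models.ForeignKey(Show, on_delete=models.CASCADE)
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    booked_at = models.DateTimeField(default=timezone.now)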

I would like to filter MovieTicket by its user field, group the results by their show field, and order the groups by the most recent booking time. The response should be JSON, returned through Django REST framework, like this:

[
    {
        show: 4,
        movie: "Lion king",
        time: "07:00 pm",
        day: "23 Apr 2017",
        total_tickets: 2
    },
    {
        show: 7,
        movie: "Gone girl",
        time: "02:30 pm",
        day: "23 Apr 2017",
        total_tickets: 1
    }
]

I tried this way:

>>> MovieTicket.objects.filter(user=23).order_by('-booked_at').values('show').annotate(total_tickets=Count('show'))
<QuerySet [{'total_tickets': 1, 'show': 4}, {'total_tickets': 1, 'show': 4}, {'total_tickets': 1, 'show': 7}]>

But it's not grouping by show. Also, how can I add other related fields (i.e., show__movie__name, show__day__date, show__time)?

Aamu asked Apr 23 '17 17:04


2 Answers

I will explain this more generally, using a graph of the database model. The approach applies to any "GROUP BY" query with extra content.

          +-------------------------+
          | MovieTicket (booked_at) |
          +-----+--------------+----+
                |              |
      +---------+--------+  +--+---+
      |    Show (time)   |  | User |
      ++----------------++  +------+
       |                |
+------+-------+  +-----+------+
| Movie (name) |  | Day (date) |
+--------------+  +------------+

The question is: how do you summarize MovieTicket (the topmost object), grouped by Show (one related object) and filtered by User (another related object), while reporting details from deeper related objects (Movie and Day) and sorting the results by a field aggregated from the topmost model within each group (here, the booking time of the most recent MovieTicket in the group)?

The answer, explained in more general steps:

  • Start with the topmost model:
    (MovieTicket.objects ...)
  • Apply filters:
    .filter(user=user)
  • It is important to group by the pk of the nearest related models (at least those that are not made constant by the filter). Here that is only "Show", because the "User" object is already filtered down to one user:
    .values('show_id')
    Even if all the other fields (show__movie__name, show__day__date, show__time) were unique together, it is better for the database engine's optimizer to group the query by show_id, because all these other fields depend on show_id and cannot change the number of groups.
  • Annotate necessary aggregation functions:
    .annotate(total_tickets=Count('show'), last_booking=Max('booked_at'))
  • Add required dependent fields:
    .values('show_id', 'show__movie__name', 'show__day__date', 'show__time')
  • Sort what is necessary:
    .order_by('-last_booking') (descending from the latest to the oldest)
    It is very important not to output or sort by any field of the topmost model without wrapping it in an aggregation function. (Min and Max are good for sampling a value from a group.) Every field not wrapped in an aggregation is added to the "GROUP BY" list, and that breaks the intended groups. Several tickets to the same show, e.g. for friends, may be booked gradually, but they should be counted together and reported by the latest booking.
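
As a rough illustration of that last point, here is a sketch that uses Python's built-in sqlite3 module instead of Django. The table name ticket and the sample rows are made up for the example; only the GROUP BY behaviour is meant to carry over. Sorting by the raw booked_at column pulls it into GROUP BY and splits the groups, while wrapping it in MAX() keeps one row per show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ticket (id INTEGER PRIMARY KEY, show_id INTEGER, "
    "user_id INTEGER, booked_at TEXT)"
)
# Hypothetical sample data: user 23 booked show 4 twice and show 7 once.
conn.executemany(
    "INSERT INTO ticket (show_id, user_id, booked_at) VALUES (?, ?, ?)",
    [
        (4, 23, "2017-04-23 10:00:00"),
        (7, 23, "2017-04-23 11:00:00"),
        (4, 23, "2017-04-23 12:00:00"),
    ],
)

# Raw column in ORDER BY: it ends up in GROUP BY, so every ticket is its own group.
broken = conn.execute(
    "SELECT show_id, COUNT(show_id) FROM ticket WHERE user_id = 23 "
    "GROUP BY show_id, booked_at ORDER BY booked_at DESC"
).fetchall()

# Column wrapped in MAX(): one group per show, sorted by the latest booking.
grouped = conn.execute(
    "SELECT show_id, COUNT(show_id), MAX(booked_at) AS last_booking "
    "FROM ticket WHERE user_id = 23 "
    "GROUP BY show_id ORDER BY last_booking DESC"
).fetchall()

print(broken)   # three rows, each with a count of 1
print(grouped)  # two rows, show 4 counted twice
```

The first result has the same shape as the output in the question (three groups with one ticket each), which suggests that .order_by('-booked_at') is what broke the grouping there.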

Put it together:

from django.db.models import Count, Max

qs = (MovieTicket.objects
      .filter(user=user)
      .values('show_id', 'show__movie__name', 'show__day__date', 'show__time')
      .annotate(total_tickets=Count('show'), last_booking=Max('booked_at'))
      .order_by('-last_booking')
      )

The queryset can easily be converted to JSON as zaphod100.10 demonstrated in his answer, or, for people not interested in Django REST framework, directly like this:

from collections import OrderedDict
import json

print(json.dumps([
    OrderedDict([
        ('show', x['show_id']),
        ('movie', x['show__movie__name']),
        # time and date objects are not JSON serializable, so format them first
        ('time', x['show__time'].strftime('%I:%M %p').lower()),  # e.g. "07:00 pm"
        ('day', x['show__day__date'].strftime('%d %b %Y')),      # e.g. "23 Apr 2017"
        ('total_tickets', x['total_tickets']),
        # field 'last_booking' is unused
    ]) for x in qs
]))

Verify the query:

>>> print(str(qs.query))
SELECT app_movieticket.show_id, app_movie.name, app_day.date, app_show.time,
    COUNT(app_movieticket.show_id) AS total_tickets,
    MAX(app_movieticket.booked_at) AS last_booking
FROM app_movieticket
INNER JOIN app_show ON (app_movieticket.show_id = app_show.id)
INNER JOIN app_movie ON (app_show.movie_id = app_movie.id)
INNER JOIN app_day ON (app_show.day_id = app_day.id)
WHERE app_movieticket.user_id = 23
GROUP BY app_movieticket.show_id, app_movie.name, app_day.date, app_show.time
ORDER BY last_booking DESC

Notes:

  • The graph of models is similar to a ManyToMany relationship, but MovieTickets are individual objects and probably hold seat numbers.

  • It would be easy to get a similar report for multiple users in one query: the field 'user_id' and the user's name would be added to "values(...)".

  • The related model Day is not intuitive, but it clearly has a field date and hopefully also some non-trivial fields, perhaps important for scheduling shows around events like cinema holidays. It would be useful to make the field 'date' the primary key of the Day model and thereby spare a relationship lookup in many queries like this one.
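
The multi-user variant from the second note can be sketched at the SQL level with Python's built-in sqlite3 module. The table ticket and its rows are hypothetical; the point is only that dropping the user filter and adding user_id to the grouping yields one row per (user, show) pair, which is roughly what adding 'user_id' to values(...) would be expected to produce.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ticket (show_id INTEGER, user_id INTEGER, booked_at TEXT)"
)
# Hypothetical sample data for two users.
conn.executemany(
    "INSERT INTO ticket VALUES (?, ?, ?)",
    [
        (4, 23, "2017-04-23 10:00:00"),
        (4, 23, "2017-04-23 12:00:00"),
        (7, 23, "2017-04-23 11:00:00"),
        (4, 42, "2017-04-23 09:00:00"),
    ],
)

# No WHERE on user_id; user_id joins show_id in GROUP BY instead.
rows = conn.execute(
    "SELECT user_id, show_id, COUNT(show_id) AS total_tickets, "
    "MAX(booked_at) AS last_booking "
    "FROM ticket GROUP BY user_id, show_id "
    "ORDER BY user_id, last_booking DESC"
).fetchall()

for row in rows:
    print(row)
```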

(All the important parts of this answer can be found in the two oldest answers, by Todor and zaphod100.10. Unfortunately those answers were never combined, and nobody except me has up-voted them, even though the question has many up-votes.)

hynekcer answered Sep 19 '22 05:09


I would like to filter MovieTicket with its user field and group them according to its show field, and order them by the recent booked time.

This queryset will give you exactly what you want:

from django.db.models import Max

tickets = (MovieTicket.objects
            .filter(user=request.user)
            .values('show')
            .annotate(last_booking=Max('booked_at'))
            .order_by('-last_booking')
)

And respond back with json data using Django REST framework like this: [ { show: 4, movie: "Lion king", time: "07:00 pm", day: "23 Apr 2017", total_tickets: 2 }, { show: 7, movie: "Gone girl", time: "02:30 pm", day: "23 Apr 2017", total_tickets: 1 } ]

Well, this JSON data is not the same as the query you described. You can add total_tickets by extending the annotation, and add show__movie__name to the .values clause. This changes the grouping to show + movie name, but since a show has only one movie name, it won't matter.

However, you cannot add show__day__date and show__time, because one show has multiple date-times, so which one would you want from a group? You could, for example, fetch the maximum day and time, but this does not guarantee that a show actually takes place at that day and time, because these are separate fields, unrelated to each other. So the final attempt may look like:

from django.db.models import Count, Max

tickets = (MovieTicket.objects
            .filter(user=request.user)
            .values('show', 'show__movie__name')
            .annotate(
                last_booking=Max('booked_at'),
                total_tickets=Count('pk'),
                last_day=Max('show__day'),
                last_time=Max('show__time'),
            )
            .order_by('-last_booking')
)
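
The caveat about taking the maximum day and the maximum time separately can be shown with a few lines of plain Python. The (day, time) pairs below are made up; the sketch only demonstrates that two independent maxima may combine into a pair that never occurs in the data.

```python
from datetime import date, time

# Hypothetical (day, time) pairs belonging to the rows of one group.
slots = [
    (date(2017, 4, 23), time(19, 0)),
    (date(2017, 4, 24), time(14, 30)),
]

# Each maximum is computed on its own column, as Max() does per field.
last_day = max(d for d, _ in slots)
last_time = max(t for _, t in slots)

combined = (last_day, last_time)
print(combined in slots)  # False: 24 Apr at 19:00 is not a real slot
```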
Todor answered Sep 18 '22 05:09