Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Converting Django QuerySet to pandas DataFrame

I am going to convert a Django QuerySet to a pandas DataFrame as follows:

qs = SomeModel.objects.select_related().filter(date__year=2012) q = qs.values('date', 'OtherField') df = pd.DataFrame.from_records(q) 

It works, but is there a more efficient way?

like image 497
Franco Mariluis Avatar asked Jul 28 '12 02:07

Franco Mariluis


People also ask

Is pandas faster than Django ORM?

Well, Pandas is more performant because it's operating on data in memory. Django ORM is not as performant because it's operating on data that's loaded in over a network. With that said, although they are both operating on data, they solve different problems and could even be used in a complementary way.

How does QuerySet work in Django?

A QuerySet represents a collection of objects from your database. It can have zero, one or many filters. Filters narrow down the query results based on the given parameters. In SQL terms, a QuerySet equates to a SELECT statement, and a filter is a limiting clause such as WHERE or LIMIT .

Can we use pandas in Django?

In this tutorial, you will learn how to use pandas in Django data. And convert a query set of data into a Data frame. Like how you convert a CSV data file into a Data Frame. And perform the data science operation right away in Django Views.

What is Django filter?

Django-filter is a generic, reusable application to alleviate writing some of the more mundane bits of view code. Specifically, it allows users to filter down a queryset based on a model's fields, displaying the form to let them do this. Adding a FilterSet with filterset_class. Using the filterset_fields shortcut.


2 Answers

import pandas as pd import datetime from myapp.models import BlogPost  df = pd.DataFrame(list(BlogPost.objects.all().values())) df = pd.DataFrame(list(BlogPost.objects.filter(date__gte=datetime.datetime(2012, 5, 1)).values()))  # limit which fields df = pd.DataFrame(list(BlogPost.objects.all().values('author', 'date', 'slug'))) 

The above is how I do the same thing. The most useful addition is specifying which fields you are interested in. If it's only a subset of the available fields you are interested in, then this would give a performance boost I imagine.

like image 62
lexual Avatar answered Oct 04 '22 17:10

lexual


Django Pandas solves this rather neatly: https://github.com/chrisdev/django-pandas/

From the README:

class MyModel(models.Model):     full_name = models.CharField(max_length=25)     age = models.IntegerField()     department = models.CharField(max_length=3)     wage = models.FloatField()  from django_pandas.io import read_frame qs = MyModel.objects.all() df = read_frame(qs) 
like image 36
David Watson Avatar answered Oct 04 '22 16:10

David Watson