Django has this complex ORM built into it, but after spending a lot of time on it, I still find it hard to write queries that are remarkably simple in SQL. There are even some simple things that I can't find a way to do through the Django ORM (e.g. 'select distinct column1 from tablename').
Is there any documentation that shows "For common SQL statements, here is how you do it in django"?
(I did try google first, but either it isn't out there or I just can't think of the right query...)
There are some things that are ridiculously simple in SQL that are difficult or impossible through an ORM. This is called the "object-relational impedance mismatch." Essentially an ORM treats each row in a database as a separate object. So operations that involve treating values separately from their row become fairly challenging. Recent versions of Django (1.1+) improve this situation somewhat with aggregation support, but for many things, only SQL will work.
To this end, Django provides several ways of dropping down into raw SQL quite simply. Some of them return model objects as results, while others take you all the way down to your DBAPI2 connector. The lowest-level approach looks like this:
from django.db import connection
cursor = connection.cursor()
cursor.execute("SELECT DISTINCT column1 FROM tablename")
row = cursor.fetchone()
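To see what that cursor pattern does without setting up a Django project, here is a minimal sketch using Python's built-in sqlite3 module, which follows the same DB-API 2 interface. The in-memory database and the sample rows are assumptions made up for illustration; only the table and column names come from the example above.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for Django's connection.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()

# Hypothetical table and data, only for this sketch.
cursor.execute("CREATE TABLE tablename (column1 TEXT)")
cursor.executemany(
    "INSERT INTO tablename (column1) VALUES (?)",
    [("a",), ("b",), ("a",), ("c",)],
)

# Same statement as above, with ORDER BY added so the result order is stable.
cursor.execute("SELECT DISTINCT column1 FROM tablename ORDER BY column1")
rows = cursor.fetchall()  # a list of 1-tuples, one per distinct value
distinct_values = [row[0] for row in rows]
print(distinct_values)

conn.close()
```

Note that the cursor hands back plain tuples, not model objects, which is the trade-off of going this low.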
If you want to return a queryset from a SQL query, use the raw() method on your model's manager:
# raw() requires the primary key column (id by default) in the SELECT list
qs = ModelName.objects.raw("""SELECT id, first_name
FROM myapp_modelname
WHERE last_name = 'van Rossum'""")
for person in qs:
    print(person.first_name)  # Result already available
    print(person.last_name)  # Has to hit the DB again
Note: at the time of writing, raw() is only available in the development version of Django; it should be merged into trunk in time for 1.2.
Complete information is available in the documentation under Performing raw SQL queries.
Think of it this way.
"For common SQL hack-arounds, what was the object-oriented thing I was supposed to be doing in the first place?"
The issue isn't that the ORM is complex. It's that your brain has been warped in the SQL mold, making it hard to see the objects clearly.
General rules:
If you think it's a simple SELECT FROM WHERE, stop. Ask what objects you needed to see in the result set. Then find those objects and work with the object manager.
If you think it's a simple JOIN, stop. Ask what primary object you want. Remember, objects don't use foreign keys. Join doesn't mean anything. An object seems to break 1NF and contain an entire set of related objects within it. Then find the "primary" objects and work with the object manager. Use the related objects queries to find related objects.
If you think it's an OUTER JOIN, stop. Ask what two things you want to see in the result set. An outer join is things which will join UNIONED with things that won't join. What are the things in the first place? Then find the "primary" objects and work with the object manager. Some will have sets of related objects. Some won't.
If you think it's a WHERE EXISTS or WHERE IN with a subquery, your model is probably incomplete. Sometimes, it requires a fancy join. But if you're doing this kind of checking, it usually means you need a property in your model.
If you think you need SELECT DISTINCT, you've missed the boat entirely. That's just a Python set. You simply get the column values into a Python set. Those are the distinct values.
If you think you need a GROUP BY, you're ignoring Python's collections.defaultdict. Using Python to do the GROUP BY is usually faster than fussing around with SQL.
Except for data warehousing. Which you shouldn't be doing in Django. You have to use SQLAlchemy for data warehousing.
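The last two rules can be sketched in plain Python. The rows below are hypothetical (column1, amount) pairs standing in for values you would pull off your model objects; nothing here is Django-specific, and the names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical data: (column1, amount) pairs, as if read from model objects.
rows = [("a", 10), ("b", 5), ("a", 7), ("c", 1)]

# "SELECT DISTINCT column1": collect the values into a set.
distinct_values = {column1 for column1, _ in rows}

# "GROUP BY column1": bucket the amounts per key with a defaultdict.
groups = defaultdict(list)
for column1, amount in rows:
    groups[column1].append(amount)

# "SUM(amount) ... GROUP BY column1": reduce each bucket.
totals = {column1: sum(amounts) for column1, amounts in groups.items()}

print(sorted(distinct_values))
print(totals)
```

This pulls every row into Python before deduplicating or grouping, so whether it beats doing the work in the database depends on how much data is involved; the sketch makes no claim either way.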
A good starting point for doing Django queries is the Django docs themselves.
http://docs.djangoproject.com/en/dev/topics/db/queries/
Here are a few examples:
select * from table
=
ModelName.objects.all()
filtering:
select * from table where column = 'foo'
=
ModelName.objects.filter(column='foo')
Specifically regarding distinct: use the distinct() method of a Django queryset.
Here's the relevant link in the documentation. http://docs.djangoproject.com/en/dev/ref/models/querysets/#distinct
Update: The ORM helps you by allowing you to use Object Oriented interactions with your data. You don't write code that translates the resultset of your query into a set of objects. It does it automatically. That's the fundamental change in thought process you have to make.
You start to think in terms of "I have this object, I need to get all the other objects which are like it." Then you can ask the ORM for those objects: "ORM, I need all the objects of class Product which have a color attribute of 'blue'."
Django's specific ORM language for that is:
products = Product.objects.filter(color='blue')
This is done instead of writing the SQL by hand and then translating the result set into objects yourself.
That's the value in using an ORM. Code simplification and reduced development time.
For your specific how-to, you'd do it like this:
MyModel.objects.values_list('column1', flat=True).distinct()
But other posters are correct to say you shouldn't be thinking 'how do I write this SQL in the ORM'. When you learned Python, coming from Java or C++ or whatever, you soon learned to get out of the mindset of 'how do I write this Java code in Python', and just concentrated on solving the problem using Python. The same should be true of using the ORM.