I have an ArrayField of IntegerFields in my model. I'm looking for the best way to extract the smallest and the largest integer out of this array.
The Django documentation doesn't give examples of anything similar, and I wonder whether the right way is to use an aggregation function on this ArrayField, or to somehow cast it to a normal Python list of integers and use the built-in min and max functions.
Any suggestions for performing this would be helpful. Examples even more.
Python's max and min functions seem to do it.
Model definition:
from django.db import models
from django.contrib.postgres.fields import ArrayField

class TestArrayField(models.Model):
    integers = ArrayField(
        models.IntegerField()
    )
Sample data: two TestArrayField model instances, each with an array of IntegerFields.
This returns the min/max for a given instance of TestArrayField:
from django.http import HttpResponse

def test_array_field(request):
    # grabbing the first one, you could use .filter() instead
    integers_array = TestArrayField.objects.first().integers
    max_integer = max(integers_array)
    min_integer = min(integers_array)
    # printing output..
    print(integers_array)
    print('max_integer:', max_integer)
    print('min_integer:', min_integer)
    return HttpResponse(f'max: {max_integer}, min: {min_integer}')
And here's the output:
[6, 8, 10, 12]
max_integer: 12
min_integer: 6
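If you need the overall min/max across all instances rather than a single one, here is a minimal sketch of the same Python-side approach (values_list is standard Django; the variable names are just for illustration). It pulls every array out of the database, so for large tables the database-side approaches in the other answer are likely cheaper:

# sketch: overall min/max across all rows, computed in Python
all_arrays = TestArrayField.objects.values_list('integers', flat=True)
overall_max = max(max(a) for a in all_arrays)
overall_min = min(min(a) for a in all_arrays)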
This is an example that goes from scratch to a working Django + PostgreSQL setup in reproducible Docker containers, plus the min/max via various approaches, in case no reproducible/helpful example is found elsewhere. I had a similar thing on my mind, so I created it just for fun; feel free to use it if it helps. :)
Starting with Docker Compose, I create two containers, one for Django and one for PostgreSQL, set some defaults such as the password and connectivity, and mount the current working directory over the container filesystem so files can be edited on demand without rebuilding.
Example docker-compose.yml for testing:
version: "3.7"
services:
my-postgres:
image: postgres:alpine
environment:
POSTGRES_PASSWORD: password
django:
build:
context: "."
dockerfile: Dockerfile
environment:
ENGINE: "django.db.backends.postgresql_psycopg2"
USER: postgres
PASSWORD: password
HOST: "my-postgres"
PORT: 5432
NAME: postgres
volumes:
- .:/tmp
For the PostgreSQL client it's necessary to have a compiler (unless you install a pre-built package), headers, and shared objects (libraries). For Alpine you also need musl-dev for limits.h and other headers + shared objects.
The Dockerfile used:
FROM python:alpine
RUN apk add postgresql postgresql-dev gcc musl-dev
WORKDIR /tmp
COPY requirements.txt ./
RUN pip install -r requirements.txt
COPY . ./
WORKDIR /tmp/app
ENTRYPOINT python manage.py runserver
requirements.txt:
asgiref==3.3.1
Django==3.1.6
pytz==2021.1
sqlparse==0.4.1
psycopg2==2.8.6
Tree of the directory in which the django-admin command was issued. That's why you see the app/app structure, which might seem weird: the project was named app in the django-admin command.
.
├── app
│ ├── app
│ │ ├── asgi.py
│ │ ├── __init__.py
│ │ ├── settings.py
│ │ ├── urls.py
│ │ └── wsgi.py
│ ├── manage.py
│ └── pg
│ ├── admin.py
│ ├── apps.py
│ ├── __init__.py
│ ├── migrations
│ │ ├── 0001_initial.py
│ │ └── __init__.py
│ ├── models.py
│ ├── tests.py
│ └── views.py
├── docker-compose.yml
├── Dockerfile
└── requirements.txt
How to run:
docker-compose build
docker-compose up -d
docker-compose logs -f # to check the logs
docker-compose exec django python manage.py shell # to play in REPL
docker-compose exec django sh # to play in shell
Modified files:
settings.py:
# modified
INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'pg',
]

from os import environ

DATABASES = {
    'default': {
        item: environ.get(item)
        for item in ["ENGINE", "NAME", "USER", "PASSWORD", "HOST", "PORT"]
    }
}
# modified end
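For clarity, the dict comprehension above just maps each environment variable set in docker-compose.yml onto the matching DATABASES key; expanded by hand (with the values from the Compose file above, and note that environ.get returns strings) it is equivalent to:

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'postgres',
        'USER': 'postgres',
        'PASSWORD': 'password',
        'HOST': 'my-postgres',
        'PORT': '5432',
    }
}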
The pg application is a separate Django application created from scratch in an empty Django project via the generated manage.py file:
django-admin startproject app
cd app
python manage.py startapp pg
# edit models.py before migrations step (of course)
python manage.py makemigrations pg
python manage.py migrate
pg/models.py:
from django.db import models
from django.contrib.postgres.fields import ArrayField

class MyModel(models.Model):
    array = ArrayField(models.IntegerField())

    def __str__(self):
        return f"<My Model ({self.id})>: {self.array}"
Inserting is the same as with an ordinary Django model:
>>> from pg import models
>>> models.MyModel.objects.all()
<QuerySet []>
>>> models.MyModel.objects.create(array=[1,5,9])
<MyModel: MyModel object (1)>
>>> models.MyModel.objects.create(array=[2,1,0])
<MyModel: MyModel object (2)>
>>> models.MyModel.objects.all()
<QuerySet [<MyModel: <My Model (1)>: [1, 5, 9]>, <MyModel: <My Model (2)>: [2, 1, 0]>]>
Min/max can be done in various ways, depending on whether you want to leverage Python on your machine or the aggregating functions of the database:
>>> # single object min/max, uses Django i.e. your machine
>>> min(models.MyModel.objects.get(id=1).array)
1
>>> max(models.MyModel.objects.get(id=1).array)
9
>>> # sum min/max, same machine
>>> min([sum(item.array) for item in models.MyModel.objects.all()])
3
>>> max([sum(item.array) for item in models.MyModel.objects.all()])
15
Aggregating (note that Min and Max here compare whole arrays against each other, the way Python compares lists, not the items inside a single array):
>>> from django.db.models import Min, Max
>>> models.MyModel.objects.aggregate(Min("array"))
{'array__min': [1, 5, 9]}
>>> models.MyModel.objects.aggregate(Max("array"))
{'array__max': [2, 1, 0]}
Executing over the array items seems rather problematic through Django itself, but with a little bit of a raw query there's no problem (use min() for the minimum):
>>> from django.db.models.expressions import RawSQL
>>> models.MyModel.objects.annotate(max=RawSQL(
"select (select max(x) from unnest(pg_mymodel.array) x)",
params=[]
)).values("max")
<QuerySet [{'max': 9}, {'max': 2}]>
Even with single items:
>>> models.MyModel.objects.filter(id=2).annotate(max=RawSQL(
"select (select max(x) from unnest(pg_mymodel.array) x)",
params=[]
)).values("max")
<QuerySet [{'max': 2}]>
Depending on your use case, it might be cheaper/faster to leverage PostgreSQL to do the heavy lifting (prepare the data), or to pull the raw data from the DB without much transformation and then process it into the shape you desire on your own machine, e.g. via multiprocessing.
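A minimal sketch of that second option, assuming the MyModel from above; reshape and the pool size are placeholders for whatever your real workload needs, and the query runs in the parent process so only plain data crosses the process boundary (run it e.g. as a management command):

from multiprocessing import Pool

from pg import models

def reshape(array):
    # placeholder for your actual heavy per-row computation
    return {"min": min(array), "max": max(array), "sum": sum(array)}

def run():
    # one query pulls the raw arrays out of the DB...
    arrays = list(models.MyModel.objects.values_list("array", flat=True))
    # ...then the per-row work is fanned out across local CPU cores
    with Pool(processes=4) as pool:
        return pool.map(reshape, arrays)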
And if you really dislike even raw queries, you can create your own aggregating Expression. Though sometimes it's just easier/faster to create an SQL view (it can be created via a migration) with pre-made data, build a model for that, and pull the data via the ORM. Use the best tool for the job, but don't spend too much time on it if it's not required for performance.
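As a hint for the custom-expression route, here is a minimal sketch that wraps the same unnest trick from the raw query above into a reusable expression (ArrayMax is a made-up name, not something Django ships):

from django.db.models import Func, IntegerField

class ArrayMax(Func):
    # renders as: (select max(x) from unnest("pg_mymodel"."array") x)
    template = "(select max(x) from unnest(%(expressions)s) x)"
    output_field = IntegerField()

# same result as the RawSQL annotation above
models.MyModel.objects.annotate(max=ArrayMax("array")).values("max")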