Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Django - Filter queryset by CharField value length

Given that I have a legacy model with a CharField or CharField-based model field like:

class MyModel(models.Model):
    name = models.CharField(max_length=1024, ...)
    ...

I need to make migrations to make it have a max_length of max. 255. First I'm writing a datamigration to make any values with longer than 255 characters adapt the upcoming schemamigration to fix the maximum length of the column, which I will do right after this works.

The issue is that I have a very very large set of data and I know not all the rows contain a value longer than 255 chars for MyModel.name and I would like to consider for my migration only those who do.

Is there any way (with) the django ORM to filter only the objects that meet this condition? Something like:

MyModel.objects.filter(name__len__gte=255)

would be great, but I believe this is not possible, or at least it's not this straightforward.

Does somebody know any way to accomplish this query?

Thanks!

like image 885
Gerard Avatar asked Apr 28 '14 21:04

Gerard


2 Answers

'Recent' Django versions have the built in django.db.models.functions.Length lookup/transform, so you can do:

MyModel.objects.annotate(length=Length('text')).filter(length__gt=254)

See https://docs.djangoproject.com/en/1.11/ref/models/database-functions/#length

Old answer:

I think you have to options:

Using 'extra' on the queryset:

MyModel.objects.extra(where=["CHAR_LENGTH(text) > 254"])

Or abusing Regex lookups, I'm assuming this will be slower:

MyModel.objects.filter(text__regex = r'^.{254}.*')
like image 171
BBT Avatar answered Sep 30 '22 16:09

BBT


If you find that you are using a lot of extra and regex, following @BBT's suggestions, I went ahead and implemented the transform as follows:

# utils.db

from django.db.models import Transform
from django.db.models import CharField

class CharacterLength(Transform):
    lookup_name = 'len'
    def as_sql(self, compiler, connection):
        lhs, params = compiler.compile(self.lhs)
        return "LENGTH(%s)" % lhs, params

CharField.register_lookup(CharacterLength)

Thereafter, I could do a cascaded lookup on "mycolname" as follows:

from utils.db import *

queryset.filter(mycolname__len__gte=10)
like image 35
Lord Mosuma Avatar answered Sep 30 '22 18:09

Lord Mosuma