Django Custom Field: Only run to_python() on values from DB?

How can I ensure that my custom field's *to_python()* method is only called when the data in the field has been loaded from the DB?

I'm trying to use a Custom Field to handle the Base64 Encoding/Decoding of a single model property. Everything appeared to be working correctly until I instantiated a new instance of the model and set this property with its plaintext value...at that point, Django tried to decode the field but failed because it was plaintext.

The allure of the Custom Field implementation was that I thought I could handle 100% of the encoding/decoding logic there, so that no other part of my code ever needed to know about it. What am I doing wrong?

(NOTE: This is just an example to illustrate my problem, I don't need advice on how I should or should not be using Base64 Encoding)

import base64

from django.db import models


def encode(value):
    return base64.b64encode(value)

def decode(value):
    return base64.b64decode(value)


class EncodedField(models.CharField):
    __metaclass__ = models.SubfieldBase

    def __init__(self, max_length, *args, **kwargs):
        # Forward max_length to CharField
        super(EncodedField, self).__init__(*args, max_length=max_length, **kwargs)

    def get_prep_value(self, value):
        return encode(value)

    def to_python(self, value):
        return decode(value)

class Person(models.Model):
    internal_id = EncodedField(max_length=32)

...and it breaks when I do this in the interactive shell. Why is it calling to_python() here?

>>> from myapp.models import *
>>> Person(internal_id="foo")
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/usr/local/lib/python2.6/dist-packages/django/db/models/base.py", line 330, in __init__
    setattr(self, field.attname, val)
  File "/usr/local/lib/python2.6/dist-packages/django/db/models/fields/subclassing.py", line 98, in __set__
    obj.__dict__[self.field.name] = self.field.to_python(value)
  File "../myapp/models.py", line 87, in to_python
    return decode(value)
  File "../myapp/models.py", line 74, in decode
    return base64.b64decode(value)
  File "/usr/lib/python2.6/base64.py", line 76, in b64decode
    raise TypeError(msg)
TypeError: Incorrect padding

I had expected I would be able to do something like this...

>>> from myapp.models import *
>>> obj = Person(internal_id="foo")
>>> obj.internal_id
'foo'
>>> obj.save()
>>> newObj = Person.objects.get(internal_id="foo")
>>> newObj.internal_id
'foo'
>>> newObj.internal_id = "bar"
>>> newObj.internal_id
'bar'
>>> newObj.save()

...what am I doing wrong?

Asked by Adam Levy, Dec 22 '10 14:12


2 Answers

(from http://davidcramer.posterous.com/code/181/custom-fields-in-django.html
and https://docs.djangoproject.com/en/dev/howto/custom-model-fields/#converting-database-values-to-python-objects)

It seems that you need to be able to test whether the value is already an instance of the converted type. The problem is that both forms are the same type here (a plain string vs. a Base64-encoded string). So unless you can determine the difference, I would suggest making sure you always pass an already-encoded value:

Person(internal_id="foo".encode('base64', 'strict'))

or

Person(internal_id=base64.b64encode("foo"))

or some such encoding.
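
The difficulty of telling the two forms apart can be seen without Django. The following is a minimal sketch, written for Python 3 (the question uses Python 2.6); `looks_like_base64` is a hypothetical helper, not a Django or standard-library function. It shows that "try to decode and see whether it fails" is not a reliable test, because some plaintext is also well-formed Base64.

```python
import base64
import binascii


def looks_like_base64(text):
    """Return True if `text` decodes as Base64 without raising an error."""
    try:
        base64.b64decode(text.encode("ascii"), validate=True)
        return True
    except (binascii.Error, UnicodeEncodeError):
        return False


# "foo" is not a multiple of 4 characters long, so decoding fails.
# This is the "Incorrect padding" error from the question.
print(looks_like_base64("foo"))

# "abcd" is ordinary plaintext, yet it is also well-formed Base64,
# so a decode-and-see check would wrongly treat it as encoded.
print(looks_like_base64("abcd"))

# A genuinely encoded value passes as well, just like "abcd" did.
print(looks_like_base64(base64.b64encode(b"foo").decode("ascii")))
```

Because the check cannot separate the second and third cases, encoding explicitly before assignment, as suggested above, avoids the guesswork.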

EDIT: I was looking at https://github.com/django-extensions/django-extensions/blob/f278a9d91501933c7d51dffc2ec30341a1872a18/django_extensions/db/fields/encrypted.py and thought you could do something similar.
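
As for why to_python() runs at all when the model is instantiated: the traceback in the question shows a `__set__` method in django/db/models/fields/subclassing.py calling `self.field.to_python(value)` on assignment. The sketch below imitates that behavior in plain Python 3 with no Django involved. `ConvertOnSet`, `FakePerson` and `recording_to_python` are hypothetical names made up for illustration, and this is only an approximation of the mechanism, not Django's actual code.

```python
class ConvertOnSet(object):
    """Descriptor that runs a converter on every assignment."""

    def __init__(self, name, to_python):
        self.name = name
        self.to_python = to_python

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__[self.name]

    def __set__(self, obj, value):
        # Same shape as the line shown in the traceback:
        # obj.__dict__[self.field.name] = self.field.to_python(value)
        obj.__dict__[self.name] = self.to_python(value)


calls = []


def recording_to_python(value):
    # Record each value the converter sees, then pass it through.
    calls.append(value)
    return value


class FakePerson(object):
    internal_id = ConvertOnSet("internal_id", recording_to_python)

    def __init__(self, internal_id):
        # Assignment in the constructor goes through __set__ too.
        self.internal_id = internal_id


person = FakePerson(internal_id="foo")
person.internal_id = "bar"
print(calls)
```

The converter sees both the constructor value and the later assignment, which is why a to_python() that assumes encoded input fails on plaintext. Later Django releases (1.8 and up) added Field.from_db_value(), which is documented as being called only for values loaded from the database, and SubfieldBase was removed afterwards, so on those versions the decoding could live there instead.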

Answered by James Khoury, Nov 20 '22 09:11


I have the exact same problem, but with JSON data. I want to store data in the database in JSON format. However, if you try to store an already-serialized JSON string, it comes back deserialized, so what goes in is not always what comes out. In particular, if you try to store a number as a string, it is returned as an int or float, because to_python deserializes it before it is stored.

The solution is simple, albeit not too elegant: store the "type" of the data along with the data. In my case it is JSON data, so I prefix it with "json:", and thus you can always tell whether the data is coming from the database.

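# Methods of the custom field class (requires "import json")
def get_prep_value(self, value):
    if value is not None:
        # Tag the serialized value so to_python can recognize it later
        value = "json:" + json.dumps(value)
    return value

def to_python(self, value):
    # Only values carrying the "json:" tag came from get_prep_value
    if isinstance(value, basestring) and value.startswith("json:"):
        value = value[5:]
        try:
            return json.loads(value)
        except ValueError:
            # If decoding failed, just return the plain value
            return value
    else:
        return value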

That being said, it's annoying that you can't expect consistent behavior, or that you can't tell whether to_python is operating on a user set value or a value from the DB.
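
The same marker idea can be applied to the Base64 field from the question. Below is a minimal sketch in Python 3, written as plain functions so it runs without Django. The `MARKER` prefix "b64:" is a hypothetical choice, and in a real field these would be the `get_prep_value` and `to_python` methods.

```python
import base64

MARKER = "b64:"  # hypothetical prefix, not part of Django


def get_prep_value(value):
    """Encode a plaintext string and tag it before it is written to the DB."""
    if value is None:
        return None
    encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
    return MARKER + encoded


def to_python(value):
    """Decode only tagged values; pass anything else through unchanged."""
    if isinstance(value, str) and value.startswith(MARKER):
        return base64.b64decode(value[len(MARKER):]).decode("utf-8")
    return value


stored = get_prep_value("foo")   # what would be written to the database
print(stored)
print(to_python(stored))         # a value loaded from the database
print(to_python("foo"))          # a plaintext value set by user code
```

Two caveats with this sketch: plaintext that itself starts with "b64:" would be mistaken for stored data, and the column's max_length has to leave room for both the prefix and the Base64 expansion.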

Answered by zeraien, Nov 20 '22 09:11