
Passing an input to a service and saving the result to DB in Django

Tags:

python

django

I have a Django application that involves a model with five fields. In the case of one of these fields, I want users to enter a bunch of text, which I then want to submit to a service (via a function call) and save the result. To provide a visual representation:

[image not available]

How does one best approach this? One option would be to override the save() function, but the types are different: I want the form to show a models.TextField field but store the value the way a URLField would be stored. Equally, when the value is displayed, I want the user to edit not the URL but the text retrieved from that URL.

Asked by Chris vCB, Apr 10 '16 10:04



1 Answer

I think there is no easy standard way to solve your problem. (This is why I don't provide any code, treat this as a long comment instead of a solution. The update provides sources for one of the solutions discussed in this answer.) There are only solutions with pros and cons depending on your circumstances.

Problems:

  1. Asynchronous processing: You are accessing an external service and the request may take some time to complete. For this reason this operation should be done asynchronously (both storing and retrieving data from the external service). The problem is that django wasn't really designed for async tasks; there are only hacky async solutions for django.

    • By accessing the external service directly in a non-async way from your django backend your site may hang up for long periods when the external service fails (in worst case with long timeout settings at different points between your server and the external servers). For a not too serious site we can assume that the external service will work in most of the cases and if not, then a little bit of downtime for our servers is acceptable.

    • Doing async processing with a django backend is not only hacky and messy but a really good solution is sometimes nearly impossible.

      By a really good solution I mean something that looks like what you would write with tools/frameworks designed for async servers (golang, gevent). Typical async solutions with django often involve very complex architecture and code compared to a solution with pure golang or gevent server code.

      For example, if you have to serve the client some data that you retrieve from an external service, and the response from that service arrives with high latency, then your django backend has to wait for it anyway. If you do the waiting in an "async-helper" server (celery, twisted, or gevent) then you may still have to write messy django response handling code with polling, long polling or websockets. The resulting code is super bloated and messy compared to pure async server code. It is better to leave django out of this game completely and solve the problem with direct communication between the client and your async helper. (Update: It is possible to combine gevent+django to make django async-ish, but I haven't yet had the opportunity to try it at large scale to find out how solid that solution is, and it isn't the way most people use django.)

  2. Server-side validation of the text behind the URL, hiding the URL from the client, etc. You may or may not need these, and they can influence which solution you choose.
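
To make the blocking concern from point 1 concrete, here is a minimal sketch, using only the standard library, of bounding how long a request handler waits for a slow call. The names `call_with_deadline` and `slow_service` are hypothetical, and the worker thread keeps running after the deadline, so this only limits the wait and does not cancel the work.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

# Hypothetical shared pool; a real deployment would size this deliberately.
_pool = ThreadPoolExecutor(max_workers=4)


def call_with_deadline(fn, *args, timeout=2.0, fallback=None):
    """Run fn(*args) in a worker thread and wait at most `timeout` seconds.

    Returns `fallback` if the call has not finished in time. The worker
    thread is not cancelled; it keeps running in the background.
    """
    future = _pool.submit(fn, *args)
    try:
        return future.result(timeout=timeout)
    except FutureTimeout:
        return fallback


def slow_service(text):
    """Stand-in for the external service (hypothetical)."""
    time.sleep(0.3)
    return 'key-for-' + text


if __name__ == '__main__':
    # Gives up before the service answers and returns the fallback.
    print(call_with_deadline(slow_service, 'hello', timeout=0.05))
    # Waits long enough to receive the result.
    print(call_with_deadline(slow_service, 'hello', timeout=1.0))
```

This keeps a single request from hanging indefinitely, but the caller still has to decide what to show the user when the fallback is returned, which is where the polling or websocket code mentioned above tends to creep in.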

At which layer should you put the meat of your solution?

Possible choices:

  1. Client layer
  2. View layer
  3. Form layer
  4. Model layer

1. Client layer

If it isn't a problem that your client knows about the external service and directly communicates with it then this is probably the simplest solution. Implementing async access to the external service is straightforward in javascript. The backend just provides a url that can be read and updated by the client and the rest of the job can be done in javascript.

2. View layer

I think putting the management of this whole thing in the view layer isn't a good idea. People tend to stuff everything into the view for some reason (perhaps because it seems to be the "easiest point of attack" when adding new code). In my opinion the view shouldn't do much more than parse the input from the request for the business logic (validating it with or without forms) and then format the response for the client. Ideally the view contains only glue code that makes use of other facilities (like forms, your own business logic in another module, etc.).

3. Form layer

When you implement validation of client requests the form layer is usually the first place to consider. The job of the form is to transform and validate data between the client and the business logic. A very common special case of business logic simply validates the incoming data and saves it to the DB. To simplify this special case, ModelForm comes to the rescue and generates a form skeleton from a model, which can be customized with some minimal logic that validates the data. However, in other cases the client data and the model layout may differ and there can be complex business logic between the form layer and the model layer; in that case ModelForm isn't useful.

Validation in form layer VS view layer

It is better to put validation in forms than in the view because forms are more reusable. You can provide your own reusable validator utility functions/base classes, but forms exist exactly for this purpose as a specialized standard solution. With forms, people implement validation in a more or less standard and controlled way that you would expect. When custom logic goes into views and validator utility functions/base classes, people generate a lot of messy code for you to read, and these code snippets always look different. Forms can often be written in a declarative way, which lowers the chances of introducing bugs and makes reading code easier because you know what to expect.

Validation in form layer VS model layer

Besides the specialized and declarative nature of forms there is another reason why the form layer is a better place for validation than the model layer: forms are a higher-level layer and validation is also often a high-level operation. It may seem a good idea to put fancy validation and tricky logic into your db models' save method and pre_save signal until you have to access and/or fix the raw DB directly for some reason. It can also cause a lot of undesired conflicts between your features that use the models. For practical reasons it is good to have a low-level layer that can be accessed without triggering fancy higher-level logic.

Your solution as a form field

In the case of the form layer you often put your code into a form class to provide a solution that can be reused between several views. However, in some special cases it is possible and better to go a bit lower level and provide your solution as a form field when reusability matters. This way your solution will be reusable between form classes, which is better than reusability between views. Your problem is a special case that can be solved with a form field. Your form field should provide a textarea for the client and it should convert the incoming client text into a url using pastebin (and vice versa). This way the form field could be used together with a URL model field. You could even create a specialized URLField that uses your form field as its default form field when someone generates a form from a model with ModelForm.
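
The two conversions such a form field has to perform can be sketched without Django. This is a minimal illustration, not Django's API: `FakePasteService` is a hypothetical in-memory stand-in for pastebin, and `text_to_url` / `url_to_text` are assumed names for the roles that `to_python` and `prepare_value` would play in a real form field.

```python
import itertools


class FakePasteService:
    """Hypothetical in-memory stand-in for an external paste service."""

    base_url = 'http://paste.example/'

    def __init__(self):
        self._pastes = {}
        self._ids = itertools.count(1)

    def create(self, text):
        # Store the text under a fresh key and return the URL pointing at it.
        key = 'p{}'.format(next(self._ids))
        self._pastes[key] = text
        return self.base_url + key

    def fetch(self, url):
        # Strip the base URL to recover the key, then look the text up.
        return self._pastes[url[len(self.base_url):]]


class TextUrlConverter:
    """Converts between the text the user edits and the URL that is stored."""

    def __init__(self, service):
        self.service = service

    def text_to_url(self, text):
        # Incoming direction: form submission -> value saved in the URL column.
        return self.service.create(text)

    def url_to_text(self, url):
        # Outgoing direction: stored URL -> text shown in the textarea.
        if not url:
            return ''
        return self.service.fetch(url)


if __name__ == '__main__':
    converter = TextUrlConverter(FakePasteService())
    stored = converter.text_to_url('some long text')
    print(stored)
    print(converter.url_to_text(stored))
```

In this sketch every submission creates a new paste, so an edit produces a new URL instead of updating the old one.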

4. Model layer

Your solution can also be implemented as a custom model field that provides its own less-meaty form field. This is somewhat more complicated than implementing the form field solution, and you have to touch lower-level layers of django, which usually isn't recommended. Your code will be more fragile and sensitive to django version updates. Writing a custom model field can be a real pain in the ass. I've done it once and I didn't like the journey. (When I had to deal with making the field compatible with south migrations... :-P) There is a "Writing custom model fields" page in the django documentation and even that doc states that writing them isn't straightforward and recommends reading the source code of existing fields for inspiration.

In the case of the model field solution I think the pastebin model field would be very similar to the standard FileField. Both of these fields basically store the value in external storage and store only an id in the database. Instead of writing a completely separate model field we could subclass and specialize FileField. We could replace its default storage with a pastebin storage, and the associated default form field and widget should also be customized.


Update

We have discussed 3 possible solutions:

  1. Client layer implementation: The backend provides a URLField and the client uses the url to directly communicate with the external service.
  2. Form layer implementation: Using a custom form field that provides a Textarea widget and converts back-and-forth between the URLField of the model and the text area of the frontend by accessing the external service.
  3. Model layer implementation: Providing a FileField-like model field or customizing FileField itself.

The first one is quite straightforward to implement for someone with some javascript knowledge. Of the second and third solutions, I think the third one is more complex, so let's see a possible implementation of it. Key points:

  • I've tried to subclass FileField in a minimally invasive way to make the solution less fragile. The subclassing changes the default storage and the default form field associated with FileField (and adds an optional caching optimization without which the solution would still work).
  • The implementation contains a pastebin-specialized storage for our specialized FileField.
  • The custom form field converts between the text of a textarea and the contents of a pastebin file of our model field.
  • I've tried this code with success using django 1.8 and python 3.5.1 and I don't plan to support any versions; this code is here just to provide a guideline. Use it at your own risk. If it cuts down your limbs, it is your problem. :-D

Files

pastebin/storage.py:

import io
import requests
from django.core.files import File
from django.core.files.storage import Storage
from django.conf import settings
from django.utils.deconstruct import deconstructible


@deconstructible
class PastebinStorage(Storage):
    def __init__(self, options=None):
        """ The options parameter should be a dict of pastebin parameters for
        the 'create new paste' operation: see http://pastebin.com/api#2
        The most important required option is api_dev_key. Optionally you can
        set api_user_key if you want to create non-guest pastes. """
        # Copy the settings dict so that per-instance options never mutate
        # the shared PASTEBIN_STORAGE_OPTIONS object.
        self.__options = dict(getattr(settings, 'PASTEBIN_STORAGE_OPTIONS', {}))
        if options is not None:
            self.__options.update(options)

    @property
    def options(self):
        if 'api_dev_key' not in self.__options:
            raise ValueError('The "api_dev_key" option is missing')
        return self.__options

    def _save(self, name, content):
        # TODO: allow overriding the options on a per-file basis. Maybe we
        # should encode options into the name since we don't use it and
        # we return a completely new name/id at the end of this method.
        data = self.options.copy()
        data.update(
            api_option='paste',
            api_paste_code=content.read(),
        )
        response = requests.post('http://pastebin.com/api/api_post.php', data=data)
        response.raise_for_status()
        # A successful response contains something like: http://pastebin.com/<PASTE_KEY>
        return response.text[response.text.rfind('/')+1:]

    def _open(self, name, mode='rb'):
        if mode != 'rb':
            raise ValueError('Currently the only supported mode is "rb"')

        if 'api_user_key' in self.options:
            content = self._get_user_paste(name)
        else:
            content = self._get_public_paste(name)

        mem_stream = io.StringIO(content)
        mem_stream.name = name
        mem_stream.mode = mode
        return File(mem_stream)

    def _get_user_paste(self, name):
        response = requests.post('http://pastebin.com/api/api_raw.php', data=dict(
            api_dev_key=self.options['api_dev_key'],
            api_user_key=self.options['api_user_key'],
            api_option='show_paste',
            api_paste_key=name,
        ))
        # FIXME: Unfortunately the API seems to return status_code 200
        # also in case of errors with messages like "Bad API request,
        # invalid permission to view this paste or invalid api_paste_key"
        # in the body.
        response.raise_for_status()
        return response.text

    def _get_public_paste(self, name):
        response = requests.get('http://pastebin.com/raw/' + name)
        response.raise_for_status()
        return response.text

    def get_valid_name(self, name):
        return name

    def get_available_name(self, name, max_length=None):
        return name
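
The FIXME in `_get_user_paste` points out that the API may answer with status 200 even when the request failed. One possible workaround, sketched below, is to inspect the response body. It assumes that error bodies start with the text "Bad API request", as in the message quoted in that comment; `PastebinError`, `check_pastebin_body` and `paste_key_from_response` are hypothetical helpers and not part of the code above.

```python
class PastebinError(Exception):
    """Raised when the response body looks like an API error (hypothetical)."""


# Assumption: error bodies begin with this text, as in the FIXME comment.
ERROR_PREFIX = 'Bad API request'


def check_pastebin_body(body):
    """Return the body unchanged, or raise if it looks like an error message."""
    if body.strip().startswith(ERROR_PREFIX):
        raise PastebinError(body.strip())
    return body


def paste_key_from_response(body):
    """Extract <PASTE_KEY> from a body like http://pastebin.com/<PASTE_KEY>."""
    url = check_pastebin_body(body).strip()
    key = url.rsplit('/', 1)[-1]
    # Reject bodies that are not URL-shaped or that end with a bare slash.
    if not key or '/' not in url:
        raise PastebinError('Unexpected response: ' + url)
    return key
```

A helper like `paste_key_from_response` could stand in for the slicing at the end of `_save`, so that an error message is never stored as a paste key. Note that a paste whose own text begins with "Bad API request" would be misreported as an error by `check_pastebin_body`.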

pastebin/model_field.py:

from django.db.models import FileField
from django.db.models.fields.files import FieldFile

from .storage import PastebinStorage
from .form_field import PastebinFormField


default_storage = PastebinStorage()


# This custom FieldFile implementation is an optional optimization.
class PastebinContentCachingFieldFile(FieldFile):
    def cached_get_pastebin_content(self):
        cached = getattr(self, '_cached_pastebin_content', None)
        if cached and cached[0] == self.name:
            return cached[1]
        with self.storage.open(self.name) as f:
            content = f.read()
        setattr(self, '_cached_pastebin_content', (self.name, content))
        return content


class PastebinModelField(FileField):
    attr_class = PastebinContentCachingFieldFile

    def __init__(self, verbose_name=None, name=None, storage=None, **kwargs):
        storage = storage or default_storage
        super(PastebinModelField, self).__init__(verbose_name, name, storage=storage, **kwargs)

    def formfield(self, **kwargs):
        defaults = {'form_class': PastebinFormField}
        defaults.update(kwargs)
        return super(PastebinModelField, self).formfield(**defaults)

pastebin/form_field.py:

import io

from django.core.files import File
from django.forms import Textarea, CharField


class PastebinFormField(CharField):
    widget = Textarea

    def prepare_value(self, value):
        if value is None or isinstance(value, str):
            return value
        # value is expected to be a PastebinContentCachingFieldFile instance
        return value.cached_get_pastebin_content()

    def to_python(self, data):
        data = super(PastebinFormField, self).to_python(data)
        if data is not None:
            mem_stream = io.StringIO(data)
            mem_stream.name = 'unused'
            mem_stream.mode = 'rb'
            data = File(mem_stream)
        return data

Usage

The storage uses the requests library: pip install requests.

Optionally provide some default pastebin storage settings in the central django settings file. A minimal example can be something like:

PASTEBIN_STORAGE_OPTIONS = {
    'api_dev_key': '<your_api_dev_key>',
}

If you provide the pastebin config in the django settings file then using the pastebin FileField in your models is as simple as:

class MyModel(models.Model):
    file = PastebinModelField()

If you don't specify anything in django settings or if you want to override the django settings you can do that per-field:

class MyModel(models.Model):
    file = PastebinModelField(storage=PastebinStorage(options=dict(
        api_dev_key='<your_dev_key>',
        api_user_key='<your_user_key>',
        ...
    )))

ModelForm automatically generates a textarea with the contents of the pastebin file for PastebinModelField. Saving with ModelForm creates a new pastebin file for PastebinModelField.

Answered by pasztorpisti, Sep 22 '22 18:09