Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to use Django ImageField, and why use it at all?

Up until now, I've been storing my image filenames in a CharField and saving the actual file directly to S3. This was a fine solution for my own usage. I'd like to reconsider using an ImageField, since now there will be other users and file input validation would be appropriate.

I have a couple of questions that weren't exactly answered after reading the docs and the source code for FileField (which appears to be essentially ImageField minus the Pillow check and dimension field updating functionality).

1) Why use an ImageField at all? Or rather, why use a FileField? Sure, it's convenient for quick-and-easy forms and convenient for inserting to Django templates. But are there any substantial reasons, eg. Is it evidently secured against exploits and malicious uploads?

2) How to write to the field file? If it is correct that the file can be read by instance.imagefield (or is it instance.imagefield.file?), if I want to write to it can I simply do the following?

@receiver(pre_save, sender=Image)
def pre_save_image(sender, instance, *args, **kwargs):
    instance.imagefield = process_image(instance.imagefield)

3) How to try saving with a specific filename, then try again with a new filename if that randomly generated filename already exists? For example with my code right now I do this, how can it be done with ImageField? I want to do it at the model layer, because if I do repeated tries at the view layer then the pre_save processing would run again which is ghetto (even though it's unlikely that it'll have a second try ever in the lifetime of the service).

for i in range(tries):
    try:
        name = generate_random_name()
        media_storage.save(name + '.jpg', ContentFile(final_bytes))
        break
    except:
        pass

4) In the models.py pre_save and post_save signals and in the actual model's save(), how can I tell if a file came in with the request? i.e. I want to know if a new image is incoming to be saved, or if there is no image (some other field in the object is being updated and the image itself remains unchanged).

like image 803
davidtgq Avatar asked May 20 '16 01:05

davidtgq


People also ask

What is ImageField Django?

ImageField is a FileField with uploads restricted to image formats only. Before uploading files, one needs to specify a lot of settings so that file is securely saved and can be retrieved in a convenient manner. The default form widget for this field is a ClearableFileInput.

How do I style an ImageField in Django?

from django import forms class ImageForm(forms. Form): image = forms. ImageField( widget = forms. FileInput( attrs = {"id" : "image_field" , # you can access your image field with this id for css styling . , style = "height: 100px ; width : 100px ; " # add style from here . } ) ....

How do I store images in Django?

In Django, a default database is automatically created for you. All you have to do is add the tables called models. The upload_to tells Django to store the photo in a directory called pics under the media directory. The list_display list tells Django admin to display its contents in the admin dashboard.


1 Answers

I don't see any advantage of FileField or ImageField over what you are doing today. In fact, as I see it, the proper/modern/scalable way to deal with uploads is to have the client (browser) upload files directly to S3.

If done correctly (from a security stand point), this scheme allows you to scale in an incredible way without the need to add more computer power on your side. As an example, consider 100 people uploading a picture at the same time. Your server will need to receive all these data, only to upload it again to S3. On the other side, you can have a 1000 people upload at the same time, and I can assure you AWS can handle it. Your server only needs to handle the signing of the URL, which is a lot less work.

Take a look at fine-uploader, as a good technology to use to handle the efficient upload to s3 (loading in chunks, error checking, etc): http://docs.fineuploader.com/endpoint_handlers/amazon-s3.html. Google "django fineuploader" to find a sample application for Django.

In my case, I use a Model with a couple CharFields (bucket, key) plus a few other things specific to my application. My data flow is as follows:

  • Django services a page with the fine-uploader widget, configured based on my settings.
  • Fineuploader requests a signed URL from the django server (endpoint), and uses that to upload to S3 directly.
  • When the upload is complete, fineUploader makes another request to my server to register the completion of the upload, at which time, I create my object on the database. In this case, if the upload fails, I never create an object on the database.
  • On the AWS side, S3 triggers a Lambda function, which I use to create a thumbnail, and store it back to S3. So, I don't even use my own CPU (e.g. Celery) for resizing. So you see, not only can I have thousands of users uploading at the same time, but I can resize those thousand pictures in parallel, and for less than what an EC2 worker will cost me.
  • My Django Model is also used as a wrapper to manage the business logic (e.g. functions like get_original_url() and get_thumbnail_url()), so after the uploads, it is easy for my templates to get the signed read-onlly URLs.

In short, you can implement your own version of Fineuploader if you want, or use many of the alternative, but assuming you follow the recommended security best practices on the AWS side (e.g. create a special IAM with only write permission for the client, even if you are using signed URLs), this, IMO, is the best practice for dealing with uploads, especially if you are using S3 or similar to store these files.

Sorry if I am only really answering question 1, but questions 2 and 3 don't apply if you accept my answer for 1.

like image 136
dkarchmer Avatar answered Sep 20 '22 23:09

dkarchmer