
Is InMemoryUploadedFile really "in memory"?

I understand that opening a file just creates a file handle that takes a fixed amount of memory irrespective of the size of the file. Django has a type called InMemoryUploadedFile that represents files uploaded via forms.

I get the handle to my file object inside the Django view like this:

file_object = request.FILES["uploadedfile"]

This file_object has type InMemoryUploadedFile.

Now we can see for ourselves that file_object has the method .read(), which is used to read files into memory.

data = file_object.read()  # named "data" to avoid shadowing the built-in bytes type

Wasn't file_object of type InMemoryUploadedFile already "in memory"?

Pranjal Mittal asked Dec 29 '13 17:12


2 Answers

The read() method on a file object is a way to access the content of that file object, irrespective of whether the file is held in memory or stored on disk. It is similar to other file access methods such as readlines() or seek().

The behavior is similar to that of Python's built-in file objects, whose read() is in turn built on the C library's fread() function. From the Python 2 documentation for file.read([size]):

Read at most size bytes from the file (less if the read hits EOF before obtaining size bytes). If the size argument is negative or omitted, read all data until EOF is reached. The bytes are returned as a string object. An empty string is returned when EOF is encountered immediately. (For certain files, like ttys, it makes sense to continue reading after an EOF is hit.) Note that this method may call the underlying C function fread() more than once in an effort to acquire as close to size bytes as possible. Also note that when in non-blocking mode, less data than was requested may be returned, even if no size parameter was given.
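To make the quoted semantics concrete, here is a minimal sketch using io.BytesIO as a stand-in for a small in-memory upload. It is not Django code, and it assumes Python 3, where a binary stream returns bytes rather than the "string object" the Python 2 text describes.

```python
import io

# An in-memory binary stream standing in for a small uploaded file.
stream = io.BytesIO(b"0123456789")

first = stream.read(4)    # at most 4 bytes are returned
rest = stream.read()      # size omitted: everything up to EOF
at_eof = stream.read(4)   # already at EOF: an empty bytes object

stream.seek(0)            # rewind to read again from the start
short = stream.read(100)  # EOF is hit first, so fewer than 100 bytes come back

print(first, rest, at_eof, len(short))
```

Note that read() copies data out of the stream into a new bytes object, so calling it on an object that already lives in memory still makes sense: it is how you get at the content.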

The question of where exactly an InMemoryUploadedFile is stored is a bit more complicated. The Django documentation says:

Before you save uploaded files, the data needs to be stored somewhere.

By default, if an uploaded file is smaller than 2.5 megabytes, Django will hold the entire contents of the upload in memory. This means that saving the file involves only a read from memory and a write to disk and thus is very fast.

However, if an uploaded file is too large, Django will write the uploaded file to a temporary file stored in your system’s temporary directory. On a Unix-like platform this means you can expect Django to generate a file called something like /tmp/tmpzfp6I6.upload. If an upload is large enough, you can watch this file grow in size as Django streams the data onto disk.

These specifics – 2.5 megabytes; /tmp; etc. – are simply “reasonable defaults”. Read on for details on how you can customize or completely replace upload behavior.
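The threshold in the quote corresponds to Django's FILE_UPLOAD_MAX_MEMORY_SIZE setting (2621440 bytes by default). With the default upload handlers, an upload above that size shows up in request.FILES as a TemporaryUploadedFile rather than an InMemoryUploadedFile. The idea can be sketched with the standard library alone. The helper below, store_upload, and its parameters are hypothetical names made up for this illustration; it is a simplification, not Django's implementation.

```python
import io
import tempfile

# Same value as the 2.5 MB default mentioned in the quote.
# The constant name is an assumption made for this sketch.
MAX_MEMORY_SIZE = 2621440  # bytes


def store_upload(chunks, declared_size, max_memory_size=MAX_MEMORY_SIZE):
    """Return a file-like object holding the uploaded data.

    Small uploads go to an in-memory buffer, larger ones to a temporary
    file on disk. Hypothetical helper, not Django's actual code.
    """
    if declared_size <= max_memory_size:
        target = io.BytesIO()
    else:
        target = tempfile.NamedTemporaryFile(suffix=".upload")
    for chunk in chunks:
        target.write(chunk)
    target.seek(0)  # rewind so the caller can read() from the start
    return target


small = store_upload([b"hello ", b"world"], declared_size=11)
large = store_upload([b"x" * 1024] * 4, declared_size=4096, max_memory_size=1024)

small_is_in_memory = isinstance(small, io.BytesIO)
large_is_in_memory = isinstance(large, io.BytesIO)

# Both objects are read the same way, wherever the data is stored.
small_data = small.read()
large_size = len(large.read())

large.close()  # closing the temporary file also deletes it
print(small_is_in_memory, large_is_in_memory, small_data, large_size)
```

Either way the caller gets back an object with read(), which is why the view code does not need to care which storage was used.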

Pratik Mandrekar answered Nov 09 '22 02:11


One thing to consider is that in Python, file-like objects have an API that is adhered to pretty strictly. They are abstractions over I/O streams, which makes code very flexible: it does not have to worry about where the data is coming from (memory, filesystem, network, etc.).

File-like objects usually define a couple of methods, one of which is read().

I am not sure of the actual implementation of InMemoryUploadedFile, how instances are generated, or where they are stored (I am assuming they are totally in memory, though), but you can rest assured that they are file-like objects and have a read method, because they adhere to the file API.

For the implementation, you could start by checking out the source:
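As a rough illustration of that flexibility, the sketch below hashes any binary file-like object by calling read() in a loop. The function name sha256_of and the sample payload are made up for this example. It is fed an in-memory stream and an on-disk temporary file, and it should work the same way with any other object that offers a compatible read().

```python
import hashlib
import io
import tempfile


def sha256_of(fileobj, chunk_size=64 * 1024):
    """Hash any binary file-like object by reading it in chunks."""
    digest = hashlib.sha256()
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:  # an empty bytes object means EOF
            break
        digest.update(chunk)
    return digest.hexdigest()


payload = b"example upload contents"

# Data held entirely in memory.
memory_hash = sha256_of(io.BytesIO(payload))

# The same data stored in a temporary file on disk.
with tempfile.TemporaryFile() as on_disk:
    on_disk.write(payload)
    on_disk.seek(0)
    disk_hash = sha256_of(on_disk)

print(memory_hash == disk_hash)
```

The function never checks what kind of object it was given; it relies only on read() behaving as the file API promises.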

  • https://github.com/django/django/blob/master/django/core/files/uploadedfile.py#L90
  • https://github.com/django/django/blob/master/django/core/files/base.py
  • https://github.com/django/django/blob/master/django/core/files/uploadhandler.py

dm03514 answered Nov 09 '22 03:11