
How to avoid idle connection timeouts while uploading large files?

Consider our current architecture:

         +---------------+                             
         |    Clients    |                             
         |    (API)      |                             
         +-------+-------+                             
                 ∧                                     
                 ∨                                     
         +-------+-------+    +-----------------------+
         | Load Balancer |    |   Nginx               |
         | (AWS - ELB)   +<-->+   (Service Routing)   |
         +---------------+    +-----------------------+
                                          ∧            
                                          ∨            
                              +-----------------------+
                              |   Nginx               |
                              |   (Backend layer)     |
                              +-----------+-----------+
                                          ∧            
                                          ∨            
         -----------------    +-----------+-----------+
           File Storage       |       Gunicorn        |
           (AWS - S3)     <-->+       (Django)        |
         -----------------    +-----------------------+

When a client, mobile or web, tries to upload large files (more than a GB) to our servers, it often hits idle connection timeouts, either from its client library (on iOS, for example) or from our load balancer.

While the client is actually uploading the file, no timeout occurs because the connection isn't "idle": bytes are being transferred. But I think that once the file has been fully transferred to the Nginx backend layer and Django starts uploading it to S3, the connection between the client and our server becomes idle until that upload completes.

Is there a way to prevent this from happening, and at which layer should I tackle this issue?

Laurent Jalbert Simard asked Sep 21 '16 20:09





1 Answer

I faced the same issue and fixed it by using django-queued-storage on top of django-storages. When a file is received, django-queued-storage creates a Celery task to upload it to the remote storage, such as S3. In the meantime, if anyone accesses the file before it is available on S3, it is served from the local file system. This way, you don't have to wait for the upload to S3 to finish before sending a response back to the client.
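To make the idea concrete, here is a rough, standard-library-only sketch of the pattern described above: save locally, hand the transfer to a background worker, and read from whichever copy is available. It is not django-queued-storage's API. The class name `QueuedStorageSketch`, its methods, and the use of a second directory as a stand-in for the S3 bucket are all assumptions made for illustration, and a thread with a queue replaces the Celery task.

```python
import queue
import shutil
import tempfile
import threading
from pathlib import Path


class QueuedStorageSketch:
    """Hypothetical stand-in: save locally first, copy to 'remote' later."""

    def __init__(self, local_dir, remote_dir):
        self.local_dir = Path(local_dir)
        self.remote_dir = Path(remote_dir)  # assumption: plays the role of S3
        self._tasks = queue.Queue()
        # A daemon thread plays the role of the Celery worker.
        threading.Thread(target=self._worker, daemon=True).start()

    def save(self, name, content):
        """Write to local disk, queue the transfer, and return right away."""
        (self.local_dir / name).write_bytes(content)
        self._tasks.put(name)
        return name

    def open(self, name):
        """Read the remote copy if it exists, otherwise the local one."""
        remote = self.remote_dir / name
        source = remote if remote.exists() else self.local_dir / name
        return source.read_bytes()

    def wait(self):
        """Block until every queued transfer has finished (demo helper)."""
        self._tasks.join()

    def _worker(self):
        while True:
            name = self._tasks.get()
            try:
                # Copy under a temporary name, then rename, so that open()
                # never sees a partially transferred remote file.
                partial = self.remote_dir / (name + ".part")
                shutil.copyfile(self.local_dir / name, partial)
                partial.replace(self.remote_dir / name)
            finally:
                self._tasks.task_done()


local_dir = tempfile.mkdtemp()
remote_dir = tempfile.mkdtemp()
storage = QueuedStorageSketch(local_dir, remote_dir)

storage.save("video.bin", b"x" * 1024)    # returns without waiting for the copy
early_read = storage.open("video.bin")    # works even if the copy is pending
storage.wait()
late_read = storage.open("video.bin")     # now served from the 'remote' copy
```

In the real setup, the view would return its HTTP response right after the equivalent of `save()`, so the client connection is not left idle during the S3 transfer.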

Since your application is behind a load balancer, you might want to use a shared file system such as Amazon EFS with this approach, so that every backend instance can reach the locally stored file.

Aamir Rind answered Oct 01 '22 11:10
