I have two servers.
The first (Main) is the server which has installed the script and database.
The second (Remote) is the server which has uploaded files only (just for storage).
Now I'm confused about how to upload the files. I have two ideas for doing it, and I don't know whether they are the best approaches or not.
The ideas:
1. Upload the file via AJAX to the main server first and run all security checks there (size, type, and so on), then upload it again to the second (remote) server.
2. Upload the file via AJAX directly to the second (remote) server, run all checks and security operations there, then send the file information to the first (main) server to store it in the database (not recommended).
I also want to know how the big upload sites store the files uploaded by their users: how do they upload the files to remote servers?
Both mechanisms are valid. In fact, we tried both of them in two of our products. Each comes with its own pros and cons.
Assumption: Main server serves the Web UI.
Let's name the two servers main.com and remote.com.
The main factors that you need to consider are:
Cross-Origin Resource Sharing (CORS): The most straightforward way to check whether this is a big issue is whether you need to support IE <= 9. Legacy IE does not support XHR2, so AJAX file upload is not possible there. There are popular fallbacks (e.g. iframe transport), but each comes with its own problems. If the client uploads directly to remote.com, that server also has to send the right CORS headers; see the sketch after this list.
Bandwidth: If the file is uploaded to main.com first and then to remote.com, the required bandwidth is doubled. Depending on whether your servers are behind load balancers, the requirement will vary. Uploading directly to remote.com uses bandwidth efficiently.
API response time: Unless your API does an optimistic update, it needs to wait until the file is fully re-uploaded to remote.com before it can respond reliably. This also depends on the RTT between main.com and remote.com.
Number of API domains: This is a small issue. The web client has to specify which domain to use for which API. However, it does relate to the CORS issue above.
Performance: If the web server is handling file uploads, it might run into performance issues while processing large files (or large quantities of files), which might affect other users.
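As a rough illustration of the CORS point above, here is a minimal sketch, assuming Node.js on remote.com; the allowed origin, port, and /upload handling are placeholders, not part of the original question. It shows the headers remote.com would need to send so a browser served from main.com can upload to it directly.

```typescript
// Sketch only: origin, port, and the /upload handling are assumptions.
import * as http from "node:http";

const server = http.createServer((req, res) => {
  // Allow the Web UI's origin plus the method and headers the upload uses.
  res.setHeader("Access-Control-Allow-Origin", "https://main.com");
  res.setHeader("Access-Control-Allow-Methods", "POST, OPTIONS");
  res.setHeader("Access-Control-Allow-Headers", "Content-Type");

  if (req.method === "OPTIONS") {
    // Preflight request: answer with the headers above and no body.
    res.writeHead(204);
    res.end();
    return;
  }

  // ...receive and validate the file here (size, type, and so on)...
  res.writeHead(200, { "Content-Type": "application/json" });
  res.end(JSON.stringify({ ok: true }));
});

server.listen(8080);
```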
Mechanisms
1. Client uploads to main.com. main.com re-uploads to remote.com.
2. Client uploads to remote.com. remote.com sends info to main.com.
3. [EXTRA] Client uploads to main.com. main.com streams the file to remote.com. remote.com sends file information back.
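For completeness, here is a minimal client-side sketch of the AJAX (XHR2) upload itself; the /upload endpoint and the JSON response shape are assumptions. Pointing uploadUrl at main.com gives you mechanism 1 or 3, pointing it at remote.com gives you mechanism 2.

```typescript
// Minimal browser-side sketch: XHR2 upload with FormData.
// The endpoint path and response shape are placeholders.
function uploadFile(file: File, uploadUrl: string): Promise<unknown> {
  return new Promise((resolve, reject) => {
    const form = new FormData();
    form.append("file", file);

    const xhr = new XMLHttpRequest();
    xhr.open("POST", uploadUrl);

    // XHR2 gives upload progress events, which legacy IE (<= 9) lacks.
    xhr.upload.onprogress = (e) => {
      if (e.lengthComputable) {
        console.log(`uploaded ${Math.round((e.loaded / e.total) * 100)}%`);
      }
    };

    xhr.onload = () => {
      if (xhr.status >= 200 && xhr.status < 300) {
        resolve(JSON.parse(xhr.responseText));
      } else {
        reject(new Error(`upload failed: ${xhr.status}`));
      }
    };
    xhr.onerror = () => reject(new Error("network error"));

    xhr.send(form);
  });
}
```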
Conclusion
Depending on your use case, you need a different mechanism. In our case, we use method 2 (direct upload) for our legacy product because we need to support legacy browsers (IE 7, FF 3). Cross-domain issues keep stabbing us in many different cases (e.g. when customers are behind proxies).
We use method 1 for our new product. Bandwidth and response-time issues are still okay for normal cases, but when the web server and the remote server are deployed across continents, the performance is inferior. We have made many optimizations to make it acceptable, but it is still worse than method 2.
I use method 3 myself in a side project. It is included here because I think it is a good candidate too.
Edit
The difference between streaming (method 3) and re-uploading (method 1) is mainly how the file is stored on main.com, which impacts resource allocation.

For re-uploading, an uploaded 2 GB file is first stored on main.com, then re-uploaded to remote.com. main.com has to allocate resources to temporarily hold the file (disk space, memory, CPU for IO). Also, being a serial process, the total time needed to complete the upload to remote.com is doubled (assuming the time to upload to main.com equals the time to upload to remote.com).
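A hedged sketch of that re-upload flow, assuming Node.js on main.com; the hostnames, temp-file location, and /upload endpoints are placeholders, and real code would also parse the multipart body and run the security checks before forwarding.

```typescript
// Sketch of mechanism 1: store the upload on main.com, then re-upload to remote.com.
import * as http from "node:http";
import * as https from "node:https";
import * as fs from "node:fs";
import * as path from "node:path";
import * as os from "node:os";

const server = http.createServer((req, res) => {
  if (req.method !== "POST" || req.url !== "/upload") {
    res.writeHead(404);
    res.end();
    return;
  }

  // 1. Store the raw request body in a temporary file on main.com.
  //    (Real code would parse the multipart body instead of keeping it raw.)
  const tmpPath = path.join(os.tmpdir(), `upload-${Date.now()}`);
  const tmpFile = fs.createWriteStream(tmpPath);
  req.pipe(tmpFile);

  tmpFile.on("finish", () => {
    // 2. Security checks would go here (size, type, and so on).

    // 3. Re-upload the whole file to remote.com: the serial, bandwidth-doubling step.
    const forward = https.request(
      { host: "remote.com", path: "/upload", method: "POST" },
      (remoteRes) => {
        res.writeHead(remoteRes.statusCode ?? 502);
        remoteRes.pipe(res);              // relay remote.com's answer to the client
        fs.unlink(tmpPath, () => {});     // clean up the temporary copy
      }
    );
    fs.createReadStream(tmpPath).pipe(forward);
  });
});

server.listen(8080);
```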
For streaming, a file being uploaded to main.com is simultaneously uploaded to remote.com. Since main.com forwards each chunk of the file to remote.com as soon as it receives it, the two uploads overlap, resulting in a shorter total upload time (less than double). In other words, if no processing is needed at main.com, main.com is effectively a proxy to remote.com. Also, since the file is never stored as a whole on main.com (chunks are normally held in memory), it does not consume as many resources as re-uploading. However, if main.com needs to process the file as a whole, streaming does not bring much benefit.
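A minimal sketch of the streaming variant under the same assumptions (Node.js on main.com, placeholder hostnames and /upload paths): main.com opens a request to remote.com immediately and pipes each incoming chunk through, never holding the whole file.

```typescript
// Sketch of mechanism 3: main.com streams the upload to remote.com chunk by chunk.
import * as http from "node:http";
import * as https from "node:https";

const server = http.createServer((clientReq, clientRes) => {
  if (clientReq.method !== "POST" || clientReq.url !== "/upload") {
    clientRes.writeHead(404);
    clientRes.end();
    return;
  }

  // Open the request to remote.com immediately so the two transfers overlap in time.
  const remoteReq = https.request(
    {
      host: "remote.com",
      path: "/upload",
      method: "POST",
      headers: {
        "content-type": clientReq.headers["content-type"] ?? "application/octet-stream",
      },
    },
    (remoteRes) => {
      // Relay remote.com's file information (e.g. its id/URL) back to the client.
      clientRes.writeHead(remoteRes.statusCode ?? 502);
      remoteRes.pipe(clientRes);
    }
  );

  clientReq.pipe(remoteReq); // main.com never stores the whole file
});

server.listen(8080);
```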