Does the HTTP/WebDAV spec allow this client-server dialog?
Note: This PUT is an initial upload. It is not an update.
If this is possible, much faster file syncing could be implemented.
Use case: The WebDAV server hosts a directory for each user. The favorite video foo.mkv gets uploaded by several users. In this example the favorite video is already stored at this location: /user2/myfoo.mkv. The second and following uploads don't need to send any data, since the server already knows the content. This would reduce a lot of network load.
Preconditions:
It would be very easy to implement this in a custom client and server. But that's not what I want.
My question: Is there an RFC or other standard that allows such a dialog?
If there is no standard yet, how should I proceed to make this dream come true?
Security consideration
With the above dialog, it would be possible to access the content behind known hashes. Example: an evil client knows that there is a file with the hash sum 1234567... It could perform the two steps above and then use a GET to download the data.
A way around this is to extend the dialog:
abcde...
How to get this done?
Since it seems that there is no spec yet, this part of the question remains:
How should I proceed to make this dream come true?
From what you described, it seems like ETags should be used.
They were specifically designed to associate a tag (often an MD5 hash, but it can be anything) with a resource's content (and/or location) so you can later tell whether the resource has changed.
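As a minimal sketch of the "tag derived from content" idea, the snippet below computes a quoted MD5 hex digest over a byte stream. The function name `etag_for_stream` and the choice of MD5 in quotes are assumptions for illustration; a real server is free to generate ETags any way it likes.

```python
import hashlib
import io


def etag_for_stream(stream, chunk_size=64 * 1024):
    """Return a quoted MD5 hex digest of everything read from `stream`.

    Hypothetical helper: the ETag format is an assumption, not something
    HTTP or WebDAV prescribes.
    """
    digest = hashlib.md5()
    # Read in chunks so large files do not have to fit in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return '"' + digest.hexdigest() + '"'


# Identical content yields the same tag, whatever the file is called.
first = etag_for_stream(io.BytesIO(b"same video bytes"))
second = etag_for_stream(io.BytesIO(b"same video bytes"))
print(first == second)
```

For a file on disk, the same function could be called with a handle opened in binary mode.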
ETags are supported on PUT requests and are commonly used with the If-Match header for optimistic concurrency control.
However, your use case is slightly different: you are trying to prevent a PUT when a resource with the same content already exists, whereas the If-Match header is used to allow the PUT only if the resource still has the same content.
In your case, you can instead use the If-None-Match header:
The meaning of "If-None-Match: *" is that the method MUST NOT be performed if the representation selected by the origin server (or by a cache, possibly using the Vary mechanism, see section 14.44) exists, and SHOULD be performed if the representation does not exist. This feature is intended to be useful in preventing races between PUT operations.
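Read literally, the quoted rule concerns the representation at the request's target. A rough sketch of that standard check follows; `if_none_match_allows` is a hypothetical name, and the comma splitting and exact string comparison are simplifications rather than a full header parser.

```python
def if_none_match_allows(header_value, current_etag):
    """Return True if the method may proceed under If-None-Match.

    header_value: raw header string, e.g. '*' or '"abc", "def"'.
    current_etag: ETag of the existing representation, or None if the
                  target resource does not exist.
    """
    if current_etag is None:
        # Nothing exists at the target, so nothing can match.
        return True
    value = header_value.strip()
    if value == "*":
        # "*" matches any existing representation.
        return False
    # Simplification: split on commas and compare the tags as plain strings.
    listed = {tag.strip() for tag in value.split(",")}
    return current_etag not in listed


print(if_none_match_allows("*", None))
print(if_none_match_allows('"abc", "def"', '"def"'))
```

When the check returns False for a PUT, the server would answer 412 Precondition Failed instead of performing the method.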
WebDAV also supports ETags, though how they are used may depend on the implementation:
Note that the meaning of an ETag in a PUT response is not clearly defined either in this document or in RFC 2616 (i.e., whether the ETag means that the resource is octet-for-octet equivalent to the body of the PUT request, or whether the server could have made minor changes in the formatting or content of the document upon storage). This is an HTTP issue, not purely a WebDAV issue.
If you are implementing your own client, I would do something like this:
ETag
If-None-Match
header
From your updated question, it now seems clear that when a PUT request is received, you want to check ALL resources on the server for the absence of the same content before the request is accepted. That means also checking resources in a different location than the one specified as the destination of the PUT request.
AFAIK, there's no existing spec that specifically handles this case. However, the ETag mechanism (and the HTTP protocol) was designed to be generic and flexible enough to handle many cases, and this is one of them.
Of course, this just means you can't take advantage of standard HTTP server logic -- you'd need to write custom code for both the client and the server side.
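To make the custom server side concrete, here is a small in-memory sketch in which If-None-Match is checked against every stored resource rather than only the PUT target. `DedupStore`, `md5_etag`, and the `read_body` callable are hypothetical names, and the status codes (412 when the content is already known, 201 otherwise) are one possible choice. What the server then does for the new path when the content already exists is left open here.

```python
import hashlib


def md5_etag(data):
    """Quoted MD5 hex digest, used here as the ETag (an assumption)."""
    return '"' + hashlib.md5(data).hexdigest() + '"'


class DedupStore:
    """In-memory stand-in for a server that indexes content by ETag."""

    def __init__(self):
        self.content_by_etag = {}  # ETag -> stored bytes
        self.etag_by_path = {}     # resource path -> ETag

    def handle_put(self, path, if_none_match, read_body):
        # Non-standard step: compare against ALL resources, not just `path`.
        if if_none_match is not None and if_none_match in self.content_by_etag:
            return 412  # Precondition Failed; the body is never read
        data = read_body()
        etag = md5_etag(data)
        self.content_by_etag[etag] = data
        self.etag_by_path[path] = etag
        return 201  # Created


store = DedupStore()
video = b"fake video bytes"
print(store.handle_put("/user2/myfoo.mkv", md5_etag(video), lambda: video))  # 201
print(store.handle_put("/user1/foo.mkv", md5_etag(video), lambda: video))    # 412
```

Passing the body as a callable mirrors the point of the exercise: when the ETag is already known, the server can answer without consuming the upload.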
Before I get into possible implementations, there are some assumptions that need to be made.
The implementations below are ordered from simplest to most complex, in case the simpler ones don't work for you.
This assumes your server implementation allows you to read the request headers and respond before the entire request is received.
Send the PUT request with an If-None-Match header containing the ETag and continue sending the body normally.
This is slightly more complex, but better adheres to the HTTP spec. Also, this MIGHT work if your server architecture doesn't allow you to read the headers before the entire request is received.
Send the PUT request with an If-None-Match header containing the ETag and an Expect: 100-continue header. The request body is NOT yet sent at this point.
This implementation probably requires the most work but should be broadly compatible with all major libraries / architectures. There's a small risk of another client uploading a file with the same contents in between the two requests though.
Send a request to /check-etag/<etag>, where <etag> is the ETag. This checks whether the ETag already exists at the server.
The server code behind /check-etag/* checks to see if a resource with that ETag already exists.
Although the implementation is up to you, here are some points to consider:
If an origin server receives a request that does not include an Expect request-header field with the "100-continue" expectation, the request includes a request body, and the server responds with a final status code before reading the entire request body from the transport connection, then the server SHOULD NOT close the transport connection until it has read the entire request, or until the client closes the connection. Otherwise, the client might not reliably receive the response message.
Also, DO NOT close the connection from the server side without sending any status codes, as the client will most likely retry the request:
If an HTTP/1.1 client sends a request which includes a request body, but which does not include an Expect request-header field with the "100-continue" expectation, and if the client is not directly connected to an HTTP/1.1 origin server, and if the client sees the connection close before receiving any status from the server, the client SHOULD retry the request.
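The two-request variant described earlier (a check against /check-etag/<etag> followed by the PUT) could look roughly like this on the client side. Everything beyond the /check-etag/<etag> path is an assumption: the `send` callable stands in for whatever HTTP library is used, HEAD is just one plausible method for the check, and 200 and 404 are assumed to mean "exists" and "does not exist".

```python
import hashlib


def upload_if_absent(send, path, data):
    """Upload `data` to `path` unless the server already knows the content.

    `send(method, url, headers, body)` is a hypothetical transport callable
    that returns the integer HTTP status code.
    """
    hex_digest = hashlib.md5(data).hexdigest()
    etag = '"' + hex_digest + '"'

    # Request 1: ask whether a resource with this ETag exists anywhere.
    if send("HEAD", "/check-etag/" + hex_digest, {}, None) == 200:
        return "skipped"

    # Request 2: the PUT still carries If-None-Match, so a copy uploaded by
    # another client between the two requests is caught by the server.
    status = send("PUT", path, {"If-None-Match": etag}, data)
    if status == 412:
        return "skipped"
    if status in (200, 201, 204):
        return "uploaded"
    raise RuntimeError("unexpected status %d" % status)
```

Keeping the If-None-Match header on the second request is what narrows the race mentioned for this variant; the check request alone cannot rule it out.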