I'd like to know some kind of file checksum (like SHA-256 hash, or anything else) when I start downloading file from HTTP server. It could be transferred as one of HTTP response headers.
HTTP etag
is something similar, but it's used only for invalidating browser cache and, from what I've noticed, every site is calculating it in different way and it doesn't look like any hash I know.
Some software download sites provide various file checksums as separate files to download (for example, latest Ubuntu 16.04 SHA1 hashes: http://releases.ubuntu.com/16.04/SHA1SUMS). Won't it be easier to just include them in HTTP response header and force browser to calculate it when download ends (and do not force user to do it manually).
I guess that whole HTTP-based Internet is working, because we're using TCP protocol, which is reliable and ensures received bytes are exactly same as one send by the server. But if TCP is so "cool", why do we check file hashes manually (see abouve Ubuntu example)? And lot of thing can go wrong during file download (client/server disk corruption, file modification on server side etc.). And if I'm right, everything could be fixed simply by passing file hash at download start.
A common approach to the problem of data integrity in a network protocol or distributed system, such as HTTP, is the use of digests, checksums, or hash values.
Response headers hold additional information about the response, like its location or about the server providing it. Representation headers contain information about the body of the resource, like its MIME type, or encoding/compression applied.
A response header is an HTTP header that can be used in an HTTP response and that doesn't relate to the content of the message. Response headers, like Age , Location or Server are used to give a more detailed context of the response.
Common Response HeadersThe first line of the response is mandatory and consists of the protocol ( HTTP/1.1),response code (200)and description (OK). The headers shown are: CONTENT-Type -This is Text/html which is a web page. It also includes the character set which is UTF-8.
Non TLS
or indirect
transfer.Maybe I know your doubt because I had the same question about the checksums, let's find it out.
There are two tasks to be considered:
And three protocol in this question:
Now we separate into two situations:
The TCP protocol
promise: the data from server is exactly same as the data client received, by checksum and ack.
The TLS protocol
promise: the server is authenticated (is truly ubuntu.com) and the data is not changed by any middleman.
So there is no need to add checksum header in HTTP protocol when doing HTTPS.
But when TLS is not enabled, forgery could happen: bad guy in middle gives a bad file to the client.
CDN
, by mirror, by offline way(usb disk).Many sites like ubuntu.com use 3-party CDN
to serve static files, which the CDN server is not managed by ubuntu.com.
http://releases.ubuntu.com/somefile.iso
redirect to http://59.80.44.45/somefile.iso
.
Now the checksum must be provided out-of-band because it is not authenticated we don't trust the connection. So checksum header in HTTP protocol is helpless in this situation.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With