What's a good method of programmatically generating ETags for web pages, and is this practice recommended? Some sites recommend turning ETags off, others recommend producing them manually, and some recommend leaving the default settings active. What's the best way here?
The method by which ETags are generated has never been specified in the HTTP specification. Common methods of ETag generation include using a collision-resistant hash function of the resource's content, a hash of the last modification timestamp, or even just a revision number.
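As a rough illustration of those three approaches, here is a minimal Python sketch. The function names are made up for this example, and marking the timestamp-based tag as weak (the `W/` prefix) is an assumption about how you would want to treat it, not something the specification requires.

```python
import hashlib


def etag_from_content(body: bytes) -> str:
    """Strong ETag: a collision-resistant hash of the exact response bytes."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def etag_from_mtime(mtime_ns: int) -> str:
    """ETag from a hash of the last modification timestamp.

    Marked weak (W/ prefix) here, on the assumption that the timestamp
    alone does not guarantee byte-for-byte identical content.
    """
    digest = hashlib.sha1(str(mtime_ns).encode("ascii")).hexdigest()
    return 'W/"' + digest + '"'


def etag_from_revision(revision: int) -> str:
    """ETag from a revision number kept by the application or database."""
    return '"rev-' + str(revision) + '"'
```

The content hash needs no extra bookkeeping but requires the full body before the header can be sent. The timestamp and revision variants are cheaper, but only as reliable as the metadata behind them.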
The ETag (or entity tag) HTTP response header is an identifier for a specific version of a resource. It lets caches be more efficient and save bandwidth, as a web server does not need to resend a full response if the content was not changed.
An ETag (entity tag) is an HTTP header that is used to validate that the client (such as a mobile device) has the most recent version of a record. When a GET request is made, the ETag is returned as a response header. The ETag also allows the client to make conditional requests.
Whenever a resource is requested (via its URL), the data and ETag are retrieved and stored in the Web cache, and the ETag is sent along with subsequent requests. If the ETag at the server has not changed, a "Not Modified" message is returned, and the cached data are used. See Web cache and browser cache.
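The exchange described above can be sketched on the server side roughly as follows. This is a simplified, framework-independent illustration: `respond` and `make_etag` are hypothetical helpers, and splitting the `If-None-Match` value on commas is a shortcut that only works because the hex tags generated here never contain a comma.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Quoted hex digest of the response body."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET request."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        # If-None-Match uses weak comparison, so ignore any W/ prefix.
        stripped = [c[2:] if c.startswith("W/") else c for c in candidates]
        if "*" in candidates or etag in stripped:
            # The client's cached copy is current: send no body.
            return 304, headers, b""
    return 200, headers, body
```

A client would store the `ETag` from the first 200 response and send it back as `If-None-Match` on the next request. If the body is unchanged, it gets a 304 with an empty payload and reuses its cached copy.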
I recommend generating a hash of the content, e.g. md5($content).
Additionally, to reduce the chance that two different content elements end up with the same tag, you might want to add e.g. the ID of the content element to it (if this is appropriate).
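In Python, the same idea might look like the sketch below. `content_etag` and its `content_id` parameter are hypothetical names for this example, and the `id-hash` layout is just one possible way to combine the two.

```python
import hashlib


def content_etag(content_id: str, content: bytes) -> str:
    """Build an ETag from the element's ID plus an MD5 hash of its content."""
    digest = hashlib.md5(content).hexdigest()
    # The surrounding double quotes are part of the ETag header syntax.
    return '"' + content_id + "-" + digest + '"'
```

MD5 is used here only to detect changes, not for security. If that is a concern, `hashlib.sha256` can be swapped in without changing anything else.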