The difference between strong and weak ETags

2013-07-12

When working on the performance of websites, sooner or later you will run across the ETag header. ETags come in "strong" and "weak" variants, but what is the difference between these? How are they used?

First, we need to define what an "ETag" is. This takes us back to the age of HTTP 1.0, when files were static and bytes were precious. When a webserver sent a file to a browser, it also sent along the time when the file had been last modified in the Last-Modified field of the HTTP protocol. When the browser needed to ask for the file a second time, for example when the user revisited the page after a long while, it sent along the timestamp of the version it had in its cache in the If-Modified-Since field. If the file had not been modified since this time, the server did not need to send the entire file again (remember, bytes were precious!), but could reply with 304 Not Modified.
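
The timestamp check described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular web server implements it; the function name `check_if_modified_since` and its arguments are assumptions made for this example.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def check_if_modified_since(last_modified, if_modified_since):
    """Return 304 if the cached copy is still current, otherwise 200.

    last_modified: timezone-aware datetime of the file's last change.
    if_modified_since: raw header value from the request, or None.
    """
    if if_modified_since is None:
        return 200  # unconditional request: send the whole file
    try:
        cached = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable date: treat the request as unconditional
    if cached.tzinfo is None:
        cached = cached.replace(tzinfo=timezone.utc)
    # HTTP dates only carry whole seconds, so drop the sub-second part.
    current = last_modified.replace(microsecond=0)
    return 304 if current <= cached else 200


# Hypothetical file; the date is only an example value.
changed = datetime(2013, 7, 12, 10, 0, 0, tzinfo=timezone.utc)
header = format_datetime(changed, usegmt=True)  # value for Last-Modified
status = check_if_modified_since(changed, header)
```

The truncation to whole seconds is the weak spot: two changes within the same second are indistinguishable to this check.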

Fast-forward to dynamically generated web pages, and this timestamp-based method is no longer enough. For one, it only has a resolution of one second, and changes may happen more often than that. More importantly, there may be several valid versions of the page at any given time, or the page may cycle rapidly through several versions. In short, what was needed was something that identified the version rather than dated it.

The ETag

Enter the ETag: instead of sending a Last-Modified timestamp, the server sends an arbitrary string in the ETag field. This is usually a hash of the contents or similar, but it doesn't matter, and the browser is not supposed to care about the nature of the string. When the time comes for the browser to ask for the file again, it can send along an If-None-Match field with the ETag of the version it still has on disk. If the version is still valid, the server sends back a 304 Not Modified answer again.

It gets better: if the browser has several versions of the file on disk, it will send a list of ETags in the If-None-Match field:

GET /page.html HTTP/1.1
If-None-Match: "abc", "def"

Now if any one of these versions is the one the browser is supposed to receive, the server will answer with a 304 Not Modified response, including the ETag of the version it meant. This is quite important: if you're writing web server code (e.g. any web application that generates HTTP headers), and use ETags, you need to send the correct ETag along with the 304, or the browser will have to guess which version actually "matched".
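
As a rough sketch of the server side of this exchange, the following Python function answers a conditional GET. The names `answer_conditional_get` and `opaque_part` are made up for this example, and the regular expression is a simplified reading of the ETag format rather than a full header parser.

```python
import re

# One entity-tag: an optional W/ prefix followed by a quoted string.
ETAG_PATTERN = re.compile(r'(?:W/)?"[^"]*"')


def opaque_part(etag):
    """Strip the weak prefix so only the quoted string is compared."""
    return etag[2:] if etag.startswith("W/") else etag


def answer_conditional_get(current_etag, if_none_match):
    """Return (status, headers) for a GET on a resource with current_etag."""
    headers = {"ETag": current_etag}  # sent on 200 and on 304 alike
    if if_none_match is None:
        return 200, headers
    if if_none_match.strip() == "*":
        return 304, headers  # "*" matches any existing version
    candidates = ETAG_PATTERN.findall(if_none_match)
    if any(opaque_part(c) == opaque_part(current_etag) for c in candidates):
        return 304, headers
    return 200, headers


# The browser holds versions "abc" and "def"; the server currently has "def".
status, headers = answer_conditional_get('"def"', '"abc", "def"')
```

Because the headers dictionary is built before the decision is made, the 304 answer always names the version that matched.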

Incidentally, there's also an If-Match field, used for things such as PUT and POST requests where the user agent (like a browser) wants to make sure it is overwriting the version it thinks it is, and that this version hasn't changed in between calls. But that's rarely used in practice.
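
For illustration, here is a small Python sketch of an If-Match check on a PUT. The in-memory `store`, the function `put_if_match`, and the comma splitting (which would break on an ETag containing a comma) are all simplifying assumptions for this example.

```python
# In-memory stand-in for stored documents: path -> (etag, body).
store = {"/page.html": ('"abc"', b"old content")}


def put_if_match(path, if_match, new_etag, new_body):
    """Overwrite path only if the client's If-Match names the stored version."""
    current = store.get(path)
    if if_match is not None:
        wanted = [tag.strip() for tag in if_match.split(",")]
        # Both tags must be strong and identical; "*" accepts any existing version.
        matches = current is not None and (
            "*" in wanted
            or (not current[0].startswith("W/") and current[0] in wanted)
        )
        if not matches:
            return 412  # Precondition Failed: someone else changed it first
    store[path] = (new_etag, new_body)
    return 204 if current is not None else 201  # No Content / Created


stale_attempt = put_if_match("/page.html", '"stale"', '"new"', b"lost update")
fresh_attempt = put_if_match("/page.html", '"abc"', '"new"', b"new content")
```

The first call is refused because the client's idea of the current version is out of date; the second succeeds and replaces the stored ETag.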

Formats and flavors

An ETag is an almost arbitrary string. The browser does not care about its contents, only that it can compare ETags via string comparison (like strcmp or similar) to see whether two ETags match or not. These are the only two outcomes. The only formatting requirement is that the string starts and ends with a double quote character, '"', like so:

HTTP/1.1 200 OK
Content-Type: image/png
ETag: "d41d8cd98f00b204e9800998ecf8427e"

In this example, the ETag is a hash of some sort, which is quite common but unimportant. A hash identifies content compactly, and two different contents are very unlikely to share one, which is exactly what we want here.

The above example is a strong ETag. There is also a weak ETag, which is formatted like the strong one, except for the two ASCII characters 'W/' in front of it, outside the quotes:

HTTP/1.1 200 OK
Content-Type: image/png
ETag: W/"d41d8cd98f00b204e9800998ecf8427e"

Strong vs weak ETags

Finally, we come to the difference between strong and weak ETags. Basically speaking, a strong ETag promises that content with the same ETag is byte-for-byte identical, while the weak ETag merely promises a "semantic equivalence". The result is that with a strong ETag, the browser can ask for a range of bytes of a given version, for example the second half of an aborted download. This is not the case with a weak ETag: the bytes are not guaranteed to be the same; the content may not even have the same length.
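
The HTTP specifications turn this distinction into two comparison functions, a strong one and a weak one. The Python below is a minimal sketch of both; the function names are chosen for this example, and `can_resume_download` is a hypothetical helper that ties the strong comparison back to the range request case.

```python
def split_etag(etag):
    """Return (is_weak, quoted_string) for one ETag value."""
    if etag.startswith("W/"):
        return True, etag[2:]
    return False, etag


def strong_match(a, b):
    """Match only if both tags are strong and their strings are identical."""
    weak_a, tag_a = split_etag(a)
    weak_b, tag_b = split_etag(b)
    return not weak_a and not weak_b and tag_a == tag_b


def weak_match(a, b):
    """Match if the quoted strings are identical, ignoring any W/ prefix."""
    return split_etag(a)[1] == split_etag(b)[1]


def can_resume_download(cached_etag, current_etag):
    """Byte ranges are only safe when the bytes are guaranteed identical."""
    return strong_match(cached_etag, current_etag)


# (first tag, second tag, strong result, weak result)
cases = [
    ('W/"1"', 'W/"1"', strong_match('W/"1"', 'W/"1"'), weak_match('W/"1"', 'W/"1"')),
    ('W/"1"', '"1"', strong_match('W/"1"', '"1"'), weak_match('W/"1"', '"1"')),
    ('"1"', '"1"', strong_match('"1"', '"1"'), weak_match('"1"', '"1"')),
    ('"1"', '"2"', strong_match('"1"', '"2"'), weak_match('"1"', '"2"')),
]
```

Only the pair of identical strong tags passes the strong comparison, so only that pair would allow a partial download to be resumed.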

An example: an HTML page is dynamically rendered. For debugging and statistics purposes, this rendering contains the time in milliseconds that the server took to render the page, and the name of the server (assuming several are being load balanced). Let us say that one rendering takes "341.58" milliseconds, while another takes "362.4" milliseconds. Note how these two numbers are rendered with a different length: one takes 6 characters, the other 5. The rest of the page after these strings is therefore shifted by one character; asking for a subset in bytes will result in different markup, potentially in chaos.

On the other hand, if this time is the only difference between the pages, any human surfer would call them "equivalent". For a server to say "no, sorry, the version I rendered in 341.58 milliseconds is out of date, you will have to download the one I rendered in 362.4 milliseconds" is both ridiculous and inefficient. Far better to say "yes, the 341.58 millisecond one is fine, keep that one".

From this difference, a simple rule for when to use which ETag follows: use a strong ETag when you can guarantee (and require) a byte-for-byte identity, and a weak one when you cannot. This generally means using a strong ETag for static content and a weak one for dynamically generated content.
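
To make the render time example concrete, here is a hedged Python sketch. The page template, the function names, and the choice of SHA-256 are all assumptions for illustration; the point is only that the strong tag hashes the exact bytes while the weak tag hashes the part a reader cares about.

```python
import hashlib


def strong_etag(body):
    """Hash the exact bytes sent: any change to the body changes the tag."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def weak_etag(article_text):
    """Hash only the content that matters to the reader."""
    digest = hashlib.sha256(article_text.encode("utf-8")).hexdigest()
    return 'W/"' + digest + '"'


def render_page(article_text, render_ms, server_name):
    """Render a toy page with the debugging footer described above."""
    footer = f"<!-- rendered in {render_ms} ms by {server_name} -->"
    page = f"<html><body><p>{article_text}</p>{footer}</body></html>"
    return page.encode("utf-8")


first = render_page("Hello", "341.58", "web1")
second = render_page("Hello", "362.4", "web2")

same_bytes = strong_etag(first) == strong_etag(second)
same_meaning = weak_etag("Hello") == weak_etag("Hello")
length_difference = len(first) - len(second)
```

The two renderings differ in length by one byte and get different strong tags, yet share a single weak tag, so a cached copy of either one stays valid.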

Pitfalls

Finally, a few pitfalls to consider with ETags:

1: Don't use the inode of a file as part of the ETag in a cluster. Apache, in its default configuration, used to use a combination of the file size, the modification time, and the inode (basically the "address") of the file. This works if you're on one server, but if several are serving the same file, the inodes will not match, making the ETag much less useful for cache efficiency. As of this writing, Apache no longer uses the inode by default.

2: Take Content-Encoding into account. Many web servers can conditionally compress content if the browser can handle it (and all modern ones can). Technically, the compressed response is different content (even semantically), and needs its own ETag. I often check for browser compression capabilities, and change my ETag accordingly before passing the content through a post-processing compression module.

3: Don't forget the ETag on a 304 response! RFC 2616 explicitly requires it whenever the ETag would have been sent with a 200 response. The reason: If-None-Match can list several ETags, and the browser will need to guess which one matched if you don't tell it.
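
Pitfall 2 is the easiest to show in code. The Python below is one possible sketch, assuming a made-up convention of appending the encoding name to the ETag; `etag_for` and `build_response` are hypothetical names, the Accept-Encoding parsing ignores q-values, and the Vary header is an addition not discussed above.

```python
import gzip
import hashlib


def etag_for(body, content_encoding=None):
    """Derive an ETag from the uncompressed body plus the encoding used."""
    digest = hashlib.sha256(body).hexdigest()[:32]
    suffix = f"-{content_encoding}" if content_encoding else ""
    return f'"{digest}{suffix}"'


def build_response(body, accept_encoding):
    """Return (headers, payload), compressing when the client allows gzip."""
    # Simplification: ignores q-values such as "gzip;q=0".
    offered = {
        part.split(";")[0].strip().lower()
        for part in accept_encoding.split(",")
    }
    headers = {"Vary": "Accept-Encoding"}
    if "gzip" in offered:
        headers["Content-Encoding"] = "gzip"
        headers["ETag"] = etag_for(body, "gzip")
        return headers, gzip.compress(body)
    headers["ETag"] = etag_for(body)
    return headers, body


page = b"<html><body>Hello</body></html>"
gzip_headers, gzip_payload = build_response(page, "gzip, deflate")
plain_headers, plain_payload = build_response(page, "")
```

The compressed and uncompressed variants end up with different ETags, so a cache can never pair the validator of one with the bytes of the other.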