Introduction
HTTP 304 Not Modified and its related headers provide a caching mechanism that is widely used on websites. It reduces network traffic and avoids unnecessary response overhead. Most web servers, such as IIS, Apache and Nginx, support this mechanism for static resources (JavaScript, CSS, image files, etc.) by default, so we rarely notice it. Yet it plays an important role in web development, and understanding what happens behind the scenes is very helpful when we build our own web applications.
The Mechanism
The story of HTTP 304 Not Modified can be summarized in the following steps:
- The client sends a normal request to acquire content from the server.
- The server responds with the content and a header value that indicates the current state of that content. The client keeps this header value for future requests.
- In future requests, the client sends a header carrying the preserved value to ask the server whether the state has changed. The server responds in one of two ways:
  - If the state has changed, the server responds as in step 2, sending the latest content along with the new state. The client renders the content and saves it in its cache.
  - If the state has not changed, the server responds with HTTP 304 Not Modified and no content. The client renders the content from its cache.
Figure 1 is the flow chart on the server side.
Figure 1
In step 2, the server has two options to indicate the state of the current content.
- Respond with a Last-Modified date-time value that represents when the last change occurred. In the next request, the client sends an If-Modified-Since header carrying that value to ask the server whether the content has changed.
- Respond with an ETag, a unique identifier that is regenerated whenever the content changes. In this case, the client sends an If-None-Match header carrying that value in the next request.
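On the client side, the pairing between the two response headers and their request counterparts can be sketched in a few lines of Python. This is only an illustration: the function name conditional_headers and the plain-dictionary representation of the cached headers are assumptions made for this example, not part of any particular HTTP library.

```python
def conditional_headers(cached_headers):
    """Build revalidation headers from the headers of a cached response.

    Assumption: cached_headers is a plain dict that uses the canonical
    header spelling ("Last-Modified", "ETag").
    """
    headers = {}
    if "Last-Modified" in cached_headers:
        # The stored date is echoed back unchanged.
        headers["If-Modified-Since"] = cached_headers["Last-Modified"]
    if "ETag" in cached_headers:
        # The stored identifier is echoed back unchanged, quotes included.
        headers["If-None-Match"] = cached_headers["ETag"]
    return headers
```

A response that carried neither header yields an empty dictionary, so the next request is an ordinary, unconditional one.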
Walkthrough
Let us walk through a few examples, starting with a simple request.
GET /dota2/sheever.png HTTP/1.1
Host: localhost:8000
In this example, the client tries to render a PNG file. Because neither an If-Modified-Since nor an If-None-Match header is present, the server treats this as the first request for the image. The following sections look at each of the server's options.
Option 1
In the first option, the server reports the creation/modification time of the image, taken from the file system.
HTTP/1.1 200 OK
Date: Mon, 18 Sep 2015 22:19:01 GMT
Content-Type: image/png
Content-Length: 1393216
Last-Modified: Thu, 15 Sep 2015 22:10:34 GMT
(PNG Content)
After receiving the response, the client stores the picture and the Last-Modified header value in its cache. With this information, the client can use the If-Modified-Since header in future requests to ask the server whether the image has changed. Suppose a user has just pressed F5 to reload the page. The second request for the same image will be:
GET /dota2/sheever.png HTTP/1.1
Host: localhost:8000
If-Modified-Since: Thu, 15 Sep 2015 22:10:34 GMT
Now the server checks whether the image has been modified since Thu, 15 Sep 2015 22:10:34 GMT. If it has, the server treats the request as a normal one and returns the content with the Last-Modified header value updated to the date-time of the latest change. If it has not, the response is very simple:
HTTP/1.1 304 Not Modified
Date: Mon, 18 Sep 2015 22:20:01 GMT
Notice that this status code carries no content. The response ends right after the headers, and the client renders the image from its cache. This is how the mechanism reduces network traffic.
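A minimal Python sketch of what the client does with each response might look like the following. The cache dictionary and the function name body_to_render are hypothetical, and a real browser cache does considerably more (for example, it also refreshes the stored headers when a 304 arrives).

```python
def body_to_render(cache, url, status, headers, body):
    """Return the bytes the client should render for this response.

    Assumption: cache is a plain dict keyed by URL.
    """
    if status == 304:
        # Nothing was sent in the body, so reuse the stored copy.
        return cache[url]["body"]
    if status == 200:
        # Fresh content: remember it together with its validators.
        cache[url] = {"headers": headers, "body": body}
    return body
```

Only the 200 branch writes to the cache, which is why the header value saved from the first response remains available for every later conditional request.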
Applying this option to Figure 1 gives Figure 2 below:
Figure 2
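The server-side decision for this option could be written roughly as follows. This is a sketch using only the Python standard library, not the implementation of any particular web server. The function name respond_with_last_modified and the (status, headers, body) tuple are assumptions made for illustration, and modified_at is assumed to be a timezone-aware UTC datetime.

```python
from datetime import timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond_with_last_modified(request_headers, modified_at, body):
    """Return (status, headers, body) for content last changed at modified_at.

    Assumption: modified_at is an aware datetime whose tzinfo is timezone.utc.
    """
    # HTTP dates have one-second resolution, so drop the microseconds first.
    modified_at = modified_at.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(modified_at, usegmt=True)}

    since = request_headers.get("If-Modified-Since")
    if since is not None:
        try:
            since_at = parsedate_to_datetime(since)
        except (TypeError, ValueError):
            since_at = None  # An unparsable date is simply ignored.
        if since_at is not None:
            if since_at.tzinfo is None:
                # Treat a date without a zone as UTC so it can be compared.
                since_at = since_at.replace(tzinfo=timezone.utc)
            if modified_at <= since_at:
                return 304, headers, b""  # Unchanged: send no content.
    return 200, headers, body
```

Truncating the modification time before the comparison matters: without it, a file changed at 22:10:34.5 would always look newer than the 22:10:34 value the client sends back, and the server would never answer 304.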
Option 2
In the second option, the server returns a unique identifier that represents the latest version of the image. Because HTTP/1.1 treats the ETag value as an opaque quoted string and does not prescribe how it is generated, we use a GUID in this example. (The method of generating the ETag value does not matter. What matters is that the server can tell from it whether the content has changed.)
HTTP/1.1 200 OK
Date: Mon, 18 Sep 2015 22:19:01 GMT
Content-Type: image/png
Content-Length: 1393216
ETag: "82b600b9-c321-4f34-8065-cec076cfbacc"
(PNG Content)
Now the client can query the server with the If-None-Match header in future requests. This header tests whether the value held by the client is still the latest one.
GET /dota2/sheever.png HTTP/1.1
Host: localhost:8000
If-None-Match: "82b600b9-c321-4f34-8065-cec076cfbacc"
Because the server has to regenerate and maintain the ETag value whenever this image is changed, it can tell whether the value sent by the client is out of date. If it is, the server responds with the latest content along with the updated ETag header value. If it is not, the response is similar to the one we just saw in option 1:
HTTP/1.1 304 Not Modified
Date: Mon, 18 Sep 2015 22:20:01 GMT
Again, this status code carries no content. The response ends here and the client renders the image from its cache.
We can now change Figure 1 to Figure 3.
Figure 3
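The ETag variant of the server-side check can be sketched in the same way. The names etag_matches and respond_with_etag are hypothetical. The sketch uses the weak comparison that HTTP specifies for If-None-Match (a leading W/ is ignored), and it splits the header on commas, which is a simplification that assumes no ETag contains a comma.

```python
def etag_matches(if_none_match, current_etag):
    """Return True when If-None-Match lists the current ETag."""
    if if_none_match.strip() == "*":
        return True  # "*" matches any existing version of the content.

    def opaque(tag):
        # Weak comparison: ignore a leading W/ marker.
        tag = tag.strip()
        return tag[2:] if tag.startswith("W/") else tag

    candidates = [opaque(tag) for tag in if_none_match.split(",")]
    return opaque(current_etag) in candidates


def respond_with_etag(request_headers, current_etag, body):
    """Return (status, headers, body) for content identified by current_etag."""
    headers = {"ETag": current_etag}
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None and etag_matches(if_none_match, current_etag):
        return 304, headers, b""  # The client's copy is current.
    return 200, headers, body
```

Note that the server never needs to know how the identifier was produced. It only compares the strings, which is why the generation method is free to vary.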
The next section discusses how to choose between the two options depending on the characteristics of the content.
The Good and The Bad
You have probably noticed a practical problem with option 2. Let us look back at the example and review this line.
Quote:
server has to regenerate and maintain ETag value whenever this image is changed
The question is: how does the server accomplish this? How does it notice a change in real time? In the .NET Framework, we can use FileSystemWatcher in our web application to watch the files. In Java, we can use WatchService, and other platforms offer similar APIs. But whatever the API, watching the file system is costly for a web application, especially when it watches multiple files in multiple directories.
Option 1 is relatively easy to implement in this example because the server can retrieve the file's creation/modification time from the file system at any time. The server knows whether the file has changed since the time given in the If-Modified-Since header, so it does not have to watch the files.
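As a rough illustration of reading the time on demand, the following Python sketch asks the file system for the modification time only when a request arrives. The function name last_modified_header is an assumption made for this example.

```python
import os
from email.utils import formatdate


def last_modified_header(path):
    """Format a file's modification time as an HTTP date string."""
    # getmtime returns seconds since the epoch; usegmt=True emits "GMT".
    return formatdate(os.path.getmtime(path), usegmt=True)
```

No background watcher is involved: each call costs a single file-system lookup.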
But option 1 has a problem too. A file's creation/modification time can easily be changed by other applications without any change to its content, so it is not a very reliable indicator. Additionally, the date-time format used by HTTP/1.1 (defined in RFC 1123) has no sub-second part, which means that if the content changes more than once within one second, the Last-Modified header value cannot reflect those changes.
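The one-second resolution is easy to demonstrate. In the sketch below, the helper name http_date and the two timestamps are made up for the example. Both changes fall within the same second, so they produce the same header value.

```python
from email.utils import formatdate


def http_date(timestamp):
    """Format seconds since the epoch as an HTTP date (whole seconds only)."""
    return formatdate(timestamp, usegmt=True)


first_change = 1_000_000_000.10   # hypothetical change early in the second
second_change = 1_000_000_000.90  # hypothetical change 0.8 seconds later

# Both values format to the same string; the fraction is lost.
print(http_date(first_change))
print(http_date(second_change))
```

A client that cached the content after the first change would send that date back and receive 304, even though the second change happened afterwards.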
So which option is better? It depends heavily on the characteristics of the content. If the content is an image file, a CSS file or something else that is relatively insignificant in our web application, option 1 could be suitable. This is how many CDN (Content Delivery Network) services deliver these types of content. On the other hand, if the content is a dynamically generated JavaScript statement, JSONP or an object that we would like to keep temporarily in memory, then option 2 is the better choice. We can apply a simple design pattern such as the INotifyPropertyChanged interface (https://msdn.microsoft.com/en-us/library/system.componentmodel.inotifypropertychanged%28v=vs.110%29.aspx) in the .NET Framework so that our web application is notified when the content changes. This makes maintaining the ETag value for the content much easier.
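The same idea can be sketched in Python with a property setter standing in for the change notification. The class name CachedDocument is hypothetical, and using a SHA-256 hash of the content as the ETag is an assumption of this example. A GUID regenerated on each change, as in the walkthrough, would serve equally well.

```python
import hashlib


class CachedDocument:
    """Holds in-memory content and refreshes its ETag on every change."""

    def __init__(self, content):
        self.content = content  # Goes through the setter below.

    @property
    def content(self):
        return self._content

    @content.setter
    def content(self, value):
        self._content = value
        # Assumption: a quoted content hash serves as the ETag.
        self.etag = '"' + hashlib.sha256(value).hexdigest() + '"'
```

Because every assignment passes through the setter, the ETag cannot drift out of step with the content, and no polling or file watching is needed. For example, after doc = CachedDocument(b"callback({})"), assigning new bytes to doc.content immediately changes doc.etag.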
Conclusion
Web development frameworks are always evolving, and new frameworks are invented all the time. But no matter which framework you use, paying a little more attention to HTTP itself and understanding its built-in mechanisms helps in building a better web application. HTTP 304 Not Modified is one of those mechanisms.
Further Reading