Is it the filename or the whole URL used as a key in browser caches?

https://stackoverflow.com/questions/83990

01-07-2019

Question

It's common to want browsers to cache resources - JavaScript, CSS, images, etc. until there is a new version available, and then ensure that the browser fetches and caches the new version instead.

One solution is to embed a version number in the resource's filename, but will placing the resources to be managed in this way in a directory with a revision number in it do the same thing? Is the whole URL to the file used as a key in the browser's cache, or is it just the filename itself and some meta-data?

If my code changes from fetching /r20/example.js to /r21/example.js, can I be sure that revision 20 of example.js was cached, but now revision 21 has been fetched instead and it is now cached?

Solution

Yes, any change in any part of the URL (excluding changes between the HTTP and HTTPS protocols) is interpreted as a different resource by the browser (and any intermediary proxies), and will thus result in a separate entry in the browser cache.
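To make the idea concrete, here is a toy model of a cache whose entries are keyed by the full URL string. It is only a sketch: the class `UrlKeyedCache`, the `fake_server` function, and the `example.com` URLs are made up for illustration, and real browser caches also handle expiry, validation, and much more.

```python
class UrlKeyedCache:
    """Toy cache: one entry per distinct URL string."""

    def __init__(self):
        self._entries = {}

    def fetch(self, url, load):
        """Return (body, was_cached). `load` stands in for a network request."""
        if url in self._entries:
            return self._entries[url], True
        body = load(url)
        self._entries[url] = body
        return body, False


def fake_server(url):
    # Hypothetical origin server: the response body just echoes the URL.
    return f"contents of {url}"


cache = UrlKeyedCache()
print(cache.fetch("https://example.com/r20/example.js", fake_server))  # first request: not cached
print(cache.fetch("https://example.com/r20/example.js", fake_server))  # same URL: served from cache
print(cache.fetch("https://example.com/r21/example.js", fake_server))  # new path: separate entry
```

In this model, switching from /r20/example.js to /r21/example.js always causes a fresh fetch, because the old entry is stored under a different key.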

Update:

The claim in this ThinkVitamin article that Opera and Safari/Webkit browsers don't cache URLs with ?query=strings is false.

Adding a version number parameter to a URL is a perfectly acceptable way to do cache-busting.
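A version parameter can be appended with the standard library's `urllib.parse`. This is a minimal sketch; the helper name `with_version`, the parameter name `v`, and the example paths are assumptions for illustration, not something from the article or the question.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_version(url, version, param="v"):
    """Return `url` with `param=version` set in its query string."""
    parts = urlsplit(url)
    # Drop any existing version parameter, keep the other parameters in order.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    query.append((param, str(version)))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_version("/static/example.js", 21))
print(with_version("/static/example.js?lang=en&v=20", 21))
```

Bumping the version produces a different URL string, which is what makes the browser treat it as a new resource.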

What may have confused the author of the ThinkVitamin article is the fact that hitting Enter in the address/location bar in Safari and Opera results in different behavior for URLs with query strings in them.

However, (and this is the important part!) Opera and Safari behave just like IE and Firefox when it comes to caching embedded/linked images and stylesheets and scripts in web pages - regardless of whether they have "?" characters in their URLs. (This can be verified with a simple test on a normal Apache server.)

(I would have commented on the currently accepted answer if I had the reputation to do it. :-)

OTHER TIPS

The browser cache key is a combination of the request method and the resource URI. A URI consists of a scheme, authority, path, query, and fragment.

Relevant excerpt from HTTP 1.1 specification:

The primary cache key consists of the request method and target URI. However, since HTTP caches in common use today are typically limited to caching responses to GET, many caches simply decline other methods and use only the URI as the primary cache key.

Relevant excerpt from URI specification:

The generic URI syntax consists of a hierarchical sequence of components referred to as the scheme, authority, path, query, and fragment.

URI         = scheme ":" hier-part [ "?" query ] [ "#" fragment ]

hier-part   = "//" authority path-abempty
              / path-absolute
              / path-rootless
              / path-empty
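The components named in the grammar above can be inspected with Python's `urllib.parse.urlsplit`, and the primary cache key from the HTTP excerpt can be sketched as a (method, URI) pair. The function name `primary_cache_key` and the example URL are assumptions for illustration; this is not how any particular browser implements its key.

```python
from urllib.parse import urlsplit


def primary_cache_key(method, url):
    """Sketch of a primary cache key: request method plus target URI."""
    return (method.upper(), url)


parts = urlsplit("https://example.com/r21/example.js?v=3#top")
# scheme, authority (netloc), path, query, fragment
print(parts.scheme, parts.netloc, parts.path, parts.query, parts.fragment)

key_r20 = primary_cache_key("GET", "https://example.com/r20/example.js")
key_r21 = primary_cache_key("GET", "https://example.com/r21/example.js")
print(key_r20 == key_r21)  # the two revisions get different keys
```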

I am 99.99999% sure that it is the entire url that is used to cache resources in a browser, so your url scheme should work out fine.

The MINIMUM you need to identify an HTTP object is the full path, including any query-string parameters. Some browsers may not cache objects with a query string, but that has nothing to do with the key to the cache.

It is also important to remember that the path alone is no longer sufficient. The Vary: header in the HTTP response tells the browser (or proxy server, etc.) what, OTHER than the URL, should be used to determine the cache key, such as cookies, encoding values, etc.

To your basic question, yes, changing the URL of the .js file is sufficient. To the larger question of what determines the cache key, it's the URL plus the Vary: header restrictions.
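As a rough illustration of "URL plus Vary", the sketch below builds a key from the method, the URL, and the values of whichever request headers the response's Vary: header names. The function `cache_key` and the sample header values are hypothetical, and the sketch ignores the special case `Vary: *`.

```python
def cache_key(method, url, request_headers, vary):
    """Combine method and URL with the request headers named by Vary."""
    # Header names are case-insensitive, so normalise them to lowercase.
    names = sorted(h.strip().lower() for h in vary.split(",") if h.strip())
    lowered = {name.lower(): value for name, value in request_headers.items()}
    varied = tuple((name, lowered.get(name, "")) for name in names)
    return (method.upper(), url, varied)


gzip_key = cache_key("GET", "/r21/example.js", {"Accept-Encoding": "gzip"}, "Accept-Encoding")
br_key = cache_key("GET", "/r21/example.js", {"Accept-Encoding": "br"}, "Accept-Encoding")
print(gzip_key != br_key)  # same URL, but the varied header separates the entries
```

With an empty Vary value the key reduces to just the method and the URL, which matches the simple case in the question.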

Yes. From the cache's perspective, a different path works the same as a different filename.

Of course it has to use the whole path: '/r20/example.js' and '/r21/example.js' could be completely different resources to begin with. What you suggest is a viable way to handle version control.

In most browsers the full url is used. In some browsers, if you have a query in the url, the document will never be cached.

Entire url. I've seen a strange behavior in a few older browsers where case sensitivity came into play.

In addition to the existing answers, I just want to add that this might not apply if you use ServiceWorkers or, e.g., offline-plugin. Then you could experience different cache rules depending on how the ServiceWorkers are set up.

It depends. It is supposed to be the full URL, but some browsers (Opera, Safari 2) apply a different cache strategy for URLs with different params.

The best bet is to change the name of the file.

There is a very clever solution here (uses PHP, Apache)

http://verens.com/archives/2008/04/09/javascript-cache-problem-solved/

Strategy notes: “According the letter of the HTTP caching specification, user agents should never cache URLs with query strings. While Internet Explorer and Firefox ignore this, Opera and Safari don’t - to make sure all user agents can cache your resources, we need to keep query strings out of their URLs.”

http://www.thinkvitamin.com/features/webapps/serving-javascript-fast

Licensed under: CC-BY-SA with attribution