Question

This sort of question has been asked before (HTTP Requests vs File Size?), but I'm hoping for a better answer. In that linked question, the answerer did a pretty good job with the nifty formula of latency + transfer time, using an estimated latency of 80 ms and a transfer speed of 5 Mb/s. But it seems flawed in at least one respect. Don't multiple requests and transfers happen simultaneously in a normal browsing experience? That's what it looks like when I examine the Network tab in Chrome. Doesn't this mean that request latency isn't such a terrible thing?
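To make that formula concrete, here is a minimal sketch (Python, treating the 80 ms and 5 Mb/s figures as placeholder assumptions, not measurements) of what a strictly serial model predicts; the serial assumption is exactly the part I'm questioning:

```python
# Strictly serial model: pay latency once per request, then transfer time
# for the total bytes. 80 ms / 5 Mb/s are the linked answer's placeholder
# figures, not measurements.
LATENCY_S = 0.080
MBIT_PER_S = 5

def serial_load_time(sizes_kb):
    """Estimated total time if requests happen one after another."""
    total_kbit = sum(sizes_kb) * 8
    return len(sizes_kb) * LATENCY_S + total_kbit / (MBIT_PER_S * 1000)

print(serial_load_time([500]))       # one 500 kB file -> ~0.88 s
print(serial_load_time([50] * 10))   # ten 50 kB files -> ~1.60 s
```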

Are there any other things to consider? Obviously latency and bandwidth will vary, but are 80 ms and 5 Mb/s a good rule of thumb? I thought of an analogy and I wonder if it is correct. Imagine a train station with only one track in and one track out (or maybe it is one track for both). HTTP requests are like sending an engine out to fetch a bunch of cars at another station. It returns pulling a long train of railway cars, which represents the requested file being downloaded. So you could send one engine out and have it bring back a massive load, or you could send multiple engines out and have them each bring back smaller loads, though they would all have to wait their turn coming back into the station, and some engines couldn't be sent out until others had come in. Is this a flawed analogy?

I guess the big question, then, is how you can predict how much overlap there will be in HTTP requests, so that you can know, for example, whether it is generally worth it to have two big PNG files on your page or instead a WebP image plus the WebPJS js and swf files for incompatible browsers. That doubles the number of requests but more than halves the total file size (say a 200 kB savings).
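Plugging made-up sizes into the serial_load_time sketch above (numbers chosen only to roughly match the 200 kB savings mentioned), even the purely serial model already favours the smaller total size:

```python
# Hypothetical sizes: two ~200 kB PNGs vs. four smaller files (WebP image
# plus fallback scripts) totalling ~200 kB. Uses serial_load_time from the
# sketch above.
print(serial_load_time([200, 200]))          # 2 requests, 400 kB -> ~0.80 s
print(serial_load_time([120, 40, 25, 15]))   # 4 requests, 200 kB -> ~0.64 s
```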


Solution 2

This is the kind of answer I'm looking for. I did some simplistic tests to get a feel for the speed of many small files vs one large file.

I created HTML pages that loaded a bunch of randomly sized images from placekitten.com and loaded them in Chrome with the Network tab open.
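The pages themselves were trivial; a sketch like this (sizes randomized, exact dimensions don't matter) reproduces the setup:

```python
import random

# Generate a test page with N <img> tags requesting random-sized
# kitten images from placekitten.com.
def make_test_page(n_images, min_px=100, max_px=500):
    tags = []
    for _ in range(n_images):
        w = random.randint(min_px, max_px)
        h = random.randint(min_px, max_px)
        tags.append(f'<img src="https://placekitten.com/{w}/{h}">')
    return "<!doctype html><title>img test</title>\n" + "\n".join(tags)

with open("test_30_images.html", "w") as f:
    f.write(make_test_page(30))
```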

Here are some results:

# of imgs    Total size (kB)    Load times (ms, successive reloads)
1            465                4000, 550
1            307                3000, 800, 350, 550, 400

30           192                1200, 900, 800, 900
30           529                7000, 5000, 6500, 7500

So one major thing to note is that single files become much quicker after they have been loaded once. (The comma-separated lists of times are successive page reloads.) I did normal refreshes and also Empty Cache and Hard Reload. Strangely, it didn't seem to make much difference which way I refreshed.

My connection had a round-trip latency of around 120 to 130 ms, and my download speed varied between 4 and 8 Mb/s. Chrome seemed to make about 6 requests at a time.

Looking at these few tests, it seems that, at least in this range of file sizes, it is clearly better to have fewer requests when the total file size is equal. But if you can cut the total file size in half, even at the expense of making 30 requests instead of one, it would be worth it, at least for a fresh page load.
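As a sanity check, here is a very rough model of what those numbers imply: latency paid roughly once per batch of ~6 parallel requests, plus transfer time for the total bytes, using the latency and bandwidth I measured above. It is only a sketch, not a fit to the data.

```python
import math

# Rough parallel model: Chrome fetches ~6 resources at a time, so latency
# is paid about once per "round" of 6 requests, while transfer time still
# depends on total bytes. 125 ms / 6 Mb/s are my measured ballpark figures.
LATENCY_S = 0.125
MBIT_PER_S = 6
PARALLEL = 6

def parallel_load_time(n_requests, total_kb):
    rounds = math.ceil(n_requests / PARALLEL)
    return rounds * LATENCY_S + total_kb * 8 / (MBIT_PER_S * 1000)

print(parallel_load_time(1, 465))    # ~0.75 s (measured: ~4 s cold, ~0.55 s warm)
print(parallel_load_time(30, 192))   # ~0.88 s (measured: ~0.8-1.2 s)
print(parallel_load_time(30, 529))   # ~1.33 s (measured: 5-7.5 s)
```

The 30-image, 529 kB case is far slower in practice than this model predicts, which suggests the per-request cost is more than a single round trip (connection limits, server response time, and so on).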

Any comments or better answers would be appreciated.

OTHER TIPS

Your analogy is not bad in general terms. Obviously, if you want to be really precise in all aspects, there are things that are oversimplified or incorrect (but that happens with almost all analogies).

Your estimate of 80 ms and 5 Mb/s might sound reasonable, but even though most of us like theory, you should approach this kind of problem differently.

In order to make good estimates, you should measure to get some data and then analyze it. Every estimate depends on some context, and you should not ignore it.

Estimating latency and bandwidth is not the same for a 3G connection, an ADSL connection in Japan, or an ADSL connection in a less technologically developed country. Are clients accessing from the other end of the world or from the same country? As with your good observation about simultaneous connections on the client, there are millions of possible questions to ask yourself and very few good-quality answers without doing some measuring.

I know I'm not answering your question exactly, because I think it is unanswerable without a lot of detail about the domain (plus constraints, and a huge etc.).

You seem to have some ideas about how to design your solution. My best advice is to implement each of them and profile it. Make measurements, try to identify what your bottlenecks are, and see whether you have any control over them.
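If it helps, even something this small is enough to start comparing candidates (URLs are placeholders, standard library only); the browser's Network tab will tell you much more, but this catches gross differences:

```python
import time
import urllib.request

# Time a handful of candidate resources a few times each and compare.
CANDIDATES = [
    "https://example.com/big.png",      # placeholder URLs
    "https://example.com/small.webp",
]

def fetch_time(url, runs=5):
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        with urllib.request.urlopen(url) as resp:
            resp.read()
        samples.append(time.perf_counter() - start)
    return min(samples), sum(samples) / len(samples)

for url in CANDIDATES:
    best, avg = fetch_time(url)
    print(f"{url}: best {best:.3f}s, avg {avg:.3f}s")
```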

For some problems this kind of question might have an optimal solution, but the difference between optimal and suboptimal could be negligible in practice.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow