Question

I wanted to learn more about DNS, and I happened to have a copy of Computer Networking: A Top-Down Approach 4th Edition lying around.

Section 2.5.1 (page 132) suggests using DNS as a load balancer:

DNS is also used to perform load distribution among replicated servers, such as replicated Web servers. Busy sites, such as cnn.com, are replicated over multiple servers, with each server running on a different end system and each having a different IP address. For replicated Web servers, a set of IP addresses is thus associated with one canonical hostname. The DNS database contains this set of IP addresses. When clients make a DNS query for a name mapped to a set of addresses, the server responds with the entire set of IP addresses, but rotates the ordering of the addresses within each reply. Because a client typically sends its HTTP request message to the IP address that is listed first in the set, DNS rotation distributes the traffic among the replicated servers.
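
To make the quoted mechanism concrete, here is a minimal simulation of it. This is only a sketch: `RoundRobinZone` and `simulate` are hypothetical names, the addresses come from the 192.0.2.0/24 documentation range, and a real DNS server is not guaranteed to rotate replies exactly this way.

```python
from collections import Counter, deque


class RoundRobinZone:
    """Toy stand-in for a DNS server holding several addresses for one name."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def query(self):
        # Reply with the entire set, then rotate so the next reply
        # starts with a different address.
        reply = list(self._addresses)
        self._addresses.rotate(-1)
        return reply


def simulate(zone, n_clients):
    """Count how many clients land on each server when every client
    connects to the first address in its reply (the book's assumption)."""
    hits = Counter()
    for _ in range(n_clients):
        hits[zone.query()[0]] += 1
    return hits


if __name__ == "__main__":
    zone = RoundRobinZone(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
    print(simulate(zone, 9))
```

With uncached queries and clients that always pick the first address, the requests spread evenly across the replicas. Caching resolvers in between would skew this, which the sketch does not model.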

On systems I've worked on, load distribution is done by directing all traffic to a load balancer or proxy, which forwards the request to a replica. The DNS-based strategy described here seems brittle:

  • You're dependent on the DNS server rotating the list of IP addresses it sends back to clients.
  • You're dependent on clients always taking the first IP in the list.
  • If you add or remove a server from your cluster, you have to propagate the change through DNS. It can take 24 - 48 hours for the majority of DNS caches to roll over and load your change (the book says so, and I have personal experience with this). If you have four servers and one crashes suddenly, a quarter of requests fail for the next 24 hours.
  • Since the load distribution mechanism lives outside your system, you don't have much to go on if one of your servers is getting overloaded.
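
The crash scenario in the third bullet can be checked with a small simulation. This is a sketch under stated assumptions: `failed_fraction` is a hypothetical helper, the addresses are documentation-range placeholders, and stale caches are modelled simply by continuing to hand out all four addresses after one server has died. The `try_next` flag models a client that falls through to the next address in the reply. That behaviour is an assumption added for comparison, not something the book describes.

```python
from collections import deque


def failed_fraction(addresses, down, n_requests, try_next=False):
    """Fraction of requests that fail while stale DNS data still lists
    every address, including the ones in `down`."""
    rotation = deque(addresses)
    failures = 0
    for _ in range(n_requests):
        reply = list(rotation)
        rotation.rotate(-1)  # round-robin ordering, as in the book
        # A first-address-only client looks at reply[0]; a client that
        # retries walks the whole reply until one address works.
        candidates = reply if try_next else reply[:1]
        if all(addr in down for addr in candidates):
            failures += 1
    return failures / n_requests


if __name__ == "__main__":
    servers = ["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"]
    crashed = {"192.0.2.3"}
    print(failed_fraction(servers, crashed, 400))
    print(failed_fraction(servers, crashed, 400, try_next=True))
```

Under these assumptions a first-address-only client fails a quarter of the time, matching the estimate above, while a client that retries the remaining addresses does not fail as long as one server is up.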

This edition came out in 2008, and a lot has changed on the internet since then. Is this DNS-based load distribution strategy outdated? Are there still reasons to use it?


Solution

Even with the more modern load-balancing techniques you mention in place, I think there can still be good reasons to use DNS load balancing. It is probably less useful for balancing load across the load-balancing infrastructure itself, but it may help with high availability: that infrastructure can itself suffer outages or need maintenance windows that imply downtime. DNS load balancing could reduce or eliminate the impact of such events.

For example, Google's App Engine infrastructure still uses and recommends DNS load balancing: the procedure for Mapping Custom Domains to an app makes the app appear under 4 different IP addresses.
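
If you want to see how many addresses a name is published under, a sketch like the following lists them using Python's standard `socket.getaddrinfo`. The helper name `addresses_for` and the hostname `www.example.com` are placeholders of mine, not anything taken from the App Engine documentation, and the result depends on your resolver and its caches.

```python
import socket


def addresses_for(hostname):
    """Return the distinct IP addresses the system resolver reports for
    `hostname`, in the order they were returned."""
    infos = socket.getaddrinfo(hostname, None, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    ordered = [info[4][0] for info in infos]
    return list(dict.fromkeys(ordered))  # drop duplicates, keep order


if __name__ == "__main__":
    # Placeholder hostname; substitute the custom domain you mapped.
    for address in addresses_for("www.example.com"):
        print(address)
```

A name set up for DNS load balancing should print several addresses here, and the order may differ from one run to the next.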

Licensed under: CC-BY-SA with attribution