Question

I've been wondering for a while: how do websites like Facebook write their code so that it can run on multiple servers?

How can the code take into account that several servers will be running the same code, and actually benefit from adding more?

Or does the web server perhaps deal with this regardless of the code?

Solution

By sharing and networking. The code 'should' be the same whether it runs on one server or several.

You can share data via databases, memory via things like Memcached, the load via a balancer, and so on. If you specialise servers the way Google does (some fetch URLs, some hold data, some crunch numbers, etc.), the hardware at hand can be utilised more effectively.
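As a concrete illustration of sharing memory across servers, here is a minimal cache-aside sketch in Python using the pymemcache client. It assumes a memcached instance on localhost:11211, and `load_user_from_db` is a hypothetical stand-in for a real database query:

```python
# Minimal cache-aside sketch: every app server running this code talks to
# the same memcached instance, so a value computed by one server is
# reused by the others. Assumes memcached on localhost:11211.
import json
from pymemcache.client.base import Client

cache = Client(("localhost", 11211))

def load_user_from_db(user_id):
    # Hypothetical stand-in for a real database query.
    return {"id": user_id, "name": f"user-{user_id}"}

def get_user(user_id):
    key = f"user:{user_id}"
    cached = cache.get(key)                 # shared across all app servers
    if cached is not None:
        return json.loads(cached)
    user = load_user_from_db(user_id)       # cache miss: hit the database
    cache.set(key, json.dumps(user).encode(), expire=60)
    return user
```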

The code can use dispatch logic (normally abstracted behind an API) so that it works the same whether there is one server or millions of them.
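A sketch of what that abstraction can look like, with plain Python dicts standing in for real servers (the class and the node choice here are illustrative, not any particular library's API):

```python
# The dispatch logic lives behind a small get/set API, so callers never
# learn whether one node or thousands sit behind it.
import hashlib

class Cluster:
    def __init__(self, nodes):
        self.nodes = nodes                        # one node or thousands

    def _node_for(self, key):
        h = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return self.nodes[h % len(self.nodes)]    # stable key -> node mapping

    def set(self, key, value):
        self._node_for(key)[key] = value

    def get(self, key):
        return self._node_for(key).get(key)

single = Cluster([{}])                    # one "server"
farm = Cluster([{} for _ in range(8)])    # same calling code, more capacity
farm.set("user:42", "Alice")
print(farm.get("user:42"))                # -> Alice
```

The calling code never changes when capacity grows; only the node list handed to the abstraction does.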

IPC (inter-process communication) can be network-enabled, allowing a 'tighter' bonding of services. Google even has an open-source Protocol Buffers project to help with this.
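The essence of that kind of IPC is an agreed wire format. Here is a minimal sketch of length-prefixed message framing over a socket, using JSON purely for brevity where Protocol Buffers would give you typed, versioned schemas:

```python
# Length-prefixed framing: a 4-byte big-endian length, then the body.
# JSON is used here only to keep the sketch short.
import json
import socket
import struct

def _recv_exact(sock: socket.socket, n: int) -> bytes:
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf

def send_msg(sock: socket.socket, payload: dict) -> None:
    body = json.dumps(payload).encode()
    sock.sendall(struct.pack("!I", len(body)) + body)   # 4-byte length prefix

def recv_msg(sock: socket.socket) -> dict:
    (length,) = struct.unpack("!I", _recv_exact(sock, 4))
    return json.loads(_recv_exact(sock, length))
```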

Basically, servers have to share to get any real benefit (beyond failover/backup), and the code needs a level of abstraction to support that sharing. The distribution itself typically uses round-robin or map/reduce style logic.
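Round-robin is simple enough to sketch in a few lines; the backend names here are made up:

```python
# Round-robin dispatch: each call hands out the next server in turn,
# spreading work evenly across identical backends.
import itertools

backends = itertools.cycle(["app1:8000", "app2:8000", "app3:8000"])

def next_backend() -> str:
    return next(backends)

for _ in range(5):
    print(next_backend())   # app1, app2, app3, app1, app2
```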

OTHER TIPS

The underlying architectural pattern is the "shared-nothing architecture". The idea is to build the most heavily used parts of the architecture so that they can be distributed, and so that distributed peers do not need to know anything about each other and therefore never need to communicate among themselves. That way the system can be scaled simply by adding more peers.

Usually that requires some sort of traffic routing (load balancing) to feed the shared-nothing components, plus some persistence and/or state synchronization.

The "classical" architecture for this is one or more load balamcers distributing traffic to several "shared-nothing" application severs which run against a common database. Typically the appication server hardware is rather cheap and the database hardware is one or two big irons depending on load.

These days, more and more solutions also chop the database into pieces in order to scale it. Eventually that leads to distributed, sharded databases, where several db nodes exist and each node contains only a subset of the data.
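A minimal sketch of hash-based sharding, with placeholder node names; each key maps to exactly one node, so every node holds only a slice of the data:

```python
# Hash-based sharding: the key's hash picks the database node, so the
# data set is split evenly across nodes. Node names are placeholders.
import hashlib

SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]

def shard_for(key: str) -> str:
    h = int(hashlib.sha1(key.encode()).hexdigest(), 16)
    return SHARDS[h % len(SHARDS)]

print(shard_for("user:42"))   # the same key always maps to the same shard
```

Note that naive modulo sharding like this makes adding a node expensive, since most keys would move; that is why real systems often prefer consistent hashing or a lookup directory.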

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow