Question

When only one process can listen on and accept connections for a given IP address and port combination, how do web servers scale to handle millions of incoming requests? This question is about the outline of socket programming for scaling, so I am not looking for load balancing or hardware scaling solutions, nor for WebSocket implementations, but for basic low-level socket programming.

If I have to write a web server that listens for incoming HTTP requests, what should the design of the listening thread look like so that it scales to millions of incoming requests?

Solution 2

If you want millions of concurrent connections on a single machine, you are unlikely to achieve that with a conventional design; the practical limit is usually in the thousands. If, on the other hand, you want millions of requests served per second, then you may just be able to get away with it, but only if your request processing is of the most trivial kind and the response is very short.

Generally, if you are doing this in Java, you should not use the standard blocking java.io API, because it ties up at least one thread per connection (two if reading and writing must proceed independently). Instead, use the Netty library, which supports asynchronous TCP communication, so a thread pool of modest size can handle a very large number of requests: the threads never sit idle waiting for input to arrive or for output to be sent. They are busy only while processing data.
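
To make the non-blocking idea concrete, here is a minimal sketch that uses plain java.nio (a Selector) rather than Netty, purely to keep the example free of dependencies; it is not Netty's API. One thread multiplexes the listening socket and every accepted connection. The class name SelectorEchoServer, the echo behaviour, and port 8080 are assumptions made for illustration, and a real server would also register for OP_WRITE instead of looping on write.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

// Hypothetical example class: a single-threaded, non-blocking echo server.
class SelectorEchoServer implements Runnable, AutoCloseable {
    private final Selector selector;
    private final ServerSocketChannel server;

    SelectorEchoServer(int port) throws IOException {
        selector = Selector.open();
        server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", port));
        server.configureBlocking(false);
        // Ask the selector to report when a new connection is ready to accept.
        server.register(selector, SelectionKey.OP_ACCEPT);
    }

    int port() {
        return server.socket().getLocalPort();
    }

    @Override
    public void run() {
        ByteBuffer buffer = ByteBuffer.allocate(4096);
        try {
            while (selector.isOpen()) {
                selector.select(); // blocks until at least one channel is ready
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (!key.isValid()) {
                        continue;
                    }
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        if (client != null) {
                            client.configureBlocking(false);
                            client.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        echo(key, buffer);
                    }
                }
            }
        } catch (IOException | ClosedSelectorException e) {
            // Selector was closed or failed: leave the loop.
        }
    }

    private void echo(SelectionKey key, ByteBuffer buffer) throws IOException {
        SocketChannel client = (SocketChannel) key.channel();
        buffer.clear();
        int read;
        try {
            read = client.read(buffer);
        } catch (IOException e) {
            read = -1; // treat a reset connection like end of stream
        }
        if (read < 0) {
            key.cancel();
            client.close();
            return;
        }
        buffer.flip();
        // Simplification: spins if the peer is slow; real code would use OP_WRITE.
        while (buffer.hasRemaining()) {
            client.write(buffer);
        }
    }

    @Override
    public void close() throws IOException {
        selector.close(); // wakes up a blocked select()
        server.close();
    }

    public static void main(String[] args) throws IOException {
        new SelectorEchoServer(8080).run(); // port chosen arbitrarily
    }
}
```

The point of the design is that no thread is parked per connection: the same thread only touches a connection when the operating system reports that it has data.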

OTHER TIPS

I've written a little chat server for myself (which basically does the same kind of thing).

The design I went for was a first thread that accepts connections (basically a while loop that calls accept() on a ServerSocket instance). That thread is the one listening for connections.

For every connection returned by accept(), I started a separate thread that handles and reads from that connection. References to all those threads were stored in a synchronized List for further handling, such as closing the connection later.
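
Roughly, that design could look like the sketch below. It is a reconstruction under assumptions, not the original chat server: the class name ThreadPerConnectionServer and the line-echo handler are made up for illustration, and the synchronized list here holds the sockets rather than the thread references, simply because that makes closing them easier to show.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical example class: one acceptor thread plus one thread per connection.
class ThreadPerConnectionServer implements AutoCloseable {
    private final ServerSocket listener;
    // Assumption: track sockets (not threads) so they can be closed later.
    private final List<Socket> connections =
            Collections.synchronizedList(new ArrayList<>());

    ThreadPerConnectionServer(int port) throws IOException {
        listener = new ServerSocket(port);
    }

    int port() {
        return listener.getLocalPort();
    }

    void start() {
        Thread acceptor = new Thread(this::acceptLoop, "acceptor");
        acceptor.setDaemon(true);
        acceptor.start();
    }

    // The "first thread": loops on accept() and hands each connection off.
    private void acceptLoop() {
        while (!listener.isClosed()) {
            try {
                Socket socket = listener.accept();
                connections.add(socket);
                Thread handler = new Thread(() -> handle(socket));
                handler.setDaemon(true);
                handler.start();
            } catch (IOException e) {
                if (listener.isClosed()) {
                    break; // accept() fails once the listener is closed
                }
            }
        }
    }

    // Runs on its own thread; blocks on readLine() while the client is idle.
    private void handle(Socket socket) {
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                     socket.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(new OutputStreamWriter(
                     socket.getOutputStream(), StandardCharsets.UTF_8), true)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.println(line); // placeholder for real message handling
            }
        } catch (IOException e) {
            // Peer disconnected or the server is shutting down.
        } finally {
            connections.remove(socket);
            try {
                socket.close();
            } catch (IOException ignored) {
                // nothing useful to do here
            }
        }
    }

    @Override
    public void close() throws IOException {
        listener.close();
        synchronized (connections) { // required when iterating a synchronized list
            for (Socket socket : connections) {
                socket.close();
            }
        }
    }
}
```

This is simple to reason about, but every idle client still costs a parked thread, which is the cost the non-blocking approach avoids.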

You are conflating 'listen to' and 'read from'. They aren't the same thing. 'Listen to' means that the server creates a listening socket from which connections can be accepted, in large numbers. Each of those connections can then be read from and written to.
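
A small experiment can illustrate the distinction. In the sketch below (the class and method names are hypothetical), one listening socket is bound to an ephemeral port, several local clients connect, and each call to accept() returns a separate connected socket. All of those sockets report the same local port as the listener; they are told apart by the remote address and port.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class: one listening socket, many accepted connections.
class ListenVersusAccept {

    // Returns the local port reported by each accepted connection.
    static List<Integer> acceptedLocalPorts(int clients) throws IOException {
        List<Integer> ports = new ArrayList<>();
        List<Socket> open = new ArrayList<>();
        // Port 0 asks the OS for any free port; only this socket "listens".
        try (ServerSocket listener = new ServerSocket(0)) {
            for (int i = 0; i < clients; i++) {
                // Client side of the connection.
                open.add(new Socket("127.0.0.1", listener.getLocalPort()));
                // Server side: a new socket object for this one connection.
                Socket accepted = listener.accept();
                open.add(accepted);
                ports.add(accepted.getLocalPort());
            }
        } finally {
            for (Socket socket : open) {
                socket.close();
            }
        }
        return ports;
    }

    public static void main(String[] args) throws IOException {
        // Every entry printed here is the listener's port.
        System.out.println(acceptedLocalPorts(3));
    }
}
```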

Licensed under: CC-BY-SA with attribution