How non-blocking web server works?

Question 1

Basically the way the non-blocking sockets I/O work is by using polling and the state machine. So your scheme for many connections would be something like that:

Create many sockets and make them nonblocking
Switch the state of them to "connect"
Initiate the connect operation on each of them
Poll all of them until some events fire up
Process the fired up events (connection established or connection failed)
Switch the state those established to "sending"
Prepare the Web request in a buffer
Poll "sending" sockets for WRITE operation
send the data for those who got the WRITE event set
For those which have all the data sent, switch the state to "receiving"
Poll "receiving" sockets for READ operation
For those which have the READ event set, perform read and process the read data according to the protocol
Repeat if the protocol is bidirectional, or close the socket if it is not

Of course, at each stage you need to handle errors, and that the state of each socket is different (one may be connecting while another may be already reading).

Regarding polling I have posted an article about how different polling methods work here: http://www.ulduzsoft.com/2014/01/select-poll-epoll-practical-difference-for-system-architects/ - I suggest you check it.

Question 2

To benefit from a non-blocking server, your code must also be non-blocking - you can't just run blocking code on a non-blocking server and expect better performance. For example, you must remove all calls to sleep() and replace them with non-blocking equivalents like IOLoop.add_timeout (which in turn involves restructuring your code to use callbacks or coroutines).

Question 3

How To Use Linux epoll with Python http://scotdoyle.com/python-epoll-howto.html may give you some points about this topic.