What's wrong with Indy? It ships with Delphi, and its TIdTCPServer component does everything you are asking for. It accepts new connections in a separate worker thread per listening port, so the main thread never blocks waiting for clients. Each accepted client then runs in its own worker thread. Those client threads can optionally be pooled. Despite what you think, a pool does not have to limit how many connections you can accept, only how many idle threads are kept around at any given moment waiting to be reused.
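To make that concrete, here is a minimal console sketch, assuming Indy 10 and its stock TIdSchedulerOfThreadPool scheduler. The port number (6000), the pool size (10), and the TEchoHandler class are placeholders I made up for illustration, so adjust them to your own setup. On newer Delphi versions you may need to write System.SysUtils, depending on your unit scope settings.

```delphi
program EchoServerSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils, IdContext, IdTCPServer, IdSchedulerOfThreadPool;

type
  // Hypothetical helper class: OnExecute needs a method of an object
  TEchoHandler = class
  public
    procedure Execute(AContext: TIdContext);
  end;

procedure TEchoHandler.Execute(AContext: TIdContext);
var
  Line: string;
begin
  // Runs in the client's own worker thread, never in the main thread.
  // Indy calls this event in a loop until the client disconnects.
  Line := AContext.Connection.IOHandler.ReadLn;
  AContext.Connection.IOHandler.WriteLn('echo: ' + Line);
end;

var
  Server: TIdTCPServer;
  Pool: TIdSchedulerOfThreadPool;
  Handler: TEchoHandler;
begin
  Handler := TEchoHandler.Create;
  Server := TIdTCPServer.Create(nil);
  try
    // Optional: reuse client threads instead of creating one per connection.
    // PoolSize is the number of idle threads kept for reuse (assumed value),
    // not a cap on how many clients can connect.
    Pool := TIdSchedulerOfThreadPool.Create(Server);
    Pool.PoolSize := 10;
    Server.Scheduler := Pool;

    Server.DefaultPort := 6000; // assumed port
    Server.OnExecute := Handler.Execute;
    Server.Active := True; // the listener thread starts accepting here

    Writeln('Listening on port 6000, press Enter to stop.');
    Readln; // the main thread stays free to do whatever it wants
    Server.Active := False;
  finally
    Server.Free; // also frees Pool, which is owned by Server
    Handler.Free;
  end;
end.
```

If you do want a hard cap on the number of simultaneous clients, that is a separate setting (the server's MaxConnections property), not the pool size.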
If you are having speed issues with it, feel free to report them to Indy's developers. I suspect, though, that the speed issues have more to do with how you are using Indy than with Indy itself.