Question

I have very little experience working with sockets and multithreaded programming, so to learn more I decided to see if I could hack together a little Python socket server to power a chat room. I got it working pretty well, but then I noticed that my server's CPU usage spiked to over 100% when I had it running in the background.

Here is my code in full: http://gist.github.com/332132

I know this is a pretty open-ended question, so besides help with my code, are there any good articles I could read to learn more about this?

My full code:

import select 
import socket 
import sys
import threading 
from daemon import Daemon

class Server:
    def __init__(self):
        self.host = ''
        self.port = 9998
        self.backlog = 5
        self.size = 1024
        self.server = None
        self.threads = []
        self.send_count = 0

    def open_socket(self):
        try:
            self.server = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
            self.server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            self.server.bind((self.host,self.port))
            self.server.listen(5)
            print "Server Started..."
        except socket.error, (value,message):
            if self.server:
                self.server.close()
            print "Could not open socket: " + message
            sys.exit(1)

    def remove_thread(self, t):
        t.join()

    def send_to_children(self, msg):
        self.send_count = 0
        for t in self.threads:
            t.send_msg(msg)
        print 'Sent to '+str(self.send_count)+" of "+str(len(self.threads))

    def run(self):
        self.open_socket()
        input = [self.server,sys.stdin]
        running = 1
        while running:
            inputready,outputready,exceptready = select.select(input,[],[])

            for s in inputready:
                if s == self.server:
                    # handle the server socket
                    c = Client(self.server.accept(), self)
                    c.start()
                    self.threads.append(c)
                    print "Num of clients: "+str(len(self.threads))

        self.server.close()
        for c in self.threads:
            c.join()

class Client(threading.Thread):
    def __init__(self,(client,address), server):
        threading.Thread.__init__(self)
        self.client = client
        self.address = address
        self.size = 1024
        self.server = server
        self.running = True

    def send_msg(self, msg):
        if self.running:
            self.client.send(msg)
            self.server.send_count += 1

    def run(self):
        while self.running:
            data = self.client.recv(self.size)
            if data:
                print data
                self.server.send_to_children(data)
            else:
                self.running = False
                self.server.threads.remove(self)
                self.client.close()

"""
Run Server
"""

class DaemonServer(Daemon):
    def run(self):
        s = Server()
        s.run()

if __name__ == "__main__":
    d = DaemonServer('/var/servers/fserver.pid')
    if len(sys.argv) == 2:
        if 'start' == sys.argv[1]:
            d.start()
        elif 'stop' == sys.argv[1]:
            d.stop()
        elif 'restart' == sys.argv[1]:
            d.restart()
        else:
            print "Unknown command"
            sys.exit(2)
        sys.exit(0)
    else:
        print "usage: %s start|stop|restart" % sys.argv[0]
        sys.exit(2)

Solution

There are several possible race conditions in your code, but they threaten correctness rather than performance: fixing them, e.g. by locking, would definitely not improve performance.
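To make the correctness point concrete: in the question's code, each Client thread removes itself from server.threads while another thread may be iterating over that same list in send_to_children. Here is a minimal sketch of guarding such a list with a lock, written for Python 3 (unlike the Python 2 code in the question). The Registry class and its method names are hypothetical, not part of the original code, and this only addresses correctness; it would not lower CPU usage.

```python
import threading


class Registry:
    """Hypothetical thread-safe holder for the connected clients."""

    def __init__(self):
        self._lock = threading.Lock()
        self._clients = []

    def add(self, client):
        with self._lock:
            self._clients.append(client)

    def remove(self, client):
        with self._lock:
            # Tolerate a second removal instead of raising ValueError.
            if client in self._clients:
                self._clients.remove(client)

    def snapshot(self):
        # Copy under the lock so a caller can iterate (and send) without
        # holding the lock while other threads add or remove clients.
        with self._lock:
            return list(self._clients)
```

A broadcast would then loop over registry.snapshot() rather than over the live list.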

Rather, I'd ask what good those threads are doing at all. The core of your code is already a select.select call, so why not build on that and write a fully asynchronous, and thus more efficient, server, instead of handing some tasks off to threads that are basically just overhead? Read when some input is ready (as you're doing), write when some socket is ready for output, etc. -- it's simpler and faster than the current mix of threads and async.
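A minimal sketch of that single-threaded shape, written for Python 3 rather than the Python 2 of the question. The SelectChatServer class, its outbox dictionary, and the loopback address with an OS-chosen port are all assumptions made for illustration; it relays every received chunk to all connected clients, as the original send_to_children does.

```python
import select
import socket


class SelectChatServer:
    """Hypothetical single-threaded chat relay built on select.select."""

    def __init__(self, host="127.0.0.1", port=0):
        self.listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        self.listener.bind((host, port))
        self.listener.listen(5)
        self.listener.setblocking(False)
        self.outbox = {}  # client socket -> bytes still waiting to be sent

    def step(self, timeout=None):
        """Run one pass of the loop: wait for events, then handle them."""
        readers = [self.listener] + list(self.outbox)
        # Ask about writability only for sockets with pending output; an
        # idle socket is almost always writable and would keep waking us.
        writers = [s for s, buf in self.outbox.items() if buf]
        readable, writable, _ = select.select(readers, writers, [], timeout)

        for sock in readable:
            if sock is self.listener:
                client, _addr = self.listener.accept()
                client.setblocking(False)
                self.outbox[client] = b""
            else:
                self._read(sock)

        for sock in writable:
            if sock in self.outbox:  # may have been dropped while reading
                self._write(sock)

    def _read(self, sock):
        try:
            data = sock.recv(1024)
        except BlockingIOError:
            return
        except OSError:
            data = b""
        if not data:  # empty read means the peer closed the connection
            self._drop(sock)
            return
        for peer in self.outbox:
            self.outbox[peer] += data

    def _write(self, sock):
        try:
            sent = sock.send(self.outbox[sock])
        except BlockingIOError:
            return
        except OSError:
            self._drop(sock)
            return
        self.outbox[sock] = self.outbox[sock][sent:]

    def _drop(self, sock):
        self.outbox.pop(sock, None)
        sock.close()

    def close(self):
        for sock in list(self.outbox):
            self._drop(sock)
        self.listener.close()
```

A real server would call step() in a `while True:` loop. The sketch watches only sockets: a descriptor that is always readable but never read (such as a stdin redirected to /dev/null) would make select return immediately on every pass and keep the loop spinning.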

Programming async servers directly on top of select.select is quite a low-level approach; it's instructive, but not really suitable for production. Consider using the asyncore and asynchat modules of the Python standard library for a modestly higher abstraction level, or the third-party Twisted package for a much bigger boost. Twisted can implement the underlying "Reactor" design pattern by more efficient means than old select (there's poll, kqueue, etc., depending on what OS you're on) and lets you choose the implementation for your platform while keeping the same Reactor interface.
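asyncore and asynchat were removed from the standard library in Python 3.12, and Twisted is a separate install, so the sketch below swaps in a different tool to illustrate the same idea: Python 3's standard selectors module. selectors.DefaultSelector picks the most efficient mechanism available on the platform (epoll, kqueue, poll, or select) behind a single interface. MiniReactor, make_echo_server, and the callback names are hypothetical, and this toy echoes data back rather than running a chat room; it is not how Twisted itself is written.

```python
import selectors
import socket


class MiniReactor:
    """Hypothetical toy reactor: dispatches ready sockets to callbacks."""

    def __init__(self):
        # DefaultSelector resolves to the best implementation for this OS.
        self.selector = selectors.DefaultSelector()

    def add_reader(self, sock, callback):
        # The callback rides along as the key's data until the socket is ready.
        self.selector.register(sock, selectors.EVENT_READ, callback)

    def remove(self, sock):
        self.selector.unregister(sock)

    def run_once(self, timeout=None):
        for key, _events in self.selector.select(timeout):
            key.data(key.fileobj)


def make_echo_server(reactor, host="127.0.0.1", port=0):
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(5)
    listener.setblocking(False)

    def on_data(conn):
        try:
            data = conn.recv(1024)
        except OSError:
            data = b""
        if data:
            # Fine for a short demo; a real server would queue unsent
            # bytes and wait for EVENT_WRITE before sending the rest.
            conn.send(data)
        else:
            reactor.remove(conn)
            conn.close()

    def on_accept(sock):
        conn, _addr = sock.accept()
        conn.setblocking(False)
        reactor.add_reader(conn, on_data)

    reactor.add_reader(listener, on_accept)
    return listener
```

The application code only ever talks to add_reader and run_once, so the underlying polling mechanism can change per platform without touching the handlers.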

I think I cover these various possibilities decently, if concisely, in the "server-side sockets" chapter of Python in a Nutshell 2nd Ed -- which you can get for free online by getting a trial subscription to O'Reilly's "Safari Online" site, or (illegally;-) by finding and using one of the many pirate sites hosting pirate copies of books (assuming of course you don't want to spend money for it by getting it "all legal and proper";-). I think you can freely download a zipfile with all example code from O'Reilly's website, anyway.

Licensed under: CC-BY-SA with attribution