How can Python threads be programmed so that the user can distinguish between them using monitoring tools available on Linux?

StackOverflow https://stackoverflow.com/questions/22679400

Question

For example, I can name threads easily for reference within the python program:

#!/usr/bin/python
import time
import threading

class threadly(threading.Thread):
    def __init__(self, name):
        threading.Thread.__init__(self)
        self.name = name
    def run(self):

        while True:
            time.sleep(4)
            print "I am", self.name, "and I am barely awake."

slowthread=threadly("slowthread")
slowthread.start()

anotherthread=threadly("anotherthread")
anotherthread.start()

while True:

    time.sleep(2)

    print "I will never stop running"
    print "Threading enumerate:", threading.enumerate()
    print "Threading active_count:", threading.active_count()
    print

And the output looks like this:

I am slowthread and I am barely awake.
I am anotherthread and I am barely awake.
I will never stop running
Threading enumerate: [<_MainThread(MainThread, started 140121216169728)>, <threadly(slowthread, started 140121107244800)>, <threadly(anotherthread, started 140121026328320)>]
Threading active_count: 3

I will never stop running
Threading enumerate: [<_MainThread(MainThread, started 140121216169728)>, <threadly(slowthread, started 140121107244800)>, <threadly(anotherthread, started 140121026328320)>]
Threading active_count: 3

I can find the PID this way:

$ ps aux | grep test
    557      12519  0.0  0.0 141852  3732 pts/1    S+   03:59   0:01 vim test.py
    557      13974  0.0  0.0 275356  6240 pts/2    Sl+  05:36   0:00 /usr/bin/python ./test.py
    root     13987  0.0  0.0 103248   852 pts/3    S+   05:39   0:00 grep test

I can then invoke top:

# top -p 13974

Pressing 'H' turns on the thread display, and every thread is listed under the name of the command (that is, of the main thread):

top - 05:37:08 up 5 days,  4:03,  4 users,  load average: 0.02, 0.03, 0.00
Tasks:   3 total,   0 running,   3 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.8%us,  2.7%sy,  0.0%ni, 95.3%id,  0.0%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:  32812280k total, 27717980k used,  5094300k free,   212884k buffers
Swap: 16474104k total,     4784k used, 16469320k free, 26008752k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
13974 justin.h  20   0  268m 6240 1740 S  0.3  0.0   0:00.03 test.py
13975 justin.h  20   0  268m 6240 1740 S  0.0  0.0   0:00.00 test.py
13976 justin.h  20   0  268m 6240 1740 S  0.0  0.0   0:00.00 test.py

Contrast this with software like rsyslog which does name its threads:

# ps aux | grep rsyslog
root      2763  0.0  0.0 255428  1672 ?        Sl   Mar22   6:53 /sbin/rsyslogd -i /var/run/syslogd.pid -c 5
root      2774 47.7  0.0 265424  6276 ?        Sl   Mar22 3554:26 /sbin/rsyslogd -i /var/run/syslogd-01.pid -c5 -f /etc/rsyslog-01.conf
root      2785  2.7  0.0 263408  3596 ?        Sl   Mar22 207:46 /sbin/rsyslogd -i /var/run/syslogd-02.pid -c5 -f /etc/rsyslog-02.conf
root      2797  1.7  0.0 263404  3528 ?        Sl   Mar22 131:39 /sbin/rsyslogd -i /var/run/syslogd-03.pid -c5 -f /etc/rsyslog-03.conf
root      2808 24.3  0.0 265560  3352 ?        Sl   Mar22 1812:25 /sbin/rsyslogd -i /var/run/syslogd-04.pid -c5 -f /etc/rsyslog-04.conf
root      2819  1.3  0.0 263408  1596 ?        Sl   Mar22 103:42 /sbin/rsyslogd -i /var/run/syslogd-05.pid -c5 -f /etc/rsyslog-05.conf
root      2830  0.0  0.0 263404  1408 ?        Sl   Mar22   0:17 /sbin/rsyslogd -i /var/run/syslogd-06.pid -c5 -f /etc/rsyslog-06.conf
root     13994  0.0  0.0 103248   852 pts/3    S+   05:40   0:00 grep rsyslog

Let's pick '2774' because it looks busy:

$ top -p 2774

Pressing 'H' reveals a descriptively named thread, which shows me that the thread dedicated to the 'main' ruleset and the Reg Queue is consuming 55.6% CPU:

top - 05:50:52 up 5 days,  4:17,  4 users,  load average: 0.00, 0.00, 0.00
Tasks:   4 total,   1 running,   3 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.7%us,  2.6%sy,  0.0%ni, 95.5%id,  0.0%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:  32812280k total, 29833152k used,  2979128k free,   214836k buffers
Swap: 16474104k total,     4784k used, 16469320k free, 28123448k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2775 root      20   0  259m 6020 1212 R 55.6  0.0   3152:40 rs:main Q:Reg
 2776 root      20   0  259m 6020 1212 S  7.0  0.0 407:57.94 rsyslogd
 2774 root      20   0  259m 6020 1212 S  0.0  0.0   0:00.00 rsyslogd
 2777 root      20   0  259m 6020 1212 S  0.0  0.0   0:00.00 rsyslogd

Another way to see the names is:

$ grep Name /proc/2775/task/*/status
/proc/2775/task/2774/status:Name:       rsyslogd
/proc/2775/task/2775/status:Name:       rs:main Q:Reg
/proc/2775/task/2776/status:Name:       rsyslogd
/proc/2775/task/2777/status:Name:       rsyslogd
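
The same lookup can be done from a script. The following is only a minimal sketch and not part of the original post: `thread_names` is a hypothetical helper name, and it assumes a Linux `/proc` filesystem where each thread has a `status` file containing a `Name:` line, as in the grep output above.

```python
import glob


def thread_names(pid="self"):
    """Return {tid: name} for every thread of a process, read from /proc."""
    names = {}
    for status_path in glob.glob("/proc/%s/task/*/status" % pid):
        # Path layout: /proc/<pid>/task/<tid>/status
        tid = int(status_path.split("/")[4])
        try:
            with open(status_path) as status_file:
                for line in status_file:
                    if line.startswith("Name:"):
                        names[tid] = line.split(":", 1)[1].strip()
                        break
        except IOError:
            # The thread exited between listing and reading; skip it.
            continue
    return names
```

Calling `thread_names(2774)` on the rsyslog example would be expected to return the same four names that grep prints, keyed by thread ID; with no argument it inspects the calling process.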

So to restate my question:

How can Python threads be programmed so that the user can distinguish between them using monitoring tools available on Linux?

In my example, I've tried to accomplish this by naming the thread within Python. Perhaps there is a better way to expose individually identifiable threads to the OS?

Also, I would prefer a Pythonic, standard way of doing this that is part of the standard Python distribution (RHEL 6 / Python 2.6.7 specifically, although the version should not matter unless support arrived in a later release of Python). Contributed modules are good to know about, but unfortunately policy rules them out for my intended application for supportability reasons.


Solution

http://code.google.com/p/procname/

This appears to be what you are looking for:

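import time
from threading import Thread

import procname  # third-party module from the link above


class worker(Thread):
    def __init__(self, name):
        Thread.__init__(self)
        self.name = name
        self.alive = True
        self.start()

    def run(self):
        # Called from inside run(), i.e. in the new thread.
        procname.setprocname('My super name')
        while self.alive:
            time.sleep(1)  # Do work


x = worker('Worker')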
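
Since the question rules out contributed modules, a standard-library variant may be worth trying. The sketch below is an assumption on my part rather than anything from the answer: it calls the Linux `prctl(2)` system call with `PR_SET_NAME` through `ctypes`, which sets the name of the calling thread. `set_os_thread_name` and `NamedWorker` are hypothetical names chosen for illustration, and the code is Linux-only.

```python
import ctypes
import ctypes.util
import threading

PR_SET_NAME = 15  # option number from <linux/prctl.h>

# Load the C library; use_errno lets us report why a call failed.
_libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)


def set_os_thread_name(name):
    """Set the kernel-visible name of the calling thread (Linux only)."""
    # The kernel stores at most 15 bytes plus a terminating NUL.
    buf = ctypes.create_string_buffer(name.encode("ascii", "replace")[:15])
    if _libc.prctl(PR_SET_NAME, buf, 0, 0, 0) != 0:
        raise OSError(ctypes.get_errno(), "prctl(PR_SET_NAME) failed")


class NamedWorker(threading.Thread):
    """Hypothetical worker that publishes its Python name to the OS."""

    def __init__(self, name):
        threading.Thread.__init__(self)
        self.name = name
        self.daemon = True
        self.stop_requested = threading.Event()

    def run(self):
        # prctl acts on the calling thread, so it must run here, not in __init__.
        set_os_thread_name(self.name)
        while not self.stop_requested.is_set():
            self.stop_requested.wait(4)  # placeholder for real work
```

After `NamedWorker("slowthread").start()`, pressing 'H' in `top -p <pid>` or grepping `Name` in `/proc/<pid>/task/*/status` should list a thread called `slowthread`. Names longer than 15 bytes are truncated, so short, distinctive names work best.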
Licensed under: CC-BY-SA with attribution