Question

I have some scripts that ought to have stopped running but hang around for ever.

Is there some way I can figure out what they're writing to stdout and stderr in a readable way ?

I tried, for example, to do

tail -f /proc/(pid)/fd/1

but that doesn't really work. It was a long shot anyway.

Any other ideas ? strace on its own is quite verbose and unreadable for seeing this.

Note: I am only interested in their output, not in anything else. I'm capable of figuring out the other things on my own; this question is only focused on getting access to stdout and stderr of the running process after starting it.

Was it helpful?

Solution

I'm not sure if it will work for you, but I read a page a while back describing a method that uses gdb

OTHER TIPS

Since I'm not allowed to edit Jauco's answer, I'll give the full answer that worked for me (Russell's page relies on unguaranteed behaviour that, if you close fd 1 for stdout, the next creat call will open fd 1.

So, run a simple endless script like this:

import time

while True:
    print 'test'
    time.sleep(1)

Save it to test.py, run with

python test.py

Get the pid:

ps auxw | grep test.py

Now, attach gdb:

gdb -p (pid)

and do the fd magic:

(gdb) call creat("/tmp/stdout", 0600)
$1 = 3
(gdb) call dup2(3, 1)
$2 = 1

Now you can tail /tmp/stdout and see the output that used to go to stdout.

There's several new utilities that wrap up the "gdb method" and add some extra touches. The one I use now is called "reptyr" ("Re-PTY-er"). In addition to grabbing STDERR/STDOUT, it will actually change the controlling terminal of a process (even if it wasn't previously attached to a terminal).

The best use of this is to start up a screen session, and use it to reattach a running process to the terminal within screen so you can safely detach from it and come back later.

It's packaged on popular distros (Ex: 'apt-get install reptyr').

http://onethingwell.org/post/2924103615/reptyr

Gdb method seems better, but you can do this with strace, too:

strace -p -e write=1 -s 1024 -o file

   -e write=set
               Perform a full hexadecimal and ASCII dump of all the
               data written to file descriptors listed in the spec-
               ified  set.  For example, to see all output activity
               on file descriptors 3 and 5 use -e write=3,5.   Note
               that  this is independent from the normal tracing of
               the write(2) system call which is controlled by  the
               option -e trace=write.

This prints out somewhat more thanyou need (the hexadecimal part), but you can sed that out easily.

I used strace and de-coded the hex output to clear text:

PID=some_process_id
sudo strace -f -e trace=write -e verbose=none -e write=1,2 -q -p $PID -o "| grep '^ |' | cut -c11-60 | sed -e 's/ //g' | xxd -r -p"

I combined this command from other answers.

strace outputs a lot less with just -ewrite (and not the =1 suffix). And it's a bit simpler than the gdb method, imo.

I used it to see the progress of an existing MythTV encoding job (sudo because I don't own the encoding process):

$ ps -aef | grep -i handbrake
mythtv   25089 25085 99 16:01 ?        00:53:43 /usr/bin/HandBrakeCLI -i /var/lib/mythtv/recordings/1061_20111230122900.mpg -o /var/lib/mythtv/recordings/1061_20111230122900.mp4 -e x264 -b 1500 -E faac -B 256 -R 48 -w 720
jward    25293 20229  0 16:30 pts/1    00:00:00 grep --color=auto -i handbr

$ sudo strace -ewrite -p 25089
Process 25089 attached - interrupt to quit
write(1, "\rEncoding: task 1 of 1, 70.75 % "..., 73) = 73
write(1, "\rEncoding: task 1 of 1, 70.76 % "..., 73) = 73
write(1, "\rEncoding: task 1 of 1, 70.77 % "..., 73) = 73
write(1, "\rEncoding: task 1 of 1, 70.78 % "..., 73) = 73^C

You can use reredirect (https://github.com/jerome-pouiller/reredirect/).

Type

reredirect -m FILE PID

and outputs (standard and error) will be written in FILE.

reredirect README also explains how to restore original state of process, how to redirect to another command or to redirect only stdout or stderr.

You don't state your operating system, but I'm going to take a stab and say "Linux".

Seeing what is being written to stderr and stdout is probably not going to help. If it is useful, you could use tee(1) before you start the script to take a copy of stderr and stdout.

You can use ps(1) to look for wchan. This tells you what the process is waiting for. If you look at the strace output, you can ignore the bulk of the output and identify the last (blocked) system call. If it is an operation on a file handle, you can go backwards in the output and identify the underlying object (file, socket, pipe, etc.) From there the answer is likely to be clear.

You can also send the process a signal that causes it to dump core, and then use the debugger and the core file to get a stack trace.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top