Question

I'm using subprocess to run a Unix process and then capture the output. Like so:

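import subprocess

command_process = subprocess.Popen(
    [command],
    shell=True,
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
)

# communicate() waits for the process to exit and returns all of its output
command_output = command_process.communicate()[0]

# Binary mode, because communicate() returns bytes on Python 3
with open('command.log', 'ab') as log_file:
    log_file.write(command_output)
    log_file.write(b'\n')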

I'm saving the output of the process to a variable, command_output, which gets dumped into the command.log file.

  • How does Python store data in variables?
    • Are they in a memory buffer, or are variable values written to disk?
  • Is it risky to do this if the output is VERY large?
  • What are safer alternatives?

Solution

  • Are variables in a memory buffer, or are variable values written to disk?

Variables are stored in memory (RAM), not on disk. On Linux you can check this by listing the files a Python process has open (the command below assumes only one python process is running):

pidof python | awk '{print "lsof -a -p "$1}' | bash

Sample output:

COMMAND   PID   USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
python  19858 jazzpi  cwd    DIR    8,1    12288 135854 /home/jazzpi
python  19858 jazzpi  rtd    DIR    8,1     4096      2 /
python  19858 jazzpi  txt    REG    8,1  2728836 532918 /usr/bin/python2.7
python  19858 jazzpi  mem    REG    8,1   125424     57 /lib/i386-linux-gnu/libtinfo.so.5.9
python  19858 jazzpi  mem    REG    8,1   247008    237 /lib/i386-linux-gnu/libreadline.so.6.2
python  19858 jazzpi  mem    REG    8,1    21648 131133 /usr/lib/python2.7/lib-dynload/readline.so
python  19858 jazzpi  mem    REG    8,1  2965552 530650 /usr/lib/locale/locale-archive
python  19858 jazzpi  mem    REG    8,1   114788     13 /lib/i386-linux-gnu/libgcc_s.so.1
python  19858 jazzpi  mem    REG    8,1  1437864   4337 /lib/i386-linux-gnu/i686/cmov/libc-2.13.so
python  19858 jazzpi  mem    REG    8,1   148996   4334 /lib/i386-linux-gnu/i686/cmov/libm-2.13.so
python  19858 jazzpi  mem    REG    8,1    95896    129 /lib/i386-linux-gnu/libz.so.1.2.7
python  19858 jazzpi  mem    REG    8,1     9800   4326 /lib/i386-linux-gnu/i686/cmov/libutil-2.13.so
python  19858 jazzpi  mem    REG    8,1     9844   4330 /lib/i386-linux-gnu/i686/cmov/libdl-2.13.so
python  19858 jazzpi  mem    REG    8,1   117009   4327 /lib/i386-linux-gnu/i686/cmov/libpthread-2.13.so
python  19858 jazzpi  mem    REG    8,1    26064 523330 /usr/lib/i386-linux-gnu/gconv/gconv-modules.cache
python  19858 jazzpi  mem    REG    8,1   117960     35 /lib/i386-linux-gnu/ld-2.13.so
python  19858 jazzpi    0u   CHR  136,3      0t0      6 /dev/pts/3
python  19858 jazzpi    1u   CHR  136,3      0t0      6 /dev/pts/3
python  19858 jazzpi    2u   CHR  136,3      0t0      6 /dev/pts/3

As you can see, the process has no regular files open for writing. The Stack Overflow description of the variable tag says the same:

A variable is a named data storage location in memory.

However, when the system runs low on memory, the OS can move parts of RAM to disk (e.g. pagefile.sys on Windows, a swap partition on Linux). The contents of RAM are also written to disk when you put the machine into hibernation.

  • Is it risky to do this if the output is VERY large?

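Yes. communicate() buffers the entire output in memory before returning it, so very large output can exhaust your RAM. Falling back on swap is the OS's last resort and something you should try to avoid, since reading from disk is far slower than reading from RAM (also see Teach yourself programming in 10 years).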
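To make the memory cost concrete, here is a small sketch. The helper name capture_output and the sample command are made up for illustration, and it assumes a Unix system with head and /dev/zero available. It shows that what communicate() returns is an ordinary in-memory bytes object as large as the whole output:

```python
import subprocess
import sys


def capture_output(command):
    """Run a shell command and return all of its output as one bytes object."""
    process = subprocess.Popen(
        command,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    # communicate() keeps reading until the process exits, accumulating
    # everything in memory before returning it.
    return process.communicate()[0]


# Illustrative command that prints one million bytes.
output = capture_output("head -c 1000000 /dev/zero")
print(len(output))            # number of bytes held in memory
print(sys.getsizeof(output))  # size of the Python object, including overhead
```

Scale the byte count up and the process's memory use grows with it, which is the risk with unbounded output.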

  • What are safer alternatives?

So if the output may be large enough to exhaust your machine's RAM, consider writing it to a file on disk as it arrives (e.g. every 10 MB or so) instead of collecting it all in a variable first.
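One way to do this is to read the pipe in fixed-size chunks and append each chunk to the log as it arrives. The following is only a sketch: the function name stream_to_log and the 64 KiB default chunk size are my own choices, not anything subprocess prescribes.

```python
import subprocess


def stream_to_log(command, log_path, chunk_size=64 * 1024):
    """Run a shell command, appending its output to log_path chunk by chunk.

    Returns the command's exit code.
    """
    process = subprocess.Popen(
        command,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    with open(log_path, "ab") as log_file:
        while True:
            # read() returns at most chunk_size bytes, and b"" at end of output
            chunk = process.stdout.read(chunk_size)
            if not chunk:
                break
            log_file.write(chunk)
        log_file.write(b"\n")
    process.stdout.close()
    return process.wait()
```

Called as stream_to_log(command, 'command.log'), this never holds more than one chunk of output in memory at a time. If you don't need to inspect the output in Python at all, you can also pass the open log file itself as the stdout argument of Popen, so that the child process writes to the file directly.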

If you want to read from an output stream, I'd also recommend you have a look at this.

Licensed under: CC-BY-SA with attribution