Question

I realize that my question is a very simple one, but I can't find an explicit example of using stdin and stdout in a Python script.

I have a script, working perfectly well with command line arguments:

newlist = []
def f1():
  .... 
def f2(input_file):
  vol_id = sys.argv[3]
  for line in input_file:
      if ... :
        line = line.replace('abc','def')
        line = line.replace('id', 'id'+vol_id)
      ....
      newlist.append(line)
  return newlist

def main():
    if len(sys.argv) < 4:
       print 'usage: ./myscript.py [file_in... file_out... volume_id]'
       sys.exit(1)

    else:

        filename = sys.argv[1]
        filename_out = sys.argv[2]


        tree = etree.parse(filename)
        extract(tree)

        input_file = open(filename, 'rU')
        change_class(input_file)

        file_new = open(filename_out, 'w')
        for x in newlist:

            if '\n' in x:                   
               x = x.replace('\n', '')                
            print>>file_new, x

Now I need to use stdin and stdout instead of file arguments so that the script is usable in pipelines, for example with multiple files as input:

cat input1 input2 input3 | ./myscript.py

Or to process its output with some UNIX tools before writing it to a file. I tried replacing the arguments in my script with sys.stdin and sys.stdout:

filename = sys.stdin
filename_out = sys.stdout

Then I ran my script like this:

./myscript.py < inputfile > outputfile

It resulted in an empty outputfile, but didn't yield any error messages at all.

Could you please help me with this replacement?

P.S. Then I modified my main() like this:

filename = sys.argv[1]
filename_out = sys.argv[2]

if filename == '-':
   filename = sys.stdin
else:
    input_file = open(filename, 'rU')


if filename_out == '-':
    filename_out = sys.stdout
    file_new = filename_out
else:
    file_new = open(filename_out, 'w')


tree = etree.parse(filename)
extract(tree)

input_file = filename
change_class(input_file)

for x in newlist:

    if '\n' in x:                   
       x = x.replace('\n', '')                
    print>>file_new, x

I tried to run it from the command line like this:

./myscript.py - - volumeid < filein > fileout

But I still got an empty output file :(


Solution

The common placeholder for stdin or stdout is -:

./myscript.py - - volumeid

and:

if filename == '-':
    input_file = sys.stdin
else:
    input_file = open(filename, 'rU')

etc.
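
A minimal sketch of that pattern for both ends, written for Python 3 (the question's code is Python 2, so details such as the 'rU' mode and the print statement differ). The helper names open_input and open_output are hypothetical and not part of the original script:

```python
import sys


def open_input(name):
    """Return sys.stdin for '-', otherwise open the named file for reading."""
    if name == '-':
        return sys.stdin
    return open(name, 'r')


def open_output(name):
    """Return sys.stdout for '-', otherwise open the named file for writing."""
    if name == '-':
        # The caller should not close this stream when it is done.
        return sys.stdout
    return open(name, 'w')
```

With these, main() could call open_input(sys.argv[1]) and open_output(sys.argv[2]) and treat the results the same way whether they are real files or the standard streams.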

In addition, you could default filename and filename_out to - when there are fewer than 3 command-line arguments. Better still, consider a dedicated command-line argument parser such as argparse, which can handle these cases for you, including defaulting to stdin and stdout and accepting - as a placeholder.
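
One way this might look with argparse, as a sketch only. It assumes the argument order is changed so that volume_id comes first, because optional positional arguments have to follow the required one; the names parse_args, file_in and file_out are illustrative:

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; both file arguments default to '-'."""
    parser = argparse.ArgumentParser(
        description='Rewrite ids in a file or stream.')
    # Assumption: volume_id is moved to the front so the files can be omitted.
    parser.add_argument('volume_id', help='suffix appended to each id')
    parser.add_argument('file_in', nargs='?', default='-',
                        help="input file, or '-' for stdin (default)")
    parser.add_argument('file_out', nargs='?', default='-',
                        help="output file, or '-' for stdout (default)")
    # argv=None makes argparse fall back to sys.argv[1:].
    return parser.parse_args(argv)
```

Called as ./myscript.py volumeid < filein > fileout, both file arguments would come back as '-', and the script can then map '-' to sys.stdin and sys.stdout as shown above. A lone - on the command line is treated by argparse as a positional value, not as an option.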

As a side note, I'd not use print to write to a file; I'd just use:

file_new.write(x)

which removes the need to strip off the newlines as well.
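
To illustrate why no stripping is needed: lines read from a file or from stdin keep their trailing newline, and write() emits each string unchanged. A small Python 3 sketch, using io.StringIO as a stand-in for the output file and a hypothetical helper name write_lines:

```python
import io


def write_lines(lines, out):
    """Write each line as-is; write() adds nothing and strips nothing."""
    for x in lines:
        out.write(x)


# Lines as they would come from iterating over a file: newline included.
newlist = ['first\n', 'second\n']
out = io.StringIO()
write_lines(newlist, out)
result = out.getvalue()
```

Here result holds exactly the two original lines, with a single newline after each.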

You appear to read from the input file twice; once to parse the XML tree, once again to call change_class() with the open file object. What are you trying to do there? You'll have problems replicating that with sys.stdin as you cannot re-read the data from a stream the way you can from a file on disk.

You'd have to read all the data into memory first, then parse the XML from it, then read it again for change_class(). It'd be better if you used the parsed XML tree for this instead, if possible (i.e. read the file only once, then use the parsed structure from there on out).
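
If you do need both, reading everything into memory first could look roughly like this. It is a Python 3 sketch that uses the standard library's xml.etree.ElementTree as a stand-in, since the question does not show which etree module is imported; load_once is a hypothetical helper name:

```python
import xml.etree.ElementTree as ET


def load_once(stream):
    """Read the stream a single time; return (parsed tree, list of lines)."""
    data = stream.read()
    # Parse the XML from the in-memory string, not from the stream.
    tree = ET.ElementTree(ET.fromstring(data))
    # Keep line endings so the lines look like those from iterating a file.
    lines = data.splitlines(True)
    return tree, lines

# Possible use inside main(), with the functions from the question:
#     tree, lines = load_once(sys.stdin)
#     extract(tree)
#     change_class(lines)
```

Because change_class() only iterates over its argument, a list of lines should work in place of the open file object, though that depends on the parts of the function that are not shown.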

Licensed under: CC-BY-SA with attribution