Consider the following Python script:

#test.py
import sys
inputfile=sys.argv[1]
with open(inputfile,'r') as f:
    for line in f.readlines():
        print line

with open(inputfile,'r') as f:
    for line in f.readlines():
        print line

Now I want to run test.py on the output of a process substitution, e.g.:

python test.py <( cat file | head -10)

It seems the second f.readlines() call returns an empty list. Why is that, and is there a way to read the input twice without having to specify two input files?


Solution

  • Why is that?
    • Process substitution works by creating a named pipe (or a /dev/fd entry, depending on the system). A pipe cannot be rewound, so all the data is consumed by the first open/read loop and nothing is left for the second one.
  • Is there a way to do it without having to specify two input files?
    • Yes: buffer the data in memory before using it.
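To see the first point in isolation, here is a minimal sketch, assuming Python 3. It does not reproduce what the shell does; it uses an anonymous pipe from os.pipe() as a stand-in for the pipe behind <( ... ), and the helper name read_twice_from_pipe is made up for this illustration.

```python
import os

def read_twice_from_pipe(data):
    """Write data into a pipe, then try to read all lines from it twice."""
    read_fd, write_fd = os.pipe()
    # Small inputs fit in the pipe's buffer, so writing first does not block.
    with os.fdopen(write_fd, 'w') as writer:
        writer.write(data)  # closing the writer signals end-of-file
    with os.fdopen(read_fd, 'r') as reader:
        first = reader.readlines()   # drains everything in the pipe
        second = reader.readlines()  # the pipe is empty and cannot be rewound
    return first, second

first, second = read_twice_from_pipe("a\nb\n")
print(first)
print(second)
```

The first list holds both lines and the second one is empty, which matches what the script in the question observes.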

Here is sample code that buffers the input:

import sys
import StringIO
inputfile=sys.argv[1]

buffer = StringIO.StringIO()

# buffering
with open(inputfile, 'r') as f:
    buffer.write(f.read())

# use it 
buffer.seek(0)
for line in buffer:
    print line

# use it again
buffer.seek(0)
for line in buffer:
    print line
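The sample above is Python 2. A rough Python 3 equivalent might look like the sketch below, where StringIO comes from the io module. The helper names buffer_input and read_passes are hypothetical, and a temporary file stands in for sys.argv[1] so the snippet runs on its own.

```python
import io
import os
import tempfile

def buffer_input(path):
    """Read the input once and return a seekable in-memory copy."""
    with open(path, 'r') as f:
        return io.StringIO(f.read())

def read_passes(path, passes=2):
    """Iterate over the buffered input several times, rewinding in between."""
    buf = buffer_input(path)
    result = []
    for _ in range(passes):
        buf.seek(0)               # rewind the in-memory buffer
        result.append(list(buf))  # one list of lines per pass
    return result

# Demo: a temporary file stands in for the path given on the command line.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("one\ntwo\n")
try:
    passes = read_passes(tmp.name)
finally:
    os.remove(tmp.name)
print(passes)
```

In the real script you would call buffer_input(sys.argv[1]) instead. The input itself is opened and read only once, so it should not matter whether the path is a regular file or a pipe.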

Other tips

readlines() reads all available lines from the input at once. That is why the second call returns nothing: the input is a pipe, so once it has been read there is nothing left. You can assign the result of readlines() to a local variable and use it as many times as you want:

import sys
inputfile=sys.argv[1]
with open(inputfile,'r') as f:
    lines = f.readlines()
    for line in lines:
        print line

    #use it again
    for line in lines:
        print line
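For Python 3 readers, the same idea can be sketched as follows. An io.StringIO object is used here only as a stand-in for the opened input, so this is an illustration rather than a drop-in replacement for the script.

```python
import io

stream = io.StringIO("a\nb\n")  # stand-in for the opened input
lines = stream.readlines()      # reads everything up to end-of-file
leftover = stream.readlines()   # the stream is exhausted now

# The stored list can be traversed as often as needed.
first_pass = [line.rstrip('\n') for line in lines]
second_pass = [line.rstrip('\n') for line in lines]

print(leftover)
print(first_pass)
print(second_pass)
```

The stream only yields its lines once, while the list kept in lines can be reused freely.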
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow