Question

I'm using import fileinput in a Python script running on an Ubuntu box.

I run the script from the command line with something like python myscript.py firstinputfile.txt secondinputfile.txt, and inside myscript.py I iterate over the lines with for line in fileinput.input(). The problem is that firstinputfile.txt and secondinputfile.txt both use Macintosh (\r) line endings, and fileinput.input() does not seem to recognize \r as a line delimiter.

Is there any way to force fileinput to recognize \r as a line delimiter?

I've considered preprocessing firstinputfile.txt and secondinputfile.txt to use \n line endings, but am hesitant for two reasons: i) I don't really want to emit additional files to manage and ii) I still want the input to fileinput to come from file arguments (not stdin after piping commands) so I can use fileinput.filename() and fileinput.filelineno().

Any suggestions?


Solution

It turns out fileinput.input() supports an optional openhook parameter:

You can control how files are opened by providing an opening hook via the openhook parameter to fileinput.input() or FileInput(). The hook must be a function that takes two arguments, filename and mode, and returns an accordingly opened file-like object. Two useful hooks are already provided by this module.

Furthermore, the universal newline support documentation says that a file can be opened so that it handles Windows, Unix, and Macintosh newlines by using the rU mode:

Opening a file with the mode 'U' or 'rU' will open a file for reading in universal newline mode. All three line ending conventions will be translated to a "\n" in the strings returned by the various file methods such as read() and readline().
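The quoted passage describes Python 2. As a rough sketch of the same behavior on Python 3, where the 'U' flag was removed in 3.11, the built-in open() translates all three conventions when newline=None, which is its default in text mode. The temporary file and its contents below are made up for illustration only.

```python
import os
import tempfile

# Write three lines separated by bare carriage returns (classic Mac style).
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"first\rsecond\rthird\r")
    path = tmp.name

try:
    # newline=None (the text-mode default) translates \r, \n and \r\n to \n.
    with open(path, "r", newline=None) as handle:
        lines = handle.readlines()
finally:
    os.remove(path)

print(lines)  # ['first\n', 'second\n', 'third\n']
```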

So, you can write a little function to pass as the openhook argument that will open the file in a manner which supports universal newlines:

def univ_file_read(name, mode):
    # WARNING: ignores mode argument passed to this function
    # NOTE: 'rU' is a Python 2 era mode; it was removed in Python 3.11
    return open(name, 'rU')

Then, instead of:

for line in fileinput.input():

Use:

for line in fileinput.input(openhook=univ_file_read):

This seems to have done the trick for me, and \r is being recognized as a line delimiter now.
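For readers on Python 3, here is a minimal sketch of the same idea, assuming a hook that passes newline=None instead of 'rU'. The helper name numbered_lines is hypothetical and exists only to show that fileinput.filename() and fileinput.filelineno() keep working with a custom hook. On Python 3 the default text mode should already apply universal newlines, so the hook mainly makes the intent explicit.

```python
import fileinput


def univ_file_read(name, mode):
    # Assumption: Python 3. newline=None enables universal newline translation.
    # Like the original hook, this ignores the mode argument.
    return open(name, "r", newline=None)


def numbered_lines(paths):
    """Return (filename, line number within that file, text) for each line.

    Hypothetical helper for illustration; paths must be a non-empty list,
    otherwise fileinput falls back to reading standard input.
    """
    rows = []
    with fileinput.input(files=paths, openhook=univ_file_read) as stream:
        for line in stream:
            rows.append(
                (fileinput.filename(), fileinput.filelineno(), line.rstrip("\n"))
            )
    return rows
```

In a script like the one in the question, this could be called as numbered_lines(sys.argv[1:]) after importing sys, so the input still comes from file arguments rather than stdin.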

License: CC-BY-SA with attribution
Not affiliated with StackOverflow