Question

I am looking for the Python equivalent of an awk script to split a file into 26 parts based on a flag in the record. This is because there are 26 different record types in one file, a hangover from hierarchical databases used by Burroughs in the 1970s. I expected to be able to open 26 files named f_A to f_Z instead of the traditional f1 and then stream out the records as I read them in without holding the whole lot in a buffer.

# Gawk original - split new valuation roll format into record types A-Z
# run gawk -F\| -f split.awk input_file
# creates A.raw, B.raw, .... Z.raw
# Oct 1995 
{ident = $8; 
file = ident".raw";
print $0 >> file}

So I thought I could make up a file handle and then call that with eval() or something to direct each record to the correct output.

for line in fileinput.input(src):
    parts = line.split('|')
    recType = parts[7]
    recFile = 'f_'+recType
    if not recType in openFiles:
        eval(recFile) = open(recType+".raw",'w') # how should this line be written?
    eval(recFile).write(line)
    # ....

I can get the name of the system file from f1.name, and I can evaluate a variable name to get the handle, e.g. eval("f_A"), but I cannot see how to open a file under a handle name that is not hardcoded.

Solution

eval should be avoided and, fortunately, is almost never needed. In this case, open(recType+".raw",'w') returns a file handle. You just need to associate that handle with recType, and this is what dictionaries are for.

In the code below, openFiles is a dictionary. Every time that we encounter a new recType, we open a file for it and save its filehandle in openFiles under the key recType. Whenever we want to write to that file again, we just ask the dictionary for the file handle. Thus:

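import fileinput

openFiles = {}
for line in fileinput.input(src):
    parts = line.split('|')
    recType = parts[7]  # eighth field, the same as $8 in the awk script
    if recType not in openFiles:
        # first record of this type: open its output file and keep the handle
        openFiles[recType] = open(recType + '.raw', 'w')
    openFiles[recType].write(line)
    # .... 

# once the input is exhausted, close every output file
for f in openFiles.values():
    f.close()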
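
If you want the output files closed even when the loop fails partway through, one option is to register each handle with contextlib.ExitStack from the standard library. The following is only a sketch of that idea, not part of the original answer: the function name split_by_record_type and its parameters out_dir, sep and field_index are made up for illustration, and it opens files in append mode on the assumption that you want the same behaviour as the >> in the gawk script.

```python
import contextlib
import os


def split_by_record_type(src, out_dir=".", sep="|", field_index=7):
    """Stream records from src into <out_dir>/<type>.raw, one file per type.

    Hypothetical helper: returns a dict mapping each record type to the
    number of records written for it.
    """
    handles = {}  # record type -> open file handle
    counts = {}   # record type -> number of records written
    with contextlib.ExitStack() as stack, open(src) as infile:
        for line in infile:
            # strip the newline first in case the flag is the last field
            rec_type = line.rstrip("\n").split(sep)[field_index]
            if rec_type not in handles:
                path = os.path.join(out_dir, rec_type + ".raw")
                # "a" appends like awk's >>; use "w" to start fresh each run
                handles[rec_type] = stack.enter_context(open(path, "a"))
                counts[rec_type] = 0
            handles[rec_type].write(line)
            counts[rec_type] += 1
    # leaving the with block closes the input and every registered output
    return counts
```

Calling split_by_record_type("input_file") would then write A.raw, B.raw and so on into the current directory, still holding only one line in memory at a time. A record with fewer than eight fields raises IndexError in this sketch, so you may want to handle that case yourself.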
Licensed under: CC-BY-SA with attribution