reusable function: substituting the values returned by another function

https://stackoverflow.com/questions/5936387

30-10-2019
|

Question

Below is the snippet: I'm parsing job log and the output is the formatted result.

def job_history(f):

    def get_value(j,n):
        return j[n].split('=')[1]

    lines = read_file(f)
    for line in lines:
        if line.find('Exit_status=') != -1:
            nLine = line.split(';')
            jobID = '.'.join(nLine[2].split('.',2)[:-1]
            jData = nLine[3].split(' ')
            jUsr = get_value(jData,0)
            jHst = get_value(jData,9)
            jQue = get_value(jData,3)
            eDate = job_value(jData,14)

            global LJ,LU,LH,LQ,LE
            LJ = max(LJ, len(jobID))
            LU = max(LU, len(jUsr))
            LH = max(LH, len(jHst))
            LQ = max(LQ, len(jQue))
            LE = max(LE, len(eDate))

            print "%-14s%-12s%-14s%-12s%-10s" % (jobID,jUsr,eDate,jHst,jQue)

   return LJ,LU,LE,LH,LQ

In principle, I should have another function like this:

def fmt_print(a,b,c,d,e):
    print "%-14s%-12s%-14s%-12s%-10s\n" % (a,b,c,d,e)

to print the header and call the functions like this to print the complete result:

fmt_print('JOB ID','OWNER','E_DATE','R_HOST','QUEUE')
job_history(inFile)

My question is: how can I make fmt_print() to print both the header and the result using the values LJ,LU,LE,LH,LQ for the format spacing. the job_history() will parse a number of log files from the log-directory. The length of the field of similar type will differ from file to file and I don't wanna go static with the spacing (assuming the max length per field) for this as there gonna be lot more columns to print (than the example). Thanks in advance for your help. Cheers!!

PS. For those who know my posts: I don't have to use python v2.3 anymore. I can use even v2.6 but I want my code to be v2.4 compatible to go with RHEL5 default.

Update: 1

I had a fundamental problem in my original script. As I mentioned above that the job_history() will read the multiple files in a directory in a loop, the max_len were being calculated per file and not for the entire result. After modifying unutbu's script a little bit and following xtofl's (if this is what it meant) suggestion, I came up with this, which seems to be working.

def job_history(f):
    result=[]
    for line in lines:
        if line.find('Exit_status=') != -1:
            ....
            ....
            global LJ,LU,LH,LQ,LE
            LJ = max(LJ, len(jobID))
            LU = max(LU, len(jUsr))
            LH = max(LH, len(jHst))
            LQ = max(LQ, len(jQue))
            LE = max(LE, len(eDate))

            result.append((jobID,jUsr,eDate,jHst,jQue))

    return LJ,LU,LH,LQ,LE,result

# list of log files 
inFiles = [ m for m in os.listdir(logDir) ]

saved_ary = []
for inFile in sorted(inFiles):
    LJ,LU,LE,LH,LQ,result = job_history(inFile)
    saved_ary += result

# format printing
fmt_print = "%%-%ds  %%-%ds  %%-%ds  %%-%ds  %%-%ds" % (LJ,LU,LE,LH,LQ)
print_head = fmt_print % ('Job Id','User','End Date','Exec Host','Queue')
print '%s\n%s' % (print_head, len(print_head)*'-')

for lines in saved_ary:
    print fmt_print % lines

I'm sure there are lot other better ways of doing this, so suggestion(s) are welcomed. cheers!!

Update: 2

Sorry for brining up this "solved" post again. Later discovered, I was even wrong with my updated script, so I thought I'd post another update for future reference. Even though it appeared to be working, actually length_data were overwritten with the new one for every file in the loop. This works correctly now.

def job_history(f):

    def get_value(j,n):
        return j[n].split('=')[1]

    lines = read_file(f)
    for line in lines:
        if "Exit_status=" in line:
            nLine = line.split(';')
            jobID = '.'.join(nLine[2].split('.',2)[:-1]
            jData = nLine[3].split(' ')
            jUsr = get_value(jData,0)
            ....

        result.append((jobID,jUsr,...,....,...))
    return result

# list of log files 
inFiles = [ m for m in os.listdir(logDir) ]

saved_ary = []
LJ = 0; LU = 0; LE = 0; LH = 0; LQ = 0

for inFile in sorted(inFiles):
    j_data = job_history(inFile)
    saved_ary += j_data

for ix in range(len(saved_ary)):
    LJ = max(LJ, len(saved_ary[ix][0]))
    LU = max(LU, len(saved_ary[ix][1]))
    ....

# format printing
fmt_print = "%%-%ds  %%-%ds  %%-%ds  %%-%ds  %%-%ds" % (LJ,LU,LE,LH,LQ)
print_head = fmt_print % ('Job Id','User','End Date','Exec Host','Queue')
print '%s\n%s' % (print_head, len(print_head)*'-')

for lines in saved_ary:
    print fmt_print % lines

The only problem is it's taking a bit of time to start printing the info on the screen, just because, I think, as it's putting all the in the array first and then printing. Is there any why can it be improved? Cheers!!

No correct solution

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow