Question

I have space-delimited files containing daily precipitation values for stations at different LAT/LON locations. The daily files have the following format as an example:

LAT LON PRCP

22.0 110.4 1.2

23.0 121.0 0.0

23.0 122.0 0.1

Where the first field equals Latitude, the second field equals Longitude, and the third field equals daily total precipitation.

I am looking to create a weekly file compiling the totals from each daily file for that week using the same format... but I'm running into issues. What makes this perhaps even slightly trickier for me is that each daily file may not have all locations, meaning that the number of rows may differ and I can't simply add the TOTAL PRCP field from each file row for row into the weekly file since it may not match for all days.

My current method was to open each file, iterate over each line, and set each field to a variable, then compare with the second daily file's variables and write a line with the sum of the two precipitation values if the LAT and LON fields match... then perform this for each day in comparison with the next day and write a "sum" file.

with open(sundayFile, "r") as sundayFile:
    with open(mondayFile, "r") as mondayFile:
        with open(addMex1, "a") as addFile:

            print "\n\nNow checking Sunday File: " + str(sundayFile) + " and Monday File: " + str(mondayFile) + "\n\n"

            for lineA in sundayFile:
                parsedLineA = lineA.split()
                LAT_A = parsedLineA[0]
                LON_A = parsedLineA[1]
                TOTAL_PRCP_A = parsedLineA[2]

                print "Line in Sunday File: " + LAT_A + "," + LON_A + "," + TOTAL_PRCP_A + "\n"

                for lineB in mondayFile:
                    parsedLineB = lineB.split()
                    LAT_B = parsedLineB[0]
                    LON_B = parsedLineB[1]
                    TOTAL_PRCP_B = parsedLineB[2]

                    print "Line in Monday File: " + LAT_B + "," + LON_B + "," + TOTAL_PRCP_B + "\n"


                    if LAT_A == LAT_B and LON_A == LON_B:
                        print "\n***** Found a match for station at longitude of " + LON_A + " and latitude of " + LAT_A + "\n"
                        LAT = LAT_A
                        LON = LON_A
                        TOTAL_PRCP = str(float(TOTAL_PRCP_A) + float(TOTAL_PRCP_B))

                        addFile.write(LAT + "," + LON + "," + TOTAL_PRCP + "\n")


                    else:
                        addFile.write(LAT_A + "," + LON_A + "," + TOTAL_PRCP_A + "\n")
                        addFile.write(LAT_B + "," + LON_B + "," + TOTAL_PRCP_B + "\n")

This isn't really working and I'm finally giving up on manually trying on my end... There must be a pythonic, elegant way to perform this. Any help is EXTREMELY appreciated!


Solution

It's simpler to use a defaultdict to hold the cumulative precipitation sums. The keys of this dict are (latitude, longitude) pairs, so a station that is missing from some daily files simply gets fewer additions and the row counts don't need to match. This does the trick:

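from collections import defaultdict
import pprint

files = ['sunday.txt', 'monday.txt', 'tuesday.txt', 'wednesday.txt',
         'thursday.txt', 'friday.txt', 'saturday.txt']

# Missing keys start at 0.0, so a station's first appearance needs no special case.
totals = defaultdict(float)

for fn in files:
    with open(fn) as f:
        for line in f:  # iterate the file directly; readlines() is not needed
            lat, lon, prec = line.split()  # all three are strings
            totals[(lat, lon)] += float(prec)

# See what we have (converted to a plain dict so only the data is printed):
pprint.pprint(dict(totals))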
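
The example in the question shows a LAT LON PRCP header row and blank lines between records. If the real files contain either, the loop above would fail when unpacking the fields or at float(prec). A minimal sketch of a more tolerant version follows, assuming that any line without exactly three fields, or whose third field is not a number, can be ignored. The helper name add_daily_lines is hypothetical and not part of the original answer.

```python
from collections import defaultdict


def add_daily_lines(lines, totals):
    """Add one day's rows to totals, skipping blank lines and header rows."""
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # blank or malformed line (assumption: safe to ignore)
        lat, lon, prec = fields
        try:
            amount = float(prec)
        except ValueError:
            continue  # non-numeric third field, e.g. the "LAT LON PRCP" header
        totals[(lat, lon)] += amount
    return totals


totals = defaultdict(float)
# Lists of strings stand in for open file objects here.
add_daily_lines(["LAT LON PRCP", "", "22.0 110.4 1.2", "23.0 121.0 0.0"], totals)
add_daily_lines(["22.0 110.4 3.2"], totals)
```

Because the function only iterates over lines, an open file can be passed in place of the lists. Note that the keys are still strings, so "22.0" and "22.00" would count as different stations unless the coordinates are normalized first.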

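Here's some sample data for the remaining days; the totals below are consistent with sunday.txt holding the three data rows from the question: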

monday.txt
----------
22.0 110.4 3.2
23.0 121.0 1.0
23.0 122.0 0.2
24.0 122.0 1.0

tuesday.txt
-----------
22.0 110.4 1.0

wednesday.txt
-------------
23.0 122.0 0.3

thursday.txt
------------
24.0 122.0 1.0
25.0 1.0 1.0

friday.txt
----------
24.0 122.0 1.1

saturday.txt
------------
23.0 121.0 10.5

and here's the output of the above code with these files:

{('22.0', '110.4'): 5.4,
 ('23.0', '121.0'): 11.5,
 ('23.0', '122.0'): 0.6000000000000001,
 ('24.0', '122.0'): 3.1,
 ('25.0', '1.0'): 1.0}

I haven't taken the extra step of writing the aggregated data to a file of the same format -- I'll leave that as an exercise ;)
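
One possible way to do that last step, as a sketch rather than a definitive implementation: write one space-delimited row per station, sorted numerically by latitude and then longitude. The function name write_weekly, the file name weekly.txt, and the one-decimal formatting are assumptions chosen to mirror the daily files; adjust them to whatever the downstream consumer expects.

```python
import os
import tempfile


def write_weekly(totals, path):
    """Write 'LAT LON PRCP' rows to path, one station per line."""
    with open(path, "w") as out:
        # Sort on the numeric values so the row order is stable and readable.
        for lat, lon in sorted(totals, key=lambda k: (float(k[0]), float(k[1]))):
            # %.1f hides float noise such as 0.6000000000000001 (assumed precision).
            out.write("%s %s %.1f\n" % (lat, lon, totals[(lat, lon)]))


# A few of the totals from the sample output above.
totals = {
    ('22.0', '110.4'): 5.4,
    ('23.0', '122.0'): 0.6000000000000001,
    ('25.0', '1.0'): 1.0,
}
weekly_path = os.path.join(tempfile.mkdtemp(), "weekly.txt")  # hypothetical name
write_weekly(totals, weekly_path)
```

The latitude and longitude strings are written back unchanged, so the weekly file keeps exactly the coordinate formatting found in the daily files.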

Licensed under: CC-BY-SA with attribution