Question

I have some Python code that parses a CSV file. Our vendor has now decided to deliver the data as a gzipped CSV file instead. What is the minimal/cleanest code change I have to make? Current function:

def load_data(fname, cols=()):
    ... ...
    with open(fname) as f:
        reader = csv.DictReader(f)
        ... ... 

I don't want to duplicate the code into a load_data2() and change the with statement to the following, though that works perfectly:

with gzip.open(fname) as f:

How can I factor out the with statement? Something like this (which is not valid Python as written):

def load_data(fname, cols=()):
    ... ...
    if fname.endswith('.csv.gz'):
        with gzip.open(fname) as f:
    else:
        with open(fname) as f:

        reader = csv.DictReader(f)
        ... ... # code to parse

Solution 2

A more general approach is to choose the opener dynamically:

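import gzip
import urllib2  # Python 2; on Python 3 use urllib.request instead

# Map each resource type to the callable that opens it.
openers = {
    'http': urllib2.urlopen,
    '.csv.gz': gzip.open,
    '.csv': open,
}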

resource_type = get_resource_type(resource) # determine the type of the resource

with openers[resource_type](resource) as f:
    # do stuff ...

That way, you can painlessly add more openers when needed. This is an instance of the factory method design pattern.
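The answer leaves get_resource_type undefined, so here is one possible sketch of it for Python 3. The matching rules (URL prefix first, then file suffix), the wrapper functions _open_text and _open_gzip_text, and the open_resource helper are all assumptions made for illustration, not part of the original answer.

```python
import gzip
import urllib.request


def _open_text(path):
    # Hypothetical wrapper: plain CSV opened in text mode, as the csv module expects.
    return open(path, newline="")


def _open_gzip_text(path):
    # Hypothetical wrapper: gzip.open defaults to binary mode, so ask for text ("rt").
    return gzip.open(path, "rt", newline="")


# Registry of openers; add an entry here to support another resource type.
OPENERS = {
    "http": urllib.request.urlopen,  # note: yields bytes, not text
    ".csv.gz": _open_gzip_text,
    ".csv": _open_text,
}


def get_resource_type(resource):
    # Assumed rule: URLs are recognised by prefix, files by suffix.
    if resource.startswith(("http://", "https://")):
        return "http"
    # Check the longer suffix first for clarity.
    for suffix in (".csv.gz", ".csv"):
        if resource.endswith(suffix):
            return suffix
    raise ValueError("no opener registered for %r" % resource)


def open_resource(resource):
    # Look up the opener for this resource and call it.
    return OPENERS[get_resource_type(resource)](resource)
```

With this in place, the with statement becomes `with open_resource(resource) as f:` regardless of where the data comes from.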

OTHER TIPS

You can do this by assigning the function you want to use to open the file to a different variable, depending on the properties of the file name:

opener = gzip.open if fname.endswith('.csv.gz') else open
with opener(fname) as f:
    ... # code to parse
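Put together, the whole function might look like the sketch below on Python 3. Two things here are assumptions rather than part of the question: the parsing body is elided in the original, so the handling of cols (keep only the named columns) is a guess at its intent, and the explicit "rt" mode is added because gzip.open returns bytes by default on Python 3, which csv.DictReader cannot read.

```python
import csv
import gzip


def load_data(fname, cols=()):
    # Pick the opener from the file name; both callables accept the same arguments.
    opener = gzip.open if fname.endswith(".csv.gz") else open
    # "rt" forces text mode for gzip.open; newline="" is what the csv docs recommend.
    with opener(fname, "rt", newline="") as f:
        reader = csv.DictReader(f)
        if cols:
            # Assumed meaning of cols: keep only the requested columns.
            return [{c: row[c] for c in cols} for row in reader]
        return list(reader)
```

Callers do not change: load_data('data.csv') and load_data('data.csv.gz') return the same rows for the same content.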
Licensed under: CC-BY-SA with attribution