Question

I'm parsing a CSV file using Python.

The CSV file looks like:

value1,value2,value3(a,b,c)

The Python code:

import csv

with open(file_path, 'rb') as this_file:
  reader = csv.reader(this_file, delimiter=',')
  for row in reader:
    print row

Obviously the CSV reader interprets this as:

"value1","value2","value3(","a","b","c)"

What is the best way to stop Python from breaking value3(a,b,c) into four values?

Thanks.

Solution

Here's some code that deals with the given example:

a = 'value1, value2, value3(a, b, c)'
parts = a.split(', ')
result = []
for i, ent in enumerate(parts):
    if '(' in ent:
        # Glue this piece and everything after it back together.
        # May need a check whether ) has been reached yet, so that
        # entries after the closing parenthesis are not glued on too.
        result.append(','.join(parts[i:]))
        break
    result.append(ent)
# result is now ['value1', 'value2', 'value3(a,b,c)']

It will need an extra check if there are "normal" entries after the one containing the parentheses (as indicated in the comment), e.g.

a = 'value1, value2, value3(a, b, c), value4'

Hope this helps. Apologies, I can't think of a way to use the built-in csv parser, since your file is not, in fact, a "proper" CSV.
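One way to add the missing check for the closing parenthesis is to walk the line character by character and only split on commas while no parenthesis is open. This is only a rough sketch: the function name split_outside_parens is made up for illustration, and it assumes the parentheses are balanced and that there are no quoted fields to worry about.

```python
def split_outside_parens(line, delimiter=','):
    """Split line on delimiter, skipping delimiters inside parentheses."""
    fields = []
    current = []
    depth = 0  # number of '(' that are currently open
    for ch in line:
        if ch == '(':
            depth += 1
        elif ch == ')' and depth > 0:
            depth -= 1
        if ch == delimiter and depth == 0:
            # A top-level delimiter ends the current field.
            fields.append(''.join(current).strip())
            current = []
        else:
            current.append(ch)
    # Whatever is left after the last delimiter is the final field.
    fields.append(''.join(current).strip())
    return fields


print(split_outside_parens('value1,value2,value3(a,b,c)'))
# ['value1', 'value2', 'value3(a,b,c)']
print(split_outside_parens('value1, value2, value3(a, b, c), value4'))
# ['value1', 'value2', 'value3(a, b, c)', 'value4']
```

You would call it on each line of the file instead of handing the file to csv.reader. Leading and trailing spaces are stripped from every field, but anything inside the parentheses is left untouched.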

OTHER TIPS

There is no easy way to escape commas within values. There have been several questions about this, and many of them point to this post: https://stackoverflow.com/a/769713/2620328
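That said, if you happen to control whatever writes the file (an assumption, since the question doesn't say), the usual CSV answer is to put the field in double quotes. The csv module then keeps the commas inside it. A small Python 3 sketch, with the sample line made up from the question's row:

```python
import csv
import io

# Hypothetical sample: the row from the question, but with the
# third field wrapped in double quotes by the producer of the file.
sample = 'value1,value2,"value3(a,b,c)"\n'

rows = list(csv.reader(io.StringIO(sample), delimiter=','))
print(rows)
# [['value1', 'value2', 'value3(a,b,c)']]

# csv.writer adds those quotes itself when a field contains the delimiter.
out = io.StringIO()
csv.writer(out).writerow(['value1', 'value2', 'value3(a,b,c)'])
print(repr(out.getvalue()))
# 'value1,value2,"value3(a,b,c)"\r\n'
```

If the file comes from somewhere you can't change, this doesn't help and you are back to splitting the line by hand.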

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow