Read CSV and Separate by column

https://stackoverflow.com/questions/11126012

16-06-2021

Question

Brand new to Python (and programming in general), if this is simple and/or answered somewhere I didn't find, feel free to harass me in typical forum fashion.

I've got a bunch of CSVs, each containing 10 XY coordinates like this:

10,5
2,4
5,6 
7,8
9,12
3,45
2,4
6,5
0,3 
5,6

I'm looking to separate the X coordinates and Y coordinates into two separate lists, so that I can subtract a value from each value in a given list. For example, subtracting 5 from every value in the X coordinate list and 3 from every value in the Y coordinate list. I'm then going to take the abs() of each value and find the minimum. Once those minimums are found, I want to add the lists together so that each value is added to its counterpart.

IE) if the absolute values of X were something like

4
5
....

and Y something like

6
7
....

I'd want to add 4 and 6, then 5 and 7, etc.

To separate them, I tried

import csv
filein = open("/path/here")
reader = csv.reader(filein, skipinitialspace = True)
listofxys = []
for row in reader:
    listofxys.append(row)

Xs = listofxys.pop(0) # to pop all the X's

Ys = listofxys.pop() # to pop all the Y's

But instead of all the leading values, it provides the first XY pair. What am I doing wrong here?

The eventual goal is to find the closest point to an XY coordinate, so if this is a bad way to go about it, feel free to steer me in another direction.

Thanks in advance!

Solution

It's worth noting that you should try to use the with statement when opening files in Python. This is both more readable and removes the possibility of a file being left unclosed (even when exceptions occur).

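Your actual problem is that the code builds a list of rows, not of columns, so pop(0) removes and returns the first row (the first XY pair) and pop() removes and returns the last row.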

reader = csv.reader(filein, skipinitialspace = True)
listofxys = []
for row in reader:
    listofxys.append(row)

All this does is listofxys = list(csv.reader(filein, skipinitialspace = True)), written in a more roundabout way.
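To see why pop() hands back a pair rather than a column, it can help to print what the list actually holds. This is a minimal sketch: the in-memory string stands in for the file, and the three sample rows are taken from the question.

```python
import csv
import io

# Stand-in for the CSV file (an assumption for illustration); rows from the question.
sample = io.StringIO("10,5\n2,4\n5,6\n")

rows = list(csv.reader(sample, skipinitialspace=True))
print(rows)          # each element is one row, i.e. one XY pair

first = rows.pop(0)  # removes and returns the first row, not the X column
last = rows.pop()    # removes and returns the last row, not the Y column
print(first, last)
```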

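What you want to do is use the zip() builtin to take a list of pairs and turn it into two sequences, one holding the X values and one holding the Y values (zip() produces tuples rather than lists). You do this with the star operator: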

import csv

with open("test") as filein:
    reader = csv.reader(filein, skipinitialspace = True)
    xs, ys = zip(*reader)

print(xs)
print(ys)

Which gives:

('10', '2', '5', '7', '9', '3', '2', '6', '0', '5')
('5', '4', '6', '8', '12', '45', '4', '5', '3', '6')
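If the star operator is unfamiliar, the sketch below shows what it does on a small hand-written list (the variable name rows is only for illustration). The star unpacks the list so that each row becomes a separate argument to zip(), and zip() then groups the first items together, the second items together, and so on.

```python
rows = [["10", "5"], ["2", "4"], ["5", "6"]]

# zip(*rows) is the same call as zip(rows[0], rows[1], rows[2])
xs, ys = zip(*rows)

print(xs)  # first item of every row
print(ys)  # second item of every row
```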

Note that these values are strings. If you want them as numbers, pass quoting=csv.QUOTE_NONNUMERIC, which tells the reader to convert every unquoted field to a float, e.g.: reader = csv.reader(filein, quoting=csv.QUOTE_NONNUMERIC, skipinitialspace = True)

Which gives:

(10.0, 2.0, 5.0, 7.0, 9.0, 3.0, 2.0, 6.0, 0.0, 5.0)
(5.0, 4.0, 6.0, 8.0, 12.0, 45.0, 4.0, 5.0, 3.0, 6.0)
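With numeric values, the rest of the task from the question can be sketched as follows. This is one reading of the description (subtract, take abs(), add the two results, find the minimum), not a definitive implementation. The target point (5, 3) comes from the example in the question, the names target_x, target_y, points and distances are made up for illustration, and an in-memory string stands in for the file.

```python
import csv
import io

# Stand-in for one of the CSV files (an assumption); data from the question.
data = io.StringIO("10,5\n2,4\n5,6\n7,8\n9,12\n3,45\n2,4\n6,5\n0,3\n5,6\n")
target_x, target_y = 5, 3  # hypothetical target point

reader = csv.reader(data, quoting=csv.QUOTE_NONNUMERIC, skipinitialspace=True)
points = [(x, y) for x, y in reader]

# |x - target_x| + |y - target_y| for every point
distances = [abs(x - target_x) + abs(y - target_y) for x, y in points]

# index() returns the first position of the minimum, so ties go to the earliest row
closest = points[distances.index(min(distances))]
print(closest, min(distances))
```

Adding the two absolute differences gives the "city block" distance. If straight-line distance is what you are after, math.hypot(x - target_x, y - target_y) could be used in place of the sum.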

OTHER TIPS

Sounds like you're looking for the zip function, documented here:

http://docs.python.org/library/functions.html#zip

    import csv

    with open('some.csv', newline='') as f:
        reader = csv.reader(f, delimiter=',')
        header = next(reader)  # skips a header row; drop this line if the file has none
        zipped = list(zip(*reader))  # transpose rows into columns
    print(zipped[1])  # the 2nd column of the csv file

HTH

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow