Don't use csv to read the data into a NumPy array. Use numpy.genfromtxt instead; passing dtype=None makes genfromtxt guess the dtype of each column for you, so you won't have to convert strings to floats manually.
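To illustrate the dtype guessing, here is a minimal sketch that reads a small inline CSV instead of a real file. The sample rows are made up, and the encoding='utf-8' argument is my addition (it exists in newer NumPy versions) so that text columns come back as str rather than bytes:

```python
import io
import numpy as np

# Hypothetical sample data standing in for a real CSV file (no header row).
csv_text = "1,22.0,male\n2,38.0,female\n3,26.0,female\n"

# dtype=None asks genfromtxt to infer a type for each column separately.
data = np.genfromtxt(io.StringIO(csv_text), delimiter=',',
                     dtype=None, encoding='utf-8')

# Mixed column types yield a 1-D structured array with fields f0, f1, f2.
print(data.dtype)
print(data['f1'])  # the second column, already converted to floats
```

Because the columns have different types, the result is a structured array, so columns are selected by field name rather than by a second index.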
data[0::, 0] just gives you the first column of data. data[:, 0] would give you the same result.
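A quick check of that equivalence, using a small made-up 2-D array rather than your actual data:

```python
import numpy as np

# Hypothetical 2-D array; any ordinary 2-D ndarray behaves the same way.
data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])

first_a = data[0::, 0]  # all rows starting at row 0, column 0
first_b = data[:, 0]    # all rows, column 0

print(first_a)
print(np.array_equal(first_a, first_b))
```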
The error message TypeError: list indices must be integers, not tuple suggests that, for some reason, your data variable is holding a list rather than an ndarray. For example, the same exception can be produced like this:
In [73]: data = [1,2,3]
In [74]: data[1,2]
TypeError: list indices must be integers, not tuple
I don't know why that is happening, but if you post a sample of your CSV we should be able to help fix that.
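In the meantime, if data does turn out to be a plain list of rows, converting it with np.asarray should make the two-index syntax work. A small sketch, with a made-up nested list in place of your data:

```python
import numpy as np

# Hypothetical nested list, e.g. rows collected by appending to a list.
rows = [[1, 2], [3, 4]]

try:
    rows[:, 0]  # lists do not accept a tuple of indices
except TypeError as exc:
    print("list indexing failed:", exc)

# Converting to an ndarray enables [row, column] indexing.
arr = np.asarray(rows)
first_column = arr[:, 0]
print(type(arr), first_column)
```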
Using np.genfromtxt, your current code could be simplified to:
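import numpy as np

filename = '/Users/scdavis6/Documents/Kaggle/train.csv'
# skip_header=1 skips the header row; dtype=None lets genfromtxt infer each column's type
data = np.genfromtxt(filename, delimiter=',', skip_header=1, dtype=None)
# number of rows read, i.e. the number of passengers
number_passengers = np.size(data, axis=0)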
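If you would rather keep the header than skip it, genfromtxt can use it for field names via names=True. The sketch below assumes a tiny made-up CSV; the column names are only placeholders and may not match your file:

```python
import io
import numpy as np

# Hypothetical CSV with a header row; the column names are illustrative only.
csv_text = "PassengerId,Survived,Age\n1,0,22.0\n2,1,38.0\n3,1,26.0\n"

# names=True reads the field names from the first line of the input.
data = np.genfromtxt(io.StringIO(csv_text), delimiter=',',
                     names=True, dtype=None, encoding='utf-8')

number_passengers = np.size(data, axis=0)  # number of data rows
ages = data['Age']                         # select a column by field name

print(number_passengers, ages)
```

The result here is a 1-D structured array, so columns are selected by name (data['Age']) instead of by position (data[:, 2]).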