Question

I'm using PyDev in Eclipse with Python 2.7 on OS X. I'm trying to count the elements in an array and sum them up, but I get an IndexError when indexing the array.

import numpy as np
import os
import sys

csv_file_object = fileName = os.path.join('train.csv')
print('Directory separator on your platform ({}): {}'.format(sys.platform, os.sep))

data=[]
for row in csv_file_object:
    data.append(row)
data = np.array(data)

number_passengers = np.size(data[0::,0].astype(np.float))
number_survived = np.sum(data[0::,0].astype(np.float))
proportion_survivors = number_survived / number_passengers

Traceback (most recent call last):
  File "/Users/scdavis6/Documents/Kaggle/Titanic1.py", line 14, in <module>
    number_passengers = np.size(data[0::,0].astype(np.float))
IndexError: too many indices

Let me know if I can provide additional information.

Thank you.


Update: I made the edits, but got another error about the module not being callable:

Traceback (most recent call last):
  File "/Users/scdavis6/Documents/Kaggle/Titanic1.py", line 5, in <module>
    csv_file_object = fileName = os.path('train.csv')
TypeError: 'module' object is not callable

Update: I changed os.path('train.csv') to os.path.join('train.csv'), but got another error about not finding the .csv file.

Traceback (most recent call last):
  File "/Users/scdavis6/Documents/Kaggle/Titanic1.py", line 9, in <module>
    with open(fileName) as f:
IOError: [Errno 2] No such file or directory: 'train.csv'

Here are the absolute paths for the .csv file and the Python script.

import os
os.path.abspath("/Users/scdavis6/Desktop/train.csv")

'/Users/scdavis6/Desktop/train.csv'

import os
os.path.abspath("/Users/scdavis6/Documents/Kaggle/Titanic1.py")

'/Users/scdavis6/Documents/Kaggle/Titanic1.py'


Solution

Assuming that this is your actual code, the problem is that you never open the file. Your csv_file_object is still just the fileName string, so the for loop iterates over the characters of that file name and data becomes a 1D numpy array. Indexing it with two indices, as in data[0::,0], then raises the IndexError.
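
To see the effect in isolation, here is a minimal sketch that reproduces it. The variable names are made up for illustration, and float is used in place of np.float so it also runs on recent NumPy versions.

```python
import numpy as np

file_name = 'train.csv'  # just a string; no file is opened here

# Iterating over a string yields its characters one at a time.
data = np.array([ch for ch in file_name])

ndim = data.ndim    # number of dimensions of the resulting array
shape = data.shape  # one entry per character of the file name

# Using two indices on a 1D array is what triggers the error.
try:
    data[0::, 0]
    raised = False
except IndexError:
    raised = True

print(ndim, shape, raised)
```

Checking data.shape right after building the array is a quick way to catch this kind of mistake before any indexing happens.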

Instead, you should open the file and create a csv.reader for it.

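import csv
import numpy as np

# Open the file itself; iterating over the file name only yields its characters.
with open(fileName) as f:
    reader = csv.reader(f)
    data = []
    for row in reader:  # each row is a list of strings
        data.append(row)
data = np.array(data)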

Or shorter, inside the with block: data = np.array([row for row in csv.reader(f)])
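
As a rough check that rows from csv.reader give a 2D array on which your column slicing works, here is a small sketch. The three sample lines are invented stand-ins for the file contents (csv.reader accepts any iterable of lines), and float replaces np.float. If your real file starts with a header row, that row would have to be skipped before converting to float.

```python
import csv

import numpy as np

# Hypothetical stand-in for the opened file: three data lines, no header.
sample_lines = ["1,3", "0,1", "1,2"]

# Each row from csv.reader is a list of strings, so the array is 2D.
data = np.array([row for row in csv.reader(sample_lines)])

# With a 2D array, selecting the first column works as intended.
first_column = data[0::, 0].astype(float)
number_passengers = np.size(first_column)
number_survived = np.sum(first_column)
proportion_survivors = number_survived / number_passengers

print(data.shape, number_passengers, number_survived)
```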


Update: The new error you are getting is probably due to accidentally changing os.path.join('train.csv') to os.path('train.csv'). Instead of calling the join function from the os.path module, you are trying to call the module itself, and modules are not callable.
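
The difference can be checked directly. This is only an illustrative sketch; the variable names are arbitrary.

```python
import os
import types

# os.path is a module, so it cannot be called like a function.
is_module = isinstance(os.path, types.ModuleType)
module_callable = callable(os.path)

# os.path.join is a function inside that module.
join_callable = callable(os.path.join)

# With a single argument, join returns that argument unchanged.
file_name = os.path.join('train.csv')

print(is_module, module_callable, join_callable, file_name)
```

Since join with one argument returns it unchanged, writing fileName = 'train.csv' directly would have the same effect.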


Update: It seems your train.csv file is not in the same directory as your Python script, so the script won't find the file if you just use the bare filename. You have to use the absolute path together with the filename:

fileName = os.path.join('/Users/scdavis6/Desktop', 'train.csv')

Or just fileName = '/Users/scdavis6/Desktop/train.csv'. Alternatively, move your train.csv file to the same directory as your Python script. This might indeed be the better and more robust option, unless you are using this file in multiple scripts in different directories.
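
If you are unsure where the file actually lives, one option is to test a few candidate directories before opening anything. The helper below is a hypothetical sketch (find_csv and its parameters are not part of any library); it relies only on os.path.join and os.path.isfile.

```python
import os


def find_csv(file_name, search_dirs):
    """Return the first existing path for file_name in search_dirs, or None."""
    for directory in search_dirs:
        candidate = os.path.join(directory, file_name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

For example, find_csv('train.csv', ['.', '/Users/scdavis6/Desktop']) would look in the current working directory first and then on the Desktop, returning None if the file is in neither place.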

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow