I'm using the following code to split a dataset into training and test sets and save each one to a file:
import numpy as np
from sklearn.model_selection import train_test_split
a = (np.genfromtxt(open('dataset.csv','r'), delimiter=',', dtype='int')[1:])
a_train, a_test = train_test_split(a, test_size=0.33, random_state=0)
c1 = open('trainfile.csv', 'w')
arr1 = str(a_train)
c1.write(arr1)
c1.close()
c2 = open('testfile.csv', 'w')
arr2 = str(a_test)
c2.write(arr2)
c2.close()
However, this is what ends up in the file:
trainfile.csv:
[[ 675847 0 0 ..., 0 0 3]
[ 74937 0 0 ..., 0 0 3]
[ 65212 0 0 ..., 0 0 3]
...,
[ 18251 0 0 ..., 0 0 1]
[1131828 0 0 ..., 0 0 1]
[ 14529 0 0 ..., 0 0 1]]
That is the entire content of trainfile.csv, and testfile.csv has the same problem. I want every row and column of the training and test data written to the files, not an abbreviated version where "..." stands in for the omitted values. Any suggestions?
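For reference, here is a minimal sketch of what I am trying to achieve, assuming np.savetxt with fmt='%d' is suitable for integer data like mine. The helper name save_rows and the generated demo array are hypothetical stand-ins, not my real dataset. As far as I understand, str() on a large array gives an abbreviated summary, while np.savetxt writes every element:

```python
import os
import tempfile

import numpy as np


def save_rows(path, rows):
    """Write every row of a 2-D integer array to `path` as comma-separated text."""
    # np.savetxt writes all elements; str(array) abbreviates large arrays with "...".
    np.savetxt(path, rows, delimiter=',', fmt='%d')


# Stand-in data (hypothetical values), large enough that str() would abbreviate it.
demo = np.arange(2000 * 50).reshape(2000, 50)
demo_path = os.path.join(tempfile.mkdtemp(), 'trainfile.csv')
save_rows(demo_path, demo)

# Read the file back to confirm that nothing was dropped.
restored = np.loadtxt(demo_path, delimiter=',', dtype=int)
print(restored.shape)
```

In my script this would presumably mean calling save_rows('trainfile.csv', a_train) and save_rows('testfile.csv', a_test) in place of the str() and write() calls.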