Question

I am trying to load a CSV file that contains only float values:

data = np.genfromtxt(self.file, dtype=float, delimiter=self.delimiter, names=True)

but this returns an array of tuples. Based on my search (see "numpy.genfromtxt produces array of what looks like tuples, not a 2D array - why?"), this should return tuples only for non-homogeneous arrays. When I remove names=True, it does return a 2D array. Is it possible to get an array with names, as in the linked question?

Lines from the csv:

0 _id|1 age|2 unkown|3 male|4 female|5 match-start|6 score
8645632250|7744|0|1|0|1|10

(There are more columns; I only listed the first few.)

I also used this code for better names of columns:

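def obtain_data(self):
    # Read the header line and derive cleaner column labels from it.
    with open(self.file, 'r') as infile:
        first_line = infile.readline().strip()  # drop the trailing newline
    labels = first_line.split('|')
    labels = list(map(trunc_before, labels))  # trunc_before: helper not shown here
    # Skip the header row, since the names are passed in explicitly.
    data = np.genfromtxt(self.file, dtype=float, delimiter=self.delimiter, names=labels, skip_header=1)
    return data, np.asarray(labels)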

Solution

It sounds like you're asking whether it's possible to have a standard 2d array while also having named columns. It isn't. (At least not in the sense you seem to be asking.)

An "array with names" is a structured array -- it's an array of records (not really tuples), each of which has named fields. Think of it this way: the names aren't attached to the array, they're attached to the "tuples" -- the records. The fact that the data is of a homogeneous type doesn't matter.
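
To make the distinction concrete, here is a minimal sketch. The sample data and the column names (id, age, score) are made up for illustration and are not the asker's file. It assumes a NumPy version that provides numpy.lib.recfunctions.structured_to_unstructured (1.16 or later). It shows that names=True gives a 1-D array of records, and that a plain 2-D array of the same values can be obtained as long as the names are kept separately.

```python
import io

import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical pipe-delimited sample, laid out like the file in the question.
csv_text = "id|age|score\n1|30|10\n2|41|7\n"

# names=True produces a 1-D structured array: one record per row,
# with one named float field per column.
records = np.genfromtxt(io.StringIO(csv_text), dtype=float,
                        delimiter="|", names=True)
print(records.shape)        # one entry per row, not rows x columns
print(records.dtype.names)  # the field names taken from the header
print(records["age"])       # a column is selected by field name

# Convert to an ordinary 2-D float array; the names are no longer attached,
# so keep them in a separate sequence if they are still needed.
labels = records.dtype.names
plain = rfn.structured_to_unstructured(records)
print(plain.shape)          # rows x columns
print(plain[:, labels.index("age")])  # same column, looked up via the label list
```

Which form is preferable depends on use: the structured array allows access by name, while the plain 2-D array works directly with slicing and linear-algebra routines.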

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow