subset structured array based on column names

https://stackoverflow.com/questions/23571368

arrays
python
numpy
python-2.7
subset

19-07-2023

Question

I have a structured array created from a CSV file.

I have many fields/columns. I would like to create a subset array.

z = mydata[['z1', 'z2', 'z3']] will do the trick, but sometimes the z columns run only from z1 to z2, and sometimes from z1 to z10.

Is there an easy way to select all columns whose name starts with z, regardless of how many columns are in the data? Obviously the resulting array will have a different number of columns each time, but that's fine.

Solution

Not very pretty, but you can do the following:

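z = mydata[[name for name in mydata.dtype.names if name.startswith('z')]]

Effectively, you loop through all of the field names in mydata.dtype.names and keep those that match the criterion, here names that start with 'z'. Using startswith rather than 'z' in name avoids picking up fields that merely contain a z somewhere. If there is a better way I would be very interested, as I do similar operations in pandas using the pd.DataFrame.columns attribute.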
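
For reference, here is a minimal self-contained sketch of the same idea. The sample array, its field names (id, z1, z2, size) and the helper select_fields are made up for illustration; they stand in for whatever was loaded from the CSV file.

```python
import numpy as np

# Hypothetical stand-in for the structured array loaded from the CSV file.
mydata = np.array(
    [(1, 0.5, 1.5, 2.5), (2, 3.5, 4.5, 5.5)],
    dtype=[("id", "i4"), ("z1", "f8"), ("z2", "f8"), ("size", "f8")],
)


def select_fields(arr, prefix):
    """Return the fields of a structured array whose names start with prefix."""
    # dtype.names is a tuple of field names in declaration order.
    names = [name for name in arr.dtype.names if name.startswith(prefix)]
    # Indexing with a list of field names selects just those fields.
    return arr[names]


z = select_fields(mydata, "z")
print(z.dtype.names)  # ('z1', 'z2'); 'size' contains a z but does not start with one
```

Note that in recent NumPy versions (1.16 and later, as far as I know) indexing with a list of field names returns a view into the original array rather than a copy. If you need an independent, tightly packed array, numpy.lib.recfunctions.repack_fields should do that.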

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow