Question

I have a dynamic multidimensional array whose number of columns can differ on each run. The user is asked to select which columns to extract from a file with N columns, and based on that selection, a multidimensional array 'ARRAY_VALUES' is created.

import numpy as num

DIRECTORY = '/Users/user/Desktop/'
DATA_DIC_FILE = "%sOUTPUT_DIC/OUTPUT_DICTIONARIES.txt" %(DIRECTORY)
Choice = str( raw_input( 'Which columns do you want to use (separated by a comma):\t' ) ).split(',')
# Input something like this: 1,2,3,4

String_choice = []
PCA_INDEX     = []
Columns = len(Choice)

PCA_INDEX = {}
# PCA_INDEX is a dictionary that the key is a string whose value is a float number.
PCA_INDEX['any_string'] = float_number # The dictionary has about 50 entries. 

ARRAY_VALUES = [ [] for x in xrange( Columns) ]
""" Creating the N-dimensional array that will contain the data from the file """
""" This list has the form ARRAY_VALUES = [ [], [], [], ... ] for n-repetitions. """

ARRAY_VALUES2 = ARRAY_VALUES

lines = open( DATA_DIC_FILE ).readlines() #Read lines from the file
for i in range( 0, len(ARRAY_VALUES) ):
    ARRAY_VALUES[i] = num.loadtxt( fname = DATA_DIC_FILE, comments= '#', delimiter=',', usecols = [ int( PCA_INDEX[i] ) ], unpack = True )
    """ This saves the lists from the file to the matrix 'ARRAY_VALUES' """

Now I have the multidimensional array in the form ARRAY_VALUES = [[], [], ...] for n columns.

I want to eliminate the corresponding row from every column if any of the values in that row is 'inf'. I tried to use the following code, but I don't know how to make it dynamic for the number of columns:

for j in range(0, len(ARRAY_VALUES)):
    for i in range(0, len(ARRAY_VALUES[0])):
        if num.isinf( ARRAY_VALUES[j][i] ) or num.isinf( ARRAY_VALUES[]): # This is where the problem is.
        # if num.isinf( ARRAY_VALUES[0][i] ) or num.isinf(ARRAY_VALUES[1][i] or ... num.isinf(ARRAY_VALUES[last_column][i]: 
            continue
        else:
            ARRAY_VALUES2[j].append( ARRAY_VALUES[j][i] ) #Save the values into ARRAY_VALUES2. 

Can anyone help me out and tell me how to do this part:

# if num.isinf( ARRAY_VALUES[0][i] ) or num.isinf(ARRAY_VALUES[1][i] or ... num.isinf(ARRAY_VALUES[last_column][i]:

for a multi-dimensional array with n-columns, so that the output is like the following:

ARRAY_VALUES  = [ [8, 2, 3  , inf, 5],
                  [1, 9, inf,  4 , 5],
                  [7, 2, inf, inf, 6] ]

ARRAY_VALUES2 = [ [8, 2, 5],
                  [1, 9, 5],
                  [7, 2, 6] ]

--Thanks!


Solution

>>> import numpy as np
>>> a = np.array([[8, 2, 3, np.inf, 5], [1, 9, np.inf, 4, 5], [7, 2, np.inf, np.inf, 6]])
>>> col_mask = [i for i in range(a.shape[1]) if not any(a[:, i] == np.inf)]
>>> print(a[:, col_mask])
[[ 8.  2.  5.]
 [ 1.  9.  5.]
 [ 7.  2.  6.]]

First, use a numpy.array if you aren't already.

Then iterate over each column and check for any np.inf values to build a list of the allowable column indices (the mask).

Lastly, use numpy's column indexing to access only the columns of interest.
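
If the data starts out as a plain list of per-column arrays, as ARRAY_VALUES does in the question, the three steps above might look roughly like this. This is only a sketch: the sample values stand in for whatever loadtxt returns, and `keep` is an illustrative name, not something from the original code.

```python
import numpy as np

# Stand-in for the question's ARRAY_VALUES: one 1-D array per selected file column.
ARRAY_VALUES = [
    np.array([8, 2, 3, np.inf, 5]),
    np.array([1, 9, np.inf, 4, 5]),
    np.array([7, 2, np.inf, np.inf, 6]),
]

# Step 1: stack the list into a single 2-D array (works for any number of entries).
a = np.array(ARRAY_VALUES)

# Step 2: keep index i only if no entry in column i of `a` is infinite.
keep = [i for i in range(a.shape[1]) if not np.isinf(a[:, i]).any()]

# Step 3: select the surviving columns.
ARRAY_VALUES2 = a[:, keep]
print(ARRAY_VALUES2)
```

Because the check runs over `a.shape[1]`, nothing here depends on how many columns the user selected.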

As DSM points out, you can create the mask with numpy alone and avoid the list comprehension:

col_mask = np.isfinite(a).all(axis=0)
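
For completeness, here is a small self-contained sketch of that boolean mask applied to the sample data. One thing worth knowing is that np.isfinite also rejects NaN, so the names `filtered` and `inf_only_mask` below (both made up for this example) show the result and a variant that filters on infinities only.

```python
import numpy as np

a = np.array([[8, 2, 3, np.inf, 5],
              [1, 9, np.inf, 4, 5],
              [7, 2, np.inf, np.inf, 6]])

# True for columns where every entry is finite (no inf, -inf, or NaN).
col_mask = np.isfinite(a).all(axis=0)

# A boolean mask selects columns just like a list of integer indices does.
filtered = a[:, col_mask]
print(filtered)

# Variant that rejects only +/-inf and lets NaN through.
inf_only_mask = ~np.isinf(a).any(axis=0)
```

On this data both masks agree, since there are no NaN values; they would differ only if the file contained NaN entries.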
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow