Question

In Pandas, is there an easy way to apply a function only on columns of a specific type?

For example, I need to strip control characters from a dataframe before I save it to a CSV file.

I currently do the following:

df[string_column] = df[string_column].apply(
    lambda x: x.encode('ascii', errors='ignore').decode('ascii')
               .replace('\n', ' ').replace('\t', ' '))

but this requires knowing which columns contain strings.

What is an easy way to apply a function only on columns of a certain type?


Solution

I would just build a list of the string columns based on the dtype (string columns have object dtype). Something like the following:

>>> import pandas as pd
>>> from io import StringIO
>>> df = pd.read_csv(StringIO(data))  # the first row is used as the header by default
>>> print(df)

   A  B   C   D
0  1  a   6  ff
1  2  b   7  cc
2  3  c   8  dd
3  4  d   9  ee
4  5  e  10  gg

>>> print(df.dtypes)

A     int64
B    object
C     int64
D    object

And then you can get a list of object/str columns with something like the following:

>>> print(df.dtypes[df.dtypes == 'object'].index.tolist())

['B', 'D']

You can now loop over that list and call apply on each of those columns, leaving the numeric columns untouched.
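To make that last step concrete, here is a minimal sketch that combines the dtype-based column list with the cleaning function from the question. The sample frame and the `clean` helper are made up for illustration. The string columns are created with object dtype explicitly, since newer pandas versions may infer a dedicated string dtype instead, in which case the `== 'object'` comparison would not match.

```python
import pandas as pd

# Hypothetical sample frame; B and D are forced to object dtype so the
# dtype comparison below matches regardless of pandas' string inference.
df = pd.DataFrame({
    "A": [1, 2, 3],
    "B": pd.Series(["a\tb", "c\nd", "e"], dtype=object),
    "C": [6, 7, 8],
    "D": pd.Series(["ff", "c\u00e9c", "dd"], dtype=object),
})


def clean(value):
    """Drop non-ASCII characters and turn newlines/tabs into spaces."""
    if not isinstance(value, str):
        return value  # leave NaN/None and other non-string objects as they are
    ascii_only = value.encode("ascii", errors="ignore").decode("ascii")
    return ascii_only.replace("\n", " ").replace("\t", " ")


# Column labels whose dtype is object, as in the answer above.
string_columns = df.dtypes[df.dtypes == "object"].index.tolist()

# Apply the cleaner only to those columns; A and C are not touched.
for column in string_columns:
    df[column] = df[column].apply(clean)

print(string_columns)
print(df)
```

`df.select_dtypes(include=['object']).columns` is another way to get the same column labels. Note that an object column can hold things other than strings, which is why `clean` checks the type of each value before touching it.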

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow