Question

In feature encoding, how do I encode my values according to a predetermined dictionary?

For instance, say I have the values [Red, Green, Blue] and I want to encode them as [-1, 0, 1]: -1 for Red, 0 for Green, 1 for Blue. I'll apply this to my feature. I believe I can do it with mapping or the apply method, but I'm not sure. Is there a better way to do it?

Column     expectedEncoding
Red             -1
Red             -1
Blue             1
Green            0
Red             -1
Blue             1


Solution

Assuming you have a pandas DataFrame and one mapping per column, with all mappings stored in a 2-level dict where the keys of the first level correspond to the columns in the dataframe and the keys of the second level correspond to the categories:

mappings = {'fruit': {'banana': -1, 'apple': 1}, 'color': {'yellow': -1, 'red': 1}}

Then, you can do the following:

encoded_data = data.apply(lambda col: col.map(mappings[col.name]))
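As a minimal sketch of that one-liner applied to the values from the question (the column name `Column` is taken from the question's table; the DataFrame itself is constructed here purely for illustration):

```python
import pandas as pd

# Illustrative data: the six rows from the question's table.
data = pd.DataFrame({"Column": ["Red", "Red", "Blue", "Green", "Red", "Blue"]})

# First-level key = column name, second-level keys = categories.
mappings = {"Column": {"Red": -1, "Green": 0, "Blue": 1}}

# DataFrame.apply passes each column as a Series whose .name is the column label,
# so the matching inner dict can be looked up and handed to Series.map.
encoded_data = data.apply(lambda col: col.map(mappings[col.name]))

print(encoded_data["Column"].tolist())  # [-1, -1, 1, 0, -1, 1]
```

Note that `Series.map` with a dict turns any value missing from the dict into NaN, so it is worth checking the result for missing values if the dictionary might be incomplete.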

[EDIT] If you have columns for which you don't have a mapping, you can do one of the following:

data.update(data[list(mappings)].apply(lambda col: col.map(mappings[col.name])))

Or, if you want the result in a new dataframe (e.g., to keep the dataframe with the original values):

encoded_data = data.copy()
encoded_data.update(data[list(mappings)].apply(lambda col: col.map(mappings[col.name])))
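An alternative sketch for the same situation, not part of the answer above: assign each mapped column directly instead of going through `update`. The `price` column and the sample rows are made up for illustration. One thing to check with `update` is the dtype of the result, which may remain `object` for columns that originally held strings; direct assignment gives the column the dtype of the mapped values.

```python
import pandas as pd

# Illustrative data: two columns with a mapping, one ("price") without.
data = pd.DataFrame({
    "color": ["yellow", "red", "red"],
    "fruit": ["banana", "apple", "banana"],
    "price": [1.5, 2.0, 0.5],
})
mappings = {"fruit": {"banana": -1, "apple": 1}, "color": {"yellow": -1, "red": 1}}

# Work on a copy so the original values stay available.
encoded_data = data.copy()

# Only the columns named in `mappings` are touched; "price" passes through unchanged.
for name, mapping in mappings.items():
    encoded_data[name] = data[name].map(mapping)

print(encoded_data)
```

Unlike `update`, which skips NaN values and so leaves unmapped categories as their original strings, this version leaves NaN in place for any category missing from its dictionary.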
License: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange