Question

I am working on a housing dataset. In a list of columns (Garage, Fireplace, etc), I have values called NA which just means that the particular house in question does not have that feature (Garage, Fireplace). It doesn't mean that the value is missing/unknown. However, Python interprets this as NaN, which is wrong. To come across this, I want to replace this value NA with XX to help Python distinguish it from NaN values. Because there is a whole list of them, I want use a for loop to accomplish this in a few lines of code:

na_data = ['Alley', 'BsmtQual', 'BsmtCond', 'BsmtExposure', 'BsmtFinType1', 'BsmtFinType2', 'FireplaceQu', 'GarageType',
           'GarageFinish', 'GarageQual', 'GarageCond', 'PoolQC', 'Fence', 'MiscFeature']

for i in range(len(na_data)):
    train[i] = train[i].fillna('XX')

I know this isn't the correct way of doing it as it is giving me a KeyError: 0. This is kinda like a pseudocode way of doing it to visualize what I'm trying to accomplish. What is the way to automate fillna('XX') on this list of columns?

No correct solution

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange
scroll top