Question

I am using pandas and Python to process multiple files in which columns holding the same data have different names.

import pandas as pd

dataset = pd.read_csv('Test.csv', index_col=0)

cols = dataset.columns

I have the different possible column titles in a list.

AddressCol = ['sAddress', 'address', 'Adrs', 'cAddress']

Is there a way to normalize all the possible column names to "Address" in pandas so that I can use the same script on different files?

Without pandas I would use something like a double for loop over the file's column names and the list of possible names, with an if statement to extract the whole matching column.


Solution

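You can use the DataFrame.rename method with a dict that maps every possible name to 'Address'. Keys that do not match an existing column are ignored by default: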

dataset.rename(columns={typo: 'Address' for typo in AddressCol}, inplace=True)
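To see the effect end to end, here is a minimal sketch that swaps the CSV file for a small in-memory DataFrame. The sample data and the names `address_map` and `variant` are made up for illustration; only `AddressCol` and the `rename` call come from the posts above. It uses assignment instead of `inplace=True`, which should give the same result.

```python
import pandas as pd

# Hypothetical sample data standing in for one of the input files
dataset = pd.DataFrame({
    "Name": ["Ann", "Bob"],
    "Adrs": ["1 Main St", "2 Oak Ave"],
})

AddressCol = ['sAddress', 'address', 'Adrs', 'cAddress']

# Map every known variant to the canonical column name
address_map = {variant: 'Address' for variant in AddressCol}

# Variants that are absent from this file are skipped,
# because rename defaults to errors='ignore'
dataset = dataset.rename(columns=address_map)

print(list(dataset.columns))
```

One thing to watch for: if a single file happens to contain two of the variants, both get renamed and the frame ends up with two columns called 'Address', so it may be worth checking for duplicates after the rename.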
Licensed under: CC-BY-SA with attribution