Question

Input:

import pandas as pd
df=pd.read_excel("OCRFinal.xlsx")
df['OCR_Text']=df['OCR_Text'].str.replace(r'\W+'," ")
print(df['OCR_Text'])

Output:

In the output, all the special characters are removed along with the spaces. But I don't want the space characters to be removed.


Solution

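import pandas as pd

df = pd.read_excel("OCRFinal.xlsx")

# Characters to trim from both ends of each string: carriage return, newline, tab.
whitespace = "\r\n\t"

# .str.strip skips missing values; calling x.strip() on a NaN cell would raise.
df['OCR_Text'] = df['OCR_Text'].str.strip(whitespace)
print(df['OCR_Text'])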
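Note that strip only trims the listed characters from the start and end of each string. If the goal is to drop special characters everywhere while keeping the spaces, one possible approach is a pattern that excludes whitespace from the match. The following is a minimal sketch, not a definitive fix: the sample DataFrame is made up to stand in for OCRFinal.xlsx, and the final whitespace-collapsing step is an optional assumption about what the cleaned text should look like.

```python
import pandas as pd

# Hypothetical sample data standing in for the OCR_Text column of OCRFinal.xlsx.
df = pd.DataFrame(
    {"OCR_Text": ["Invoice #123: total $45.00!", "  hello,\tworld\n", None]}
)

# [^\w\s]+ matches runs of characters that are neither word characters nor
# whitespace, so punctuation is removed while spaces, tabs and newlines stay.
# regex=True is passed explicitly because pandas 2.0+ treats the pattern as a
# literal string by default.
cleaned = df["OCR_Text"].str.replace(r"[^\w\s]+", "", regex=True)

# Optional (assumption): collapse runs of whitespace to one space and trim the ends.
cleaned = cleaned.str.replace(r"\s+", " ", regex=True).str.strip()

print(cleaned.tolist())
```

The underscore counts as a word character in \w, so it is kept; add it to the character class (for example [^\w\s]|_) if it should go too. Missing cells are passed through as missing values rather than raising.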

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange