Question

Is there a way to strip whitespace, dots, and commas from a text without NLTK, specifically by using regular expressions?

Solution

If I have understood your question correctly, you can try this code:

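import re

text = "Split.this,text in seven.separate,words"

# Character class: any single whitespace character, dot, or comma
myexp = re.compile(r'[\s.,]')

# split() cuts the text at every match and returns the pieces as a list
print(myexp.split(text))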

which gives you this output:

['Split', 'this', 'text', 'in', 'seven', 'separate', 'words']
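
Note that the pattern above matches one separator character at a time, so input with consecutive separators (for example a comma followed by a space) would produce empty strings in the result. A minimal sketch of one way to handle that, assuming you want runs of separators treated as a single break; the names `SEPARATORS`, `split_words`, and `strip_separators` are made up for illustration:

```python
import re

# One or more whitespace characters, dots, or commas in a row.
SEPARATORS = re.compile(r"[\s.,]+")


def split_words(text):
    """Split text on runs of separators, dropping empty strings."""
    return [word for word in SEPARATORS.split(text) if word]


def strip_separators(text):
    """Remove every separator character without splitting the text."""
    return SEPARATORS.sub("", text)


print(split_words("Hello,  world. This is,a test. "))
print(strip_separators("a, b. c"))
```

The filter in `split_words` is still needed because `re.split` returns an empty string when the text starts or ends with a separator. If you only want the text cleaned rather than split into words, `strip_separators` uses `sub` to delete the matched characters instead.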
License: CC-BY-SA with attribution
Not affiliated with StackOverflow