I have a file of nearly 1500 lines that contains symbols like ")(()(&&^%&^a%&#@%^%*&^" along with just two or three letters in the entire file.

How can I search for these letters in such a file and display the ones found on the output screen?

Solution

Probably the fastest way is to read the whole file and run a single regular expression over it:

import re
with open("giantfile.txt") as infile:
    print(re.findall("[A-Za-z]+", infile.read()))
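As a quick check, the same pattern can be tried on the sample string from the question without needing a file. This is only a minimal sketch; the variable names `sample` and `letters` are made up for illustration.

```python
import re

# Sample line from the question: mostly symbols, with a single letter in it.
sample = ")(()(&&^%&^a%&#@%^%*&^"

# "[A-Za-z]+" matches each run of one or more ASCII letters.
letters = re.findall("[A-Za-z]+", sample)
print(letters)  # ['a']
```

Note that the pattern only covers ASCII letters, so accented or non-Latin letters would not be reported.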

Other tips

Building on Tim's answer, you can process the file line by line to save some memory, since the whole file is never held in memory at once.

import re

alphas = []
with open("giantfile.txt") as infile:
    for row in infile:
        alphas.extend(re.findall("[A-Za-z]+", row))

print(alphas)

Given this input file:

aaa
bbb
c12d

The output would be

['aaa', 'bbb', 'c', 'd']
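If it also helps to know where the letters are, the line-by-line approach can be extended to report line numbers. The following is a rough sketch, not part of the original answers: `find_letters` is a hypothetical helper name, and `io.StringIO` stands in for the real file so the example runs on its own.

```python
import io
import re

def find_letters(lines):
    """Yield (line_number, match) for every run of ASCII letters."""
    for lineno, line in enumerate(lines, start=1):
        for match in re.findall("[A-Za-z]+", line):
            yield lineno, match

# io.StringIO stands in for open("giantfile.txt") so the sketch is self-contained.
sample_file = io.StringIO("aaa\nbbb\nc12d\n")
results = list(find_letters(sample_file))
for lineno, match in results:
    print(lineno, match)
```

With a real file, passing the object returned by `open("giantfile.txt")` to `find_letters` should work the same way, since file objects iterate line by line.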
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow