Question

I have a file of nearly 1500 lines that consists mostly of symbols like ")(()(&&^%&^a%&#@%^%*&^", along with just two or three letters in the entire file.

How can I search for these letters in the file and display the ones found on the output screen?

Solution

Probably the fastest way is to read the whole file and run a single regular expression over it:

import re
with open("giantfile.txt") as infile:
    print(re.findall("[A-Za-z]+", infile.read()))
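To try the pattern without the real file, here is a minimal sketch that runs it on the sample line from the question. The names `sample`, `LETTERS`, and `found` are made up for illustration, and the string is only a stand-in for the file contents.

```python
import re

# Hypothetical sample modeled on the line quoted in the question;
# it stands in for the contents of the real file.
sample = ")(()(&&^%&^a%&#@%^%*&^"

# Compile once; the raw string avoids accidental escape sequences.
LETTERS = re.compile(r"[A-Za-z]+")

# findall returns every run of consecutive ASCII letters, in order.
found = LETTERS.findall(sample)
print(found)  # ['a']
```

Note that `[A-Za-z]` matches ASCII letters only; accented or non-Latin letters would need a different pattern.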

OTHER TIPS

Building on Tim's answer, you can process the file one line at a time to save some memory, since only the current line is held in memory instead of the whole file.

import re

alphas = []
with open("giantfile.txt") as infile:
    for row in infile:
        alphas.extend(re.findall("[A-Za-z]+", row))

print(alphas)

Given this input file:

aaa
bbb
c12d

The output would be

['aaa', 'bbb', 'c', 'd']
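If it also helps to know where in the file each match sits, the same line-by-line loop can carry a line number along. This is only a sketch: `find_letters_with_lines` is a hypothetical helper, and `io.StringIO` stands in for the opened file so the example runs on its own.

```python
import io
import re

LETTERS = re.compile(r"[A-Za-z]+")

def find_letters_with_lines(lines):
    """Yield (line_number, letters) for each run of letters, numbering lines from 1."""
    for line_number, line in enumerate(lines, start=1):
        for match in LETTERS.findall(line):
            yield line_number, match

# Stand-in for open("giantfile.txt"); same content as the example input above.
sample_file = io.StringIO("aaa\nbbb\nc12d\n")

results = list(find_letters_with_lines(sample_file))
print(results)  # [(1, 'aaa'), (2, 'bbb'), (3, 'c'), (3, 'd')]
```

Because the function is a generator, it can be passed a real file object and consumed lazily, which keeps the memory benefit of reading one line at a time.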
Licensed under: CC-BY-SA with attribution