I'd avoid regex entirely. You can use str.translate
to remove the characters you don't want.
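from string import ascii_letters
# every ASCII letter except the nucleotide codes we want to keep
removechars = ''.join(set(ascii_letters) - set('ACTGNU'))
# Python 2: a table of None means "no mapping"; the second argument lists characters to delete
newFastA = self.fastAsequence.translate(None, removechars)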
demo:
dna = 'ACTAGAGAUACCACG this will be removed GNUGNUGNU'
dna.translate(None, removechars)
Out[6]: 'ACTAGAGAUACCACG     GNUGNUGNU'
If you want to remove whitespace too, you can add string.whitespace to removechars.
Side note: the above only works in Python 2. In Python 3, str.translate takes a single table argument, so there's an additional step of building that table with str.maketrans:
from string import ascii_letters, punctuation, whitespace
# this example also removes whitespace and punctuation
removechars = ''.join(set(ascii_letters + punctuation + whitespace) - set('ACTGNU'))
trans = str.maketrans('', '', removechars)
dna.translate(trans)
Out[11]: 'ACTAGAGAUACCACGGNUGNUGNU'
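
If you clean many sequences in Python 3, the same idea could be wrapped in a small helper so the translation table is built only once. This is just a sketch: clean_sequence, KEEP and _DELETE_TABLE are names I made up for illustration, not anything from your code.

```python
from string import ascii_letters, punctuation, whitespace

# Nucleotide codes to preserve, same set as above
KEEP = set('ACTGNU')

# Build the deletion table once. With two empty strings, the third
# argument to str.maketrans lists the characters to delete.
_DELETE_TABLE = str.maketrans(
    '', '', ''.join(set(ascii_letters + punctuation + whitespace) - KEEP)
)

def clean_sequence(seq):
    """Return seq with unwanted letters, punctuation and whitespace removed."""
    return seq.translate(_DELETE_TABLE)

dna = 'ACTAGAGAUACCACG this will be removed GNUGNUGNU'
print(clean_sequence(dna))  # ACTAGAGAUACCACGGNUGNUGNU
```

Note that only the characters listed in the table are removed, so digits and non-ASCII characters pass through unchanged unless you add them (for example string.digits) to the set.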