If your Seq object has an assigned alphabet, you can check whether that alphabet is a protein alphabet:
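from Bio.Seq import Seq
# Bio.Alphabet only exists in Biopython releases before 1.78, where it was removed.
from Bio.Alphabet import IUPAC, ProteinAlphabet
my_prot = Seq("TGEKPYVCQECGKAFNCSSYLSKHQR", alphabet=IUPAC.IUPACProtein())
# True when the assigned alphabet is a protein alphabet.
print(isinstance(my_prot.alphabet, ProteinAlphabet))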
However, if the alphabet isn't known, you'll have to use a heuristic to guess whether or not it's a protein sequence. This could be as simple as checking whether the sequence consists entirely of the letters A, T, C, and G (or U in place of T for RNA), or whether it uses other letter codes.
But this isn't perfect. For instance, the sequence "ATCG" could be alanine, threonine, cysteine, glycine (i.e. a protein), or it could be adenine, thymine, cytosine, guanine (DNA). Similarly, "ACG" could be a protein, RNA, or DNA. From the letters alone, it is impossible to be sure that such a sequence is DNA and not a protein. However, if you have a SeqRecord or other context for the Seq, you may be able to check whether it's a protein sequence.
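The letter-based heuristic described above could be sketched roughly as follows. This is only an illustration: guess_sequence_type and NUCLEOTIDE_LETTERS are hypothetical names, not part of Biopython, and the sketch assumes that only the unambiguous nucleotide letters A, C, G, T, and U count as nucleotide codes (IUPAC ambiguity codes such as N or R would be reported as protein-like). It also does not reject a sequence that mixes T and U.

```python
# Hypothetical helper, not part of Biopython: guess a sequence type from its letters.
# Assumption: only the unambiguous nucleotide letters count as nucleotide codes.
NUCLEOTIDE_LETTERS = set("ACGTU")


def guess_sequence_type(sequence):
    """Return "empty", "ambiguous", or "protein-like" for a sequence string."""
    letters = set(str(sequence).upper())
    if not letters:
        return "empty"
    if letters <= NUCLEOTIDE_LETTERS:
        # A, C, G, and T are also amino acid codes (and U can denote
        # selenocysteine), so the letters alone cannot rule out a protein.
        return "ambiguous"
    # At least one letter is not a plain nucleotide code.
    return "protein-like"


print(guess_sequence_type("ATCG"))
print(guess_sequence_type("TGEKPYVCQECGKAFNCSSYLSKHQR"))
```

Returning "ambiguous" rather than "DNA" is deliberate: it reflects the point that a sequence made only of these letters cannot be classified with certainty without further context.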