Question
I am trying to run a script (see below) that reads in a FASTA file and outputs a taxonomy file (printing only the sequence header without the '>' character), but I keep getting a syntax error that I have not been able to resolve. As a result, the script creates the cleanseqs.tax file, but the file is blank. Could anyone help?
Thank you!
>>> Fasta = open("testseqs.fasta", "r")
>>> Tax = open("cleanseqs.tax", "w")
>>> while 1:
...     SequenceHeader= Fasta.readline()
...     Sequence= Fasta.readline()
...     if SequenceHeader == '':
...         break
...     Tax.write(SequenceHeader.replace('>', ''))
... Fasta.close()
  File "<stdin>", line 7
    Fasta.close()
        ^
SyntaxError: invalid syntax
>>> Tax.close()
Solution 3
A file object is a context manager, so you can use the with statement to automatically close the files:
with open("testseqs.fasta", "r") as fasta, open("cleanseqs.tax", "w") as tax:
    while True:
        ...
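For illustration, here is one way the whole task could look with the with statement. This is a minimal sketch, not the asker's exact script: the function name write_headers is made up for this example, and it assumes that every header line starts with '>' and that any other line is sequence data to skip.

```python
def write_headers(fasta_path, tax_path):
    """Copy each FASTA header line to tax_path without the leading '>'.

    Hypothetical helper for illustration; returns the number of headers written.
    """
    count = 0
    # Both files are closed automatically when the with block ends,
    # even if an exception is raised inside it.
    with open(fasta_path, "r") as fasta, open(tax_path, "w") as tax:
        for line in fasta:
            if line.startswith(">"):
                # Drop only the leading '>' and keep the rest of the line,
                # including its trailing newline.
                tax.write(line[1:])
                count += 1
    return count
```

Calling write_headers("testseqs.fasta", "cleanseqs.tax") would then produce the taxonomy file. Iterating over the file line by line also copes with sequences that span several lines, which reading exactly two lines per record does not.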
OTHER TIPS
Add an extra blank line after the loop body, because ... Fasta.close() isn't necessarily the end of the while loop: the loop could still take another keyword, like else. An empty line tells the interpreter that the while loop has ended.
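To illustrate why the interpreter keeps waiting for more input, here is a small sketch of a while loop that does continue with an else clause. The function name find_first_header is invented for this example.

```python
def find_first_header(lines):
    """Return the first line starting with '>', or None if there is none."""
    index = 0
    while index < len(lines):
        if lines[index].startswith(">"):
            found = lines[index]
            break
        index += 1
    else:
        # The else clause belongs to the while loop. It runs only when the
        # loop condition becomes false without a break having been executed.
        found = None
    return found
```

Because a clause like this may legally follow the loop body, the interactive prompt cannot know the loop is finished until it sees an empty line.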
Or did you mean to indent Fasta.close()?
The interpreter thinks you are trying to put the Fasta.close() call inside the while loop, but the line is improperly indented. Just press Enter when you want to end the while block.
Also, it would be ideal here to use the with statement so you can get rid of the close() calls entirely.
I had the same problem: my program would write to a file, but the file turned out to be empty.
Correct your indentation and then close the file. In Python, the content will be present in the file after it is closed.
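As a rough sketch of the point about closing: text written with write() usually sits in an in-memory buffer first, and close() flushes it to disk. The function name write_then_read and the temporary file location are assumptions made only for this example.

```python
import os
import tempfile

def write_then_read(text):
    """Write text to a temporary file, close it, and read it back."""
    path = os.path.join(tempfile.mkdtemp(), "cleanseqs.tax")
    out = open(path, "w")
    out.write(text)  # the data may still be held in the write buffer here
    out.close()      # closing flushes the buffer, so the data reaches the file
    with open(path) as check:
        return check.read()
```

After close() has run, reading the file back returns exactly what was written.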
As you requested, here are a couple of examples with existing FASTA parsers:
Example file:
$ cat test.fasta
>header1 description1
SEQUENCE
>header2 description2
SEQUENCESEQUENCE
With BioPython.SeqIO:
from Bio import SeqIO
with open('test.fasta') as fasta, open('biopython.tax', 'w') as out:
    # SeqIO.parse accepts an open file handle; the with block closes it
    for prot in SeqIO.parse(fasta, 'fasta'):
        out.write(prot.description + '\n')
Output:
$ cat biopython.tax
header1 description1
header2 description2
With pyteomics.fasta (pyteomics is a library developed by me and my colleagues):
from pyteomics import fasta
with open('pyteomics.tax', 'w') as out, fasta.read('test.fasta') as reader:
    for prot in reader:
        out.write(prot.description + '\n')
Output:
$ cat pyteomics.tax
header1 description1
header2 description2