Question

I am trying to run a script (see below) that reads in a FASTA file and writes a taxonomy file containing only the sequence headers, without the '>' character. I keep getting a syntax error that I have not been able to resolve. As a result, the script creates the cleanseqs.tax file, but the file is blank. Could anyone help?

Thank you!

>>> Fasta = open("testseqs.fasta", "r")
>>> Tax = open("cleanseqs.tax", "w")
>>> while 1:
...     SequenceHeader= Fasta.readline()
...     Sequence= Fasta.readline()
...     if SequenceHeader == '':
...             break
...     Tax.write(SequenceHeader.replace('>', ''))
... Fasta.close()
  File "<stdin>", line 7
    Fasta.close()
        ^
SyntaxError: invalid syntax
>>> Tax.close()

Solution 3

A file object is a context manager, so you can use the with statement to automatically close the files:

with open("testseqs.fasta", "r") as fasta, open("cleanseqs.tax", "w") as tax:
    while True:
        ...
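
A fuller sketch of the same idea, in case it helps. It assumes every header line starts with '>', and the helper name write_headers is made up for illustration (it is not from any library). Iterating over the lines instead of reading them in pairs also copes with sequences wrapped over several lines.

```python
def write_headers(fasta_path, tax_path):
    """Copy FASTA header lines to tax_path without the leading '>'.

    Hypothetical helper for illustration only.
    Returns the number of headers written.
    """
    count = 0
    # Both files are closed when the with block exits, even on error.
    with open(fasta_path, "r") as fasta, open(tax_path, "w") as tax:
        for line in fasta:
            if line.startswith(">"):
                # Drop only the leading marker and keep the rest of the header.
                tax.write(line[1:].rstrip("\n") + "\n")
                count += 1
    return count
```

Calling write_headers("testseqs.fasta", "cleanseqs.tax") would then produce the file from the question, with no close() calls needed.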

OTHER TIPS

Enter a blank line after the loop body. At the ... prompt, Fasta.close() does not necessarily mark the end of the while loop, because the statement could still continue with another keyword such as else. A blank line tells the interpreter that the while loop is finished.

Or did you mean to indent Fasta.close()?

At the ... prompt the interpreter is still reading the while block, so it treats Fasta.close() as a continuation of that statement, and a dedented line is not valid there. Press Enter on an empty line to end the while block before typing the next statement.
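
The behaviour can be reproduced outside the prompt with the standard codeop module, which the code module's interactive console uses to decide whether a statement is complete. This is only a sketch of the rule, using a trivial stand-in loop rather than the loop from the question.

```python
import codeop

# The loop as typed at the prompt, before any blank line is entered.
loop_only = "while False:\n    pass"

# None means "incomplete": a console would keep showing the ... prompt.
still_open = codeop.compile_command(loop_only) is None

# A trailing empty line closes the block, so a code object is returned.
closed = codeop.compile_command(loop_only + "\n") is not None

# A dedented statement typed while the block is still open is rejected.
try:
    codeop.compile_command(loop_only + "\nprint('done')")
    rejected = False
except SyntaxError:
    rejected = True

print(still_open, closed, rejected)
```

In a script file the dedented line would be fine. The restriction only applies when statements are entered one at a time at the interactive prompt.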

It would also be better to use a with statement here, which removes the need for the close() calls entirely.

I had the same problem: my program wrote to a file, but the file turned out to be empty.

Correct the indentation and then close the file. In Python, written data may stay in a buffer until the file is flushed or closed, so the content appears in the file once you close it.
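
A small sketch of that buffering effect, assuming CPython's default buffering for a regular file. The scratch file name is arbitrary, and the exact sizes depend on the platform's line endings.

```python
import os
import tempfile

# Scratch file in a temporary directory; the name is arbitrary.
path = os.path.join(tempfile.mkdtemp(), "demo.tax")

out = open(path, "w")
out.write("header1 description1\n")
# A short write like this normally still sits in Python's buffer here.
size_before = os.path.getsize(path)

out.close()  # close() flushes the buffer to disk
size_after = os.path.getsize(path)

print(size_before, size_after)
```

With a with statement the close happens automatically at the end of the block, so this situation does not come up.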

As you requested, here are a couple of examples with existing FASTA parsers:

Example file:

$ cat test.fasta 
>header1 description1
SEQUENCE
>header2 description2
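SEQUENCESEQUENCE

  1. With Bio.SeqIO (Biopython):

    from Bio import SeqIO

    with open('biopython.tax', 'w') as out:
        # Given a file name, SeqIO.parse opens the file itself and closes
        # it once the iterator is exhausted, so no explicit close() is needed.
        for prot in SeqIO.parse('test.fasta', 'fasta'):
            out.write(prot.description + '\n')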
    

    Output:

    $ cat biopython.tax 
    header1 description1
    header2 description2
    
  2. With pyteomics.fasta (pyteomics is a library developed by me and my colleagues):

    from pyteomics import fasta
    with open('pyteomics.tax', 'w') as out, fasta.read('test.fasta') as reader:
        for prot in reader:
            out.write(prot.description + '\n')
    

    Output:

    $ cat pyteomics.tax 
    header1 description1
    header2 description2
    
Licensed under: CC-BY-SA with attribution