Question

I am having trouble calling sixpack, a command-line program from the EMBOSS suite, from Python.

I run Python on Windows 7, Python version 3.23, Biopython version 1.59, EMBOSS version 6.4.0.4. Sixpack translates a DNA sequence in all six reading frames and creates two output files: a sequence file identifying ORFs, and a file containing the protein sequences.

There are three required arguments, which work when I run sixpack directly from the command line: -sequence [input file], -outseq [output sequence file], and -outfile [protein sequence file]. I have been using the subprocess module in place of os.system, as I have read that it is more powerful and versatile.

The following is my Python code, which runs without error but does not produce the desired output files.

from Bio import SeqIO
import re
import os
import subprocess

infile = input('Full path to EXISTING .fasta file would you like to open: ')
outdir = input('NEW Directory to write outfiles to: ')
os.mkdir(outdir)
for record in SeqIO.parse(infile, "fasta"):

    print("Translating (6-Frame): " + record.id)

    ident=re.sub("\|", "-", record.id)

    print (infile)
    print ("Old record ID: " + record.id)
    print ("New record ID: " + ident)

    subprocess.call (['C:\memboss\sixpack.exe', '-sequence ' + infile, '-outseq ' + outdir + ident + '.sixpack', '-outfile ' + outdir + ident + '.format'])

    print ("Translation of: " + infile + "\nWritten to: " + outdir + ident)

Solution

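Found the answer: I was passing the arguments to subprocess incorrectly. Each option name and its value must be a separate element of the list. A string such as '-sequence ' + infile reaches the program as a single argument with a space in it, which sixpack evidently does not treat as an option followed by a value. This is the correct syntax: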

subprocess.call (['C:\memboss\sixpack.exe', '-sequence', infile, '-outseq', outdir + ident + '.sixpack', '-outfile', outdir + ident + '.format'])
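
To see why the split matters, here is a minimal sketch rather than a definitive implementation. The helper names build_sixpack_args and argv_seen_by_child are hypothetical, made up for this illustration. The second helper starts a child Python process that echoes the arguments it received, so the difference between one merged string and two list elements is visible without EMBOSS installed. The first helper also assumes the output files are meant to land inside outdir, so it joins the paths with os.path.join (plain outdir + ident has no separator between the directory and the file name), and it writes the executable path as a raw string so the backslashes are never read as escapes.

```python
import json
import os
import subprocess
import sys

# Path taken from the question; adjust to the local EMBOSS install.
SIXPACK_EXE = r'C:\memboss\sixpack.exe'


def build_sixpack_args(infile, outdir, ident, exe=SIXPACK_EXE):
    """Hypothetical helper: one list element per option name and per value."""
    # Assumption: output files belong inside outdir, hence os.path.join.
    base = os.path.join(outdir, ident)
    return [exe,
            '-sequence', infile,
            '-outseq', base + '.sixpack',
            '-outfile', base + '.format']


def argv_seen_by_child(args):
    """Hypothetical helper: return the arguments a child process receives."""
    # The child is a Python one-liner that prints its own argv as JSON.
    child = 'import json, sys; print(json.dumps(sys.argv[1:]))'
    out = subprocess.check_output([sys.executable, '-c', child] + list(args))
    return json.loads(out.decode())


# Flag and value glued into one string: the child gets ONE argument.
merged = argv_seen_by_child(['-sequence ' + 'in.fasta'])
# Flag and value as separate elements: the child gets TWO arguments.
split = argv_seen_by_child(['-sequence', 'in.fasta'])

print(merged)
print(split)
print(build_sixpack_args('in.fasta', 'out', 'gi-123'))
```

With the list built this way, subprocess.call(build_sixpack_args(infile, outdir, ident)) returns the program's exit code, so a nonzero value can be checked instead of assuming the files were written.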
Licensed under: CC-BY-SA with attribution