Pregunta

I have been trying to write a code using BioPerl that will query Genbank for a specific protein and then print the results to a fasta file. So far the code I have works and I can print the results to the screen, but not to a file. I have done lots of researching on the BioPerl website and other sources (CPAN, PerlMonks, etc.) but I have not found anything that can solve my problem. I understand how to read something from a file and then print the output to a new file (using SeqIO), but the problem I am having seems to be that what I want the program to read is not stored in a text or FASTA file, but is the result of a database query. Help? I am very much a beginner, new to Perl/BioPerl and programming in general.

Here is the code I have so far:

#!usr/bin/perl
use Bio::DB::GenBank;
use Bio::DB::Query::GenBank;
use Bio::Seq;

$query = "Homo sapiens[ORGN] AND TFII-I[TITL]";

$query_obj = Bio::DB::Query::GenBank->new(-db => 'protein', -query => $query);

$gb_obj = Bio::DB::GenBank->new;

$stream_obj = $gb_obj->get_Stream_by_query($query_obj);
while ($seq_obj = $stream_obj->next_seq) 
{print $seq_obj->desc, "\t", $seq_obj->seq, "\n";
}

So, what I want to do in the last line is instead of printing to the screen, print to a file in fasta format.

Thanks, ~Jay

¿Fue útil?

Solución

Your code was actually quite close, you are returning a Bio::Seq object in your loop, and you just need to create a Bio::SeqIO object that can handle those objects and write them to a file ("myseqs.fasta" is the file in the example).

#!usr/bin/env perl                                                                                                                                                               

use strict;
use warnings;
use Bio::DB::GenBank;
use Bio::DB::Query::GenBank;
use Bio::SeqIO;

my $query = "Homo sapiens[ORGN] AND TFII-I[TITL]";

my $query_obj = Bio::DB::Query::GenBank->new(-db => 'protein', -query => $query);

my $gb_obj = Bio::DB::GenBank->new(-format => 'fasta');

my $stream_obj = $gb_obj->get_Stream_by_query($query_obj);
my $seq_out = Bio::SeqIO->new(-file => ">myseqs.fasta", -format => 'fasta');

while (my $seq_obj = $stream_obj->next_seq) {
    $seq_out->write_seq($seq_obj);
}

Also, note that I added use strict; and use warnings; to the top of the script. That will help solve most "Why isn't this working?" type of questions by generating diagnostic messages, and it is a good idea to include those lines.

Otros consejos

Assuming you have the data to make a fasta seq (which it seems you do), can you use Bio::FASTASequence module the seq2file function? I've never used it nor am I bioinformatic expert, just saw the option there and thought it might be useful to you.

Licenciado bajo: CC-BY-SA con atribución
No afiliado a StackOverflow
scroll top