Question

I want to extract the top four hits from a large XML file containing the results of a tblastn search of multiple protein queries against my local nucleotide database. The problem is that, with my BLAST settings, some queries return fewer than four hits, so when I run this code:

from Bio.Blast import NCBIXML
with open('/home/edson/ungulate/tblastn_result_test_xml') as tblastn_file:
    tblastn_records = NCBIXML.parse(tblastn_file)
    for tblastn_record in tblastn_records:
        if tblastn_record.alignments:
            print tblastn_record.alignments[0].title
            print tblastn_record.alignments[0].hsps[0]
            print tblastn_record.alignments[1].title
            print tblastn_record.alignments[1].hsps[0]
            print tblastn_record.alignments[2].title
            print tblastn_record.alignments[2].hsps[0]
            print tblastn_record.alignments[3].title
            print tblastn_record.alignments[3].hsps[0]

It runs for a while, but then fails with:

Traceback (most recent call last):
  File "/home/edson/tblastn_parser_test.py", line 8, in <module>
    print tblastn_record.alignments[0].title
IndexError: list index out of range

So how do I modify this script to print the top four alignments, or fewer when a query has fewer than four hits? Any help will be appreciated.


Solution

How about something like this?

from Bio.Blast import NCBIXML

with open('/home/edson/ungulate/tblastn_result_test_xml') as tblastn_file:
    tblastn_records = NCBIXML.parse(tblastn_file)
    for tblastn_record in tblastn_records:
        # The slice yields at most four alignments and never raises IndexError
        for alignment in tblastn_record.alignments[:4]:
            print(alignment.title)
            print(alignment.hsps[0])

I'm not familiar with Biopython, but the docs [1] say that alignments is a list of Alignment objects. This example takes a slice of the first four elements in that list. If there are fewer than four, the slice simply returns whatever is there instead of raising an IndexError.

[1] - http://biopython.org/DIST/docs/api/Bio.Blast.Record.Blast-class.html
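The slicing behaviour the answer relies on can be checked without a BLAST file. The following is only a minimal sketch: the Alignment namedtuple and the top_hits helper are hypothetical stand-ins invented for illustration, not part of Biopython, and the only assumption carried over is that each alignment exposes a title and a list of hsps.

```python
from collections import namedtuple

# Hypothetical stand-in for Biopython's alignment objects, used only so the
# example runs without a BLAST XML file.
Alignment = namedtuple("Alignment", ["title", "hsps"])


def top_hits(alignments, limit=4):
    """Return (title, first HSP) pairs for at most `limit` alignments."""
    # Slicing past the end of a list is safe: it returns the items that exist.
    return [(a.title, a.hsps[0]) for a in alignments[:limit] if a.hsps]


two_hits = [Alignment("hit_A", ["hsp_A1"]),
            Alignment("hit_B", ["hsp_B1", "hsp_B2"])]
six_hits = [Alignment("hit_%d" % i, ["hsp_%d" % i]) for i in range(6)]

print(top_hits(two_hits))  # two pairs, no IndexError
print(top_hits(six_hits))  # only the first four pairs
print(top_hits([]))        # empty list for a query with no hits
```

With real data, the same slice would be applied to tblastn_record.alignments inside the loop over NCBIXML.parse, as in the answer's code.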

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow