Question

I am using Biopython to do something similar to "Sort rps-blast results by position of the hit", but I want to join (concatenate) the local hits so that I get one contiguous stretch each for the query and the subject.

My code:

# `records` is assumed to come from Bio.Blast.NCBIXML.parse() on the BLAST XML output.
for record in records:
    for alignment in record.alignments:
        # Sort the HSPs of each alignment by their start position on the query.
        hits = sorted(
            (hsp.query_start, hsp.query_end, hsp.sbjct_start, hsp.sbjct_end,
             alignment.title, hsp.query, hsp.sbjct)
            for hsp in alignment.hsps)
        for q_start, q_end, sb_start, sb_end, title, query, sbjct in hits:
            print(title)
            print('The query starts from position: ' + str(q_start))
            print('The query ends at position: ' + str(q_end))
            print('The hit starts at position: ' + str(sb_start))
            print('The hit ends at position: ' + str(sb_end))
            print('The query is: ' + query)
            print('The hit is: ' + sbjct)

This gives sorted results like this:

Species_1
The query starts from position: 1
The query ends at position: 184
The hit starts at position: 1
The hit ends at position: 552
The query is: #######query_seq
The hit is: ######### hit_seq
Species_1
The query starts from position: 390
The query ends at position: 510
The hit starts at position: 549
The hit ends at position: 911
The query is: #######query_seq
The hit is: ######### hit_seq
Species_1
The query starts from position: 492
The query ends at position: 787
The hit starts at position: 889
The hit ends at position: 1776
The query is: #######query_seq
The hit is: ######### hit_seq

This is all fine, but I want to take the next logical step: concatenating all three sub-queries and sub-hits shown here (the number of hits varies) to get the complete query and subject sequences. What could be the way forward?


Solution

Here is a sample solution; I hope it helps.

Create an empty string before the loop over the sorted hits and append each query segment to it. Here is an edit of your code:

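for record in records:
    for alignment in record.alignments:
        hits = sorted(
            (hsp.query_start, hsp.query_end, hsp.sbjct_start, hsp.sbjct_end,
             alignment.title, hsp.query, hsp.sbjct)
            for hsp in alignment.hsps)
        # Start fresh for each alignment so hits on different subjects are not mixed.
        expected_query_seq = ""
        expected_sbjct_seq = ""
        for q_start, q_end, sb_start, sb_end, title, query, sbjct in hits:
            print(title)
            print('The query starts from position: ' + str(q_start))
            print('The query ends at position: ' + str(q_end))
            print('The hit starts at position: ' + str(sb_start))
            print('The hit ends at position: ' + str(sb_end))
            print('The query is: ' + query)
            print('The hit is: ' + sbjct)

            # hsp.query and hsp.sbjct already hold only the aligned segment
            # (gaps included), so append them as they are. Slicing them with
            # q_start:q_end would index past the end of the segment.
            # Note: overlapping HSPs will contribute the shared region twice.
            expected_query_seq += query
            expected_sbjct_seq += sbjct
        print(expected_query_seq)
        print(expected_sbjct_seq)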
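
In the sample output above, the second and third hits overlap on the query (390-510 and 492-787), so plain concatenation would repeat the shared region. The following is only a sketch of one way to handle that, not a definitive method: it walks each alignment column by column and keeps only the columns whose query position has not been emitted yet. The function name `stitch_hsps` and its tuple layout are made up for illustration. It assumes 1-based inclusive query coordinates on the plus strand (as reported by `hsp.query_start` and `hsp.query_end`), that `-` marks gaps, and that uncovered stretches between hits are simply skipped rather than padded.

```python
def stitch_hsps(hsps, gap_char="-"):
    """Join aligned HSP segments in query order, dropping overlapping columns.

    hsps: iterable of (q_start, q_end, query_aln, sbjct_aln) tuples, where
    q_start/q_end are 1-based inclusive query coordinates and the two strings
    are the gapped, equal-length aligned segments (hypothetical layout).
    """
    merged_query = []
    merged_sbjct = []
    covered_to = 0  # highest query position already written out

    for q_start, q_end, query_aln, sbjct_aln in sorted(hsps):
        pos = q_start - 1  # query position of the last residue seen so far
        for q_char, s_char in zip(query_aln, sbjct_aln):
            if q_char != gap_char:
                pos += 1  # only real residues advance the query coordinate
            if pos > covered_to:
                # This column lies beyond what earlier HSPs already covered.
                merged_query.append(q_char)
                merged_sbjct.append(s_char)
        covered_to = max(covered_to, pos)

    return "".join(merged_query), "".join(merged_sbjct)


# Toy example: the second segment overlaps the first on query positions 4-5.
query_seq, sbjct_seq = stitch_hsps([
    (1, 5, "ABCDE", "abcde"),
    (4, 8, "DE-FGH", "dexfgh"),
])
print(query_seq)
print(sbjct_seq)
```

With real BLAST results, the tuples could be built inside the loop over `alignment.hsps` from `hsp.query_start`, `hsp.query_end`, `hsp.query` and `hsp.sbjct`. Keeping the earlier hit in an overlap is an arbitrary choice here; preferring the higher-scoring HSP would be an equally reasonable design.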
License: CC-BY-SA with attribution
Not affiliated with StackOverflow