Getting a gene sequence from Entrez using Biopython

Question 1

First, build a search string from the gene name, e.g. ATK1:

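from Bio import Entrez
Entrez.email = "you@example.com"  # placeholder; NCBI asks for a contact address

item = 'ATK1'
animal = 'Homo sapiens'
search_string = item+"[Gene] AND "+animal+"[Organism] AND mRNA[Filter] AND RefSeq[Filter]"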

Now we have a search string to search for IDs:

handle = Entrez.esearch(db="nucleotide", term=search_string)
record = Entrez.read(handle)
handle.close()
ids = record['IdList']

This returns the IDs as a list; if no ID was found, the list is empty ([]). Now let's assume it returned one item in the list.

seq_id = ids[0]  # you must add a check to deal with the 0 and >1 cases
handle = Entrez.efetch(db="nucleotide", id=seq_id, rettype="fasta", retmode="text")
record = handle.read()
handle.close()

This will give you a FASTA string, which you can save to a file:

# the with block closes the file, so the data is flushed to disk
with open('myfasta.fasta', 'w') as out_handle:
    out_handle.write(record.rstrip('\n'))
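
The ids[0] step above assumes exactly one hit. Here is a minimal sketch of one way to handle the other cases. pick_single_id is a made-up helper name, not part of Biopython, and raising an error is only one option (you could also take the first hit or prompt the user).

```python
def pick_single_id(ids):
    """Return the only ID in `ids`; raise ValueError for zero or several hits."""
    if not ids:
        raise ValueError("search returned no IDs; check the gene and organism names")
    if len(ids) > 1:
        raise ValueError(
            "search returned %d IDs, expected 1: %s" % (len(ids), ", ".join(ids))
        )
    return ids[0]


# Hypothetical usage with a list shaped like record['IdList']:
print(pick_single_id(["126789333"]))
```

In the snippet above you would then write seq_id = pick_single_id(ids) instead of indexing the list directly.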

Question 2

Looking at section 8.3 of the tutorial, there appears to be a function that will allow you to search for terms and get the corresponding IDs (I know nothing about this library and even less about biology, so this will potentially be completely wrong :) ).

>>> handle = Entrez.esearch(db="nucleotide",term="Cypripedioideae[Orgn] AND matK[Gene]")
>>> record = Entrez.read(handle)
>>> record["Count"]
'25'
>>> record["IdList"]
['126789333', '37222967', '37222966', '37222965', ..., '61585492']

From what I can tell, id refers to an actual ID number as returned by the esearch function (under the IdList key of the response). However, if you use the term keyword, you can instead run a search and get the IDs of the matched items. Totally untested, but assuming the search supports boolean operators (it looks like AND works), you could try using a query like:

>>> handle = Entrez.esearch(db="nucleotide",term="ITGB1[Gene] OR RELA[Gene] OR NFKBIA[Gene]")
>>> record = Entrez.read(handle)
>>> record["IdList"]
# Hopefully your ids here...

To generate the term to insert, you could do something like this:

In [1]: l = ['ITGB1', 'RELA', 'NFKBIA']

In [2]: ' OR '.join('%s[Gene]' % i for i in l)
Out[2]: 'ITGB1[Gene] OR RELA[Gene] OR NFKBIA[Gene]'
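
If you also want to restrict the gene list to one organism, it is probably safest to wrap the OR group in parentheses, so the grouping is explicit and does not depend on how Entrez orders the operators. A small sketch; build_term and its parameters are names I made up for illustration, and I have not run the resulting query against NCBI.

```python
def build_term(genes, organism=None):
    """Build a search term matching any of `genes`, optionally limited to one organism."""
    gene_part = " OR ".join("%s[Gene]" % g for g in genes)
    if organism is None:
        return gene_part
    # Parenthesize the OR group so the organism filter applies to every gene.
    return "(%s) AND %s[Organism]" % (gene_part, organism)


print(build_term(["ITGB1", "RELA", "NFKBIA"], "Homo sapiens"))
# (ITGB1[Gene] OR RELA[Gene] OR NFKBIA[Gene]) AND Homo sapiens[Organism]
```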

The record["IdList"] could then be converted into a comma-delimited string and passed to the id argument in your original query by using something like:

In [3]: r = ['1234', '5678', '91011']

In [4]: ids = ','.join(r)

In [5]: ids
Out[5]: '1234,5678,91011'
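
Putting the joined string to use, here is a rough sketch of fetching all the matched records in one efetch call and parsing them with SeqIO. fetch_fasta_records and the email parameter are hypothetical names of mine. The database, rettype and retmode values are copied from the question's code. I have not run this against NCBI, so treat it as a starting point.

```python
def fetch_fasta_records(id_list, email):
    """Fetch FASTA records for every ID in `id_list` with a single efetch call."""
    if not id_list:
        return []  # nothing to fetch, so skip the network request entirely

    # Imported lazily so Biopython is only needed when a request is actually made.
    from Bio import Entrez, SeqIO

    Entrez.email = email  # NCBI asks for a contact address
    handle = Entrez.efetch(
        db="nucleotide",
        id=",".join(id_list),  # comma-delimited ID string, as shown above
        rettype="fasta",
        retmode="text",
    )
    try:
        # Each FASTA entry in the response becomes one SeqRecord.
        return list(SeqIO.parse(handle, "fasta"))
    finally:
        handle.close()
```

Each returned record has .id and .seq attributes, so you could loop over the list and write the sequences out individually or with SeqIO.write.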