Question

We have a local EnsEMBL MySQL database containing annotated mosquito genomes.

The PyCogent cookbook states that you can access and query data from a local MySQL EnsEMBL database via the HostAccount class in cogent.db.ensembl. The source code for PyCogent's Ensembl API is also available.

However, I cannot access the data, because the functions assume that I already know the exact species names (as strings) of the genomes I am trying to query. After hours of searching online, I would greatly appreciate it if somebody could tell me how to list the species names that PyCogent would understand, so that I can finally query the local database for the genome data.

This code shows my problem (see the comments):

from cogent.db.ensembl import HostAccount, Genome

Release = 73

acc = HostAccount('localhost', 'username1', 'password1')  # login details for the MySQL server

# Where can I find the list of available species names to replace the '?????'
genome = Genome(Species='?????', Release=Release, account=acc)

Solution

After a useful tip from @dpryan79 (from Biostars), I looked at PyCogent's source code. It turns out that the only way I could view the available species names was to log into the MySQL server and list the databases. The database names follow a naming convention in which the first two underscore-delimited fields are the genus and species names, respectively.

So I logged into the MySQL server from a terminal:

mysql -hlocalhost -uuser1 -ppass1

Then I ran:

SHOW DATABASES;

The available species can be read off the name of each database, specifically the first two underscore-delimited fields. For example, the following databases are listed:

anopheles_gambiae_core_1312_73_1
anopheles_arabeinsis_core_1312_73_1
anopheles_funestus_core_1312_73_1
anopheles_gambiaeM_core_1312_73_1

These names suggest that I have the following species available: anopheles gambiae, anopheles arabeinsis, anopheles funestus and anopheles gambiae type M.
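The same lookup can be scripted rather than read by eye. The following is only a minimal sketch under some assumptions: list_databases and species_from_db_names are hypothetical helper names (not part of PyCogent), the cursor is assumed to come from any DB-API 2.0 MySQL driver you already have installed, and only databases whose third field is "core" are treated as species databases. Whether PyCogent accepts the resulting strings unchanged is not confirmed here.

```python
def list_databases(cursor):
    """Run SHOW DATABASES on a DB-API 2.0 cursor and return the names.

    Hypothetical helper: the cursor is assumed to come from whichever
    MySQL driver is installed, connected with the same login details
    that were passed to HostAccount.
    """
    cursor.execute("SHOW DATABASES")
    return [row[0] for row in cursor.fetchall()]


def species_from_db_names(db_names, db_type="core"):
    """Extract unique 'genus species' strings from database names.

    Assumes the convention described above: the first two
    underscore-delimited fields are genus and species, and the third
    field is the database type (only db_type is kept, an assumption
    made here to skip system schemas and other databases).
    """
    species = set()
    for name in db_names:
        fields = name.split("_")
        if len(fields) < 3 or fields[2] != db_type:
            continue
        species.add(fields[0] + " " + fields[1])
    return sorted(species)


# Usage with the database names listed above (no server needed):
example_names = [
    "information_schema",
    "anopheles_gambiae_core_1312_73_1",
    "anopheles_arabeinsis_core_1312_73_1",
    "anopheles_funestus_core_1312_73_1",
    "anopheles_gambiaeM_core_1312_73_1",
]
for species_name in species_from_db_names(example_names):
    print(species_name)
```

Note that the fourth database yields the string "anopheles gambiaeM" exactly as it appears in the database name; reading it as "type M" is an interpretation, not something the parsing can infer.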

License: CC-BY-SA with attribution
Not affiliated with StackOverflow