Question

It is well known that science has given us a large amount of freely accessible data, such as http://www.1000genomes.org and http://www.ncbi.nlm.nih.gov/genbank. How can one play with this data and apply data science / machine learning to it? What would be some ideas?

My own ideas:

  • visualization of biological data
  • gene prediction with hidden Markov models

Any more?

Solution

  • Determine the function of genes and the elements that regulate genes throughout the genome.
  • Find variations in the DNA sequence among people and determine their significance. The most common type of genetic variation is known as a single nucleotide polymorphism or SNP (pronounced “snip”). These small differences may help predict a person’s risk of particular diseases and response to certain medications.
  • Discover the 3-dimensional structures of proteins and identify their functions.
  • Explore how DNA and proteins interact with one another and with the environment to create complex living systems.
  • Develop and apply genome-based strategies for the early detection, diagnosis, and treatment of disease.
  • Sequence the genomes of other organisms, such as the rat, cow, and chimpanzee, in order to compare similar genes between species.
  • Develop new technologies to study genes and DNA on a large scale and store genomic data efficiently.
  • Continue to explore the ethical, legal, and social issues raised by genomic research.
  • Source

Other tips

You could build models to classify genomes by population, run unsupervised learning (clustering) to see whether the populations are recovered without labels, or build models to infer missing genotypes.
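As a rough illustration of the clustering idea, here is a minimal sketch. It does not read real 1000 Genomes data: it simulates two hypothetical populations with different allele frequencies, projects the genotype matrix onto its top principal components, and runs a small hand-written 2-means on them. The function names, sample sizes, and frequency settings are all assumptions made for the example, and NumPy is assumed to be installed.

```python
import numpy as np

def simulate_genotypes(n_per_pop=50, n_snps=200, seed=0):
    """Simulate 0/1/2 genotype counts for two hypothetical populations."""
    rng = np.random.default_rng(seed)
    freq_a = rng.uniform(0.1, 0.9, n_snps)
    # Population B drifts away from A's allele frequencies (assumed model).
    freq_b = np.clip(freq_a + rng.normal(0.0, 0.2, n_snps), 0.05, 0.95)
    pop_a = rng.binomial(2, freq_a, size=(n_per_pop, n_snps))
    pop_b = rng.binomial(2, freq_b, size=(n_per_pop, n_snps))
    genotypes = np.vstack([pop_a, pop_b]).astype(float)
    labels = np.repeat([0, 1], n_per_pop)
    return genotypes, labels

def top_components(genotypes, n_components=2):
    """Project samples onto the leading principal components via SVD."""
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def two_means(points, n_iter=20):
    """Tiny 2-means, initialised at the extreme points along the first axis."""
    centers = points[[points[:, 0].argmin(), points[:, 0].argmax()]].copy()
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        for k in range(2):
            if np.any(assign == k):  # keep the old center if a cluster empties
                centers[k] = points[assign == k].mean(axis=0)
    return assign

genotypes, true_pop = simulate_genotypes()
clusters = two_means(top_components(genotypes))
# Cluster ids are arbitrary, so score agreement up to a label swap.
agreement = max(np.mean(clusters == true_pop), np.mean(clusters != true_pop))
print(f"cluster/population agreement: {agreement:.2f}")
```

With real data you would replace `simulate_genotypes` with a genotype matrix parsed from VCF files, and probably swap the hand-written 2-means for a library clustering routine.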

For scalable DNA analysis, you could look at ADAM, which is built on Apache Spark.

Licensed under: CC-BY-SA with attribution