Question

I need your help. I have more than 40,000 proteins in a FASTA file.

First I want to write a function:

  • that is able to calculate the masses of the b- and y-ions
  • that creates a peptide database from the target proteins (mat-file)
  • that creates a peptide database coming from the decoy proteins (mat-file)

Then, I want to:

  • load the observed data
  • filter the peptide databases for candidate peptides given a certain ppm accuracy
  • write a function that scores the candidate peptides against the observed data
  • come up with a thresholding scheme to distinguish bona fide peptide-spectrum matches from the bogus ones

Solution

To get started: FASTA is a plain-text file format. To write text files, check the MATLAB documentation for fopen, fprintf and fclose. To load the text back from the files you've written, you can use fopen, fscanf (or fgetl, to read one line at a time) and fclose. MATLAB also has fastainfo, fastaread and fastawrite, which are part of the Bioinformatics Toolbox. You should check the documentation for these commands and for the other FASTA-related and protein-analysis commands that could be useful to you (I haven't done protein analysis, so I can't say which ones you'll need).
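For what it's worth, here is a minimal sketch of the plain file I/O route, assuming a simple FASTA layout where each record starts with a `>` header line and the sequence may wrap over several lines. The file names and the variables `headers` and `seqs` are made up for illustration; the example writes its own tiny file first so it runs on its own.

```matlab
% Write a tiny two-record FASTA file so the example is self-contained.
% The file name is hypothetical; point it at your real file instead.
fname = fullfile(tempdir, 'example_proteins.fasta');
fid = fopen(fname, 'w');
fprintf(fid, '>prot1 first example\nMKWVTF\nISLLFL\n');
fprintf(fid, '>prot2 second example\nPEPTIDEK\n');
fclose(fid);

% Read it back line by line: a '>' line starts a new record,
% any other non-empty line is appended to the current sequence.
headers = {};
seqs = {};
fid = fopen(fname, 'r');
tline = fgetl(fid);
while ischar(tline)                 % fgetl returns -1 at end of file
    tline = strtrim(tline);
    if ~isempty(tline)
        if tline(1) == '>'
            headers{end+1} = tline(2:end);
            seqs{end+1} = '';
        elseif ~isempty(seqs)
            seqs{end} = [seqs{end} tline];
        end
    end
    tline = fgetl(fid);
end
fclose(fid);

% Store the parsed records in a mat-file for later steps.
save(fullfile(tempdir, 'example_proteins.mat'), 'headers', 'seqs');
```

If the Bioinformatics Toolbox is available, `fastaread(fname)` should give you the same information as a struct array with `Header` and `Sequence` fields, without the manual loop.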

Beyond that, you are asking so many things in one question that it isn't possible to answer them all; IMHO the question amounts to "How do I write my entire program?". But I think you can get started with the commands I have listed. Once you have some code written and a well-defined problem that you've attempted to solve yourself, post a new question about it, with the relevant parts of your code.

Other tips

MATLAB's Bioinformatics Toolbox contains building block routines that you can put together to achieve this. If you have a specific problem when putting them together, post the specific question.
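As an example of one building block that needs no toolbox at all, here is a rough sketch of the b-/y-ion mass step and the ppm filter from the question. It assumes singly charged fragments, monoisotopic masses, the 20 standard unmodified residues, and a ppm error taken relative to the theoretical mass. All variable names, the observed mass and the candidate list are invented for illustration, so treat it as a starting point rather than a finished implementation.

```matlab
% Monoisotopic residue masses (Da) for the 20 standard amino acids.
aa = 'GASPVTCLINDQKEMHFRYW';
m  = [ 57.02146  71.03711  87.03203  97.05276  99.06841 ...
      101.04768 103.00919 113.08406 113.08406 114.04293 ...
      115.02694 128.05858 128.09496 129.04259 131.04049 ...
      137.05891 147.06841 156.10111 163.06333 186.07931];
massTable = zeros(1, 256);
massTable(double(aa)) = m;          % look-up table indexed by character code

proton = 1.007276;                  % mass of a proton (Da)
water  = 18.010565;                 % monoisotopic mass of H2O (Da)

peptide = 'PEPTIDE';                % example sequence
res = massTable(double(peptide));   % residue mass at each position
n = numel(res);

% b-ions are N-terminal prefixes, y-ions are C-terminal suffixes.
prefix = cumsum(res);
suffix = cumsum(fliplr(res));
bIons = prefix(1:n-1) + proton;           % b1 .. b(n-1), charge 1+
yIons = suffix(1:n-1) + water + proton;   % y1 .. y(n-1), charge 1+

% Neutral monoisotopic mass of the intact peptide.
precursorMass = sum(res) + water;

% ppm filter: keep candidates within tolPpm of the observed mass.
observedMass    = 799.3610;               % made-up observed neutral mass
tolPpm          = 10;                     % made-up tolerance
candidateMasses = [precursorMass, 800.5, 799.40];   % made-up candidates
ppmError    = abs(candidateMasses - observedMass) ./ candidateMasses * 1e6;
isCandidate = ppmError <= tolPpm;
```

Scoring could then start from something as simple as counting how many entries of `bIons` and `yIons` fall within a tolerance of an observed fragment peak, and the same code run on the decoy database gives you a score distribution to set a threshold against.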

License: CC-BY-SA with attribution
Not affiliated with StackOverflow