Pull Alignment Character Position

https://stackoverflow.com/questions/10872710

12-06-2021

Question

I use pairwiseAlignment to get the following:

> alignment <- pairwiseAlignment(pattern = canonical.protein, subject = protein.extracted)
> alignment
Global PairwiseAlignedFixedSubject (1 of 1)
pattern: [448]          DDWEIPDGQITVGQRIGSGSFGTVYKGKWHGDVAVKMLNVTAPTPQQLQAFKNEVGV...FMVGRGYLSPDLSKVRSNCPKAMKRLMAE  CLKKKRDERPLFPQILASIELLARSLPK 
subject:   [1]     DDWEIPDGQITVGQRIGSGSFGTVYKGKWHGDVAVKMLNVTAPTPQQLQAFKNEVGV...FMVGRGYLSPDLSKVRSNCPKAMKRLMAECLKKKRDERPLFPQILASIELLARSLPK 
score: -912.3752

I can then use:

toString(pattern(alignment))
toString(subject(alignment))

to get the full aligned sequence of both the pattern and the subject as a string. However, how do I get the numbers 448 and 1 out of the object as integers? I need to use these numbers, but there doesn't seem to be a way to get at them.

Solution

I believe these are the start positions of the alignment within each sequence, so

start(pattern(alignment))

Your question would be clearer with a fully reproducible example, e.g.,

library(Biostrings)
example(pairwiseAlignment)
aln <- pairwiseAlignment(AAString("PAWHEAE"), AAString("HEAGAWGHEE"),
    substitutionMatrix = "BLOSUM50", gapOpening = 0, gapExtension = -8)

Then

> aln
Global PairwiseAlignedFixedSubject (1 of 1)
pattern: [1] PA--W-HEAE
subject: [2] EAGAWGHE-E
score: 1
> start(subject(aln))
[1] 2
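
As a rough sketch of how both offsets could be pulled out as plain integers, the same toy alignment can be queried for the pattern as well as the subject. This assumes a Biostrings version that still provides pairwiseAlignment (newer Bioconductor releases have moved it to the pwalign package); the variable names are only illustrative, and end() is assumed to behave like start() on these objects.

```r
library(Biostrings)

# Same toy alignment as in the example above.
aln <- pairwiseAlignment(AAString("PAWHEAE"), AAString("HEAGAWGHEE"),
    substitutionMatrix = "BLOSUM50", gapOpening = 0, gapExtension = -8)

# Start of the aligned region within each original sequence.
pattern_start <- start(pattern(aln))
subject_start <- start(subject(aln))

# Matching end positions (assumed to work the same way as start()).
pattern_end <- end(pattern(aln))
subject_end <- end(subject(aln))

# The values are ordinary integers and can be used in arithmetic or indexing.
is.integer(pattern_start)
is.integer(subject_start)
```

Applied to the alignment in the question, start(pattern(alignment)) and start(subject(alignment)) should correspond to the bracketed numbers in the printed output.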

Also, the Bioconductor mailing list is more appropriate for these questions; no subscription required.

Other tips

Since you can convert the alignment to a string, you can use R's string functions. For example, substr(toString(pattern(alignment)), 448, 448) returns the 448th character of that string. I'm not familiar with that library, so there may be a built-in way that I don't know of. See http://www.statmethods.net/management/functions.html for string functions in R.
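
A minimal base-R sketch of the string approach, using the aligned pattern string from the toy example in the accepted answer. One caveat: indices into the string returned by toString() count alignment columns, gaps included, so they are not positions in the original sequence. The start offset below is a hard-coded stand-in for what start(pattern(aln)) would return, and the variable names are hypothetical.

```r
# Aligned pattern string as printed in the toy example.
aligned_pattern <- "PA--W-HEAE"
# Stand-in for the value of start(pattern(aln)).
pattern_start <- 1L

# Character in alignment column 5.
column <- 5L
residue <- substr(aligned_pattern, column, column)

# Count the non-gap characters up to that column to recover the
# position of the residue in the original, ungapped sequence.
prefix <- substr(aligned_pattern, 1, column)
ungapped_len <- nchar(gsub("-", "", prefix, fixed = TRUE))
original_pos <- pattern_start + ungapped_len - 1L

residue
original_pos
```

If the chosen column is itself a gap, this mapping points at the last residue before the gap, so a real implementation would probably want to check for "-" first.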

License: CC-BY-SA with attribution

Not affiliated with StackOverflow