Question

I have the following function (from package seqinr):

translate(seq, frame = 0, sens = "F", numcode = 1, NAstring = "X", ambiguous = FALSE)

Shortly, it translates DNA sequences into protein sequences. I have problems with giving the seq argument. The documentation says:

seq = the sequence to translate as a vector of single characters in lower case letters

I store the DNA sequence in a data.frame (named here seq):

seq <- data.frame(geneSeq="ATGTGTTGGGCAGCCGCAATACCTATCGCTATATCTGGCGCTCAGGCTATCAGTGGTCAGAACACTCAAGCCAAAATGATTGCCGTTCAGACCGCTGCTGGTCGTCGTCAAGCTATGGAAATCATGAGGCAGACGAACATCCAGAATGCTGACCTATCGTTGCAAGCTCGAAGTAACCTTGAGAAAGCGTCCGCCGAGTTGACCTCACAGAACATGCAKAAGGTCCAAGCTATTGGGTCTATCCGAGCGGCTATCGGAGAAAGTATGCTTGAAGGTTCCTCAATGGACCGTATTAAGCGAGTCACAGAAGGACAGTTCATTCGGGAAGCCAATATGGTAACTGAGAACTATCGCCGTGACTACCAAGCAATCTTCGTACAGCAACTTGGTGGTACTCAAAGTGCTGCAAGTCAGATTGACGAAATCTATAAGAGCGAACAGAAACAGAAGAGTAAGCTACAGATGGTTCTGGACCCACTGGCTATCATGGGGTCTTCCGCTGCGAGTGCTTACGCATCCGATGCGTTCGACTCTAAGTTCACAACTAAGGCACCTATTGTTGCCGCTAAAGGAACCAAGACGGGGAGGTAA", stringsAsFactors=FALSE)

Every time I try to use the translate function, it returns the error:

Error in s2n(seq, levels = s2c("tcag")) : 
  sequence is not a vector of chars

I have tried the following, all give the above error:

trans<- seqinr::translate(tolower(seq[1,1]))
trans<- seqinr::translate(stringr::str_split(tolower(seq[1,1]), pattern=""))
trans<- seqinr::translate(as.character(stringr::str_split(tolower(seq[1,1]), pattern="")))

How can I transform my DNA sequence in a vector of single characters?

Was it helpful?

Solution

You could use strsplit:

strsplit("ABCD", "")
# [[1]]
# [1] "A" "B" "C" "D"

## your example:
seqinr::translate(strsplit(seq[1,1], "")[[1]])

OTHER TIPS

The first example in ?translate pretty much gives you your answer.

You don't need a data frame and can use s2c whose sole purpose in life is for "conversion of a string into a vector of chars":

geneSeq="ATGTGTTGGGCAGCCGCAATACCTATCGCTATATCTGGCGCTCAGGCTATCAGTGGTCAGAACACTCAAGCCAAAATGATTGCCGTTCAGACCGCTGCTGGTCGTCGTCAAGCTATGGAAATCATGAGGCAGACGAACATCCAGAATGCTGACCTATCGTTGCAAGCTCGAAGTAACCTTGAGAAAGCGTCCGCCGAGTTGACCTCACAGAACATGCAKAAGGTCCAAGCTATTGGGTCTATCCGAGCGGCTATCGGAGAAAGTATGCTTGAAGGTTCCTCAATGGACCGTATTAAGCGAGTCACAGAAGGACAGTTCATTCGGGAAGCCAATATGGTAACTGAGAACTATCGCCGTGACTACCAAGCAATCTTCGTACAGCAACTTGGTGGTACTCAAAGTGCTGCAAGTCAGATTGACGAAATCTATAAGAGCGAACAGAAACAGAAGAGTAAGCTACAGATGGTTCTGGACCCACTGGCTATCATGGGGTCTTCCGCTGCGAGTGCTTACGCATCCGATGCGTTCGACTCTAAGTTCACAACTAAGGCACCTATTGTTGCCGCTAAAGGAACCAAGACGGGGAGGTAA"

print(translate(s2c(geneSeq), frame = 0, sens = "F", numcode = 1, NAstring = "X", ambiguous = FALSE)

##   [1] "M" "C" "W" "A" "A" "A" "I" "P" "I" "A" "I" "S" "G" "A" "Q" "A" "I" "S" "G" "Q" "N" "T" "Q" "A" "K"
##  [26] "M" "I" "A" "V" "Q" "T" "A" "A" "G" "R" "R" "Q" "A" "M" "E" "I" "M" "R" "Q" "T" "N" "I" "Q" "N" "A"
##  [51] "D" "L" "S" "L" "Q" "A" "R" "S" "N" "L" "E" "K" "A" "S" "A" "E" "L" "T" "S" "Q" "N" "M" "X" "K" "V"
##  [76] "Q" "A" "I" "G" "S" "I" "R" "A" "A" "I" "G" "E" "S" "M" "L" "E" "G" "S" "S" "M" "D" "R" "I" "K" "R"
## [101] "V" "T" "E" "G" "Q" "F" "I" "R" "E" "A" "N" "M" "V" "T" "E" "N" "Y" "R" "R" "D" "Y" "Q" "A" "I" "F"
## [126] "V" "Q" "Q" "L" "G" "G" "T" "Q" "S" "A" "A" "S" "Q" "I" "D" "E" "I" "Y" "K" "S" "E" "Q" "K" "Q" "K"
## [151] "S" "K" "L" "Q" "M" "V" "L" "D" "P" "L" "A" "I" "M" "G" "S" "S" "A" "A" "S" "A" "Y" "A" "S" "D" "A"
## [176] "F" "D" "S" "K" "F" "T" "T" "K" "A" "P" "I" "V" "A" "A" "K" "G" "T" "K" "T" "G" "R" "*"
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top