R, readLines, strsplit and grep

https://stackoverflow.com/questions/21271293

01-10-2022

Question

I am trying to read a text file one line at a time, split each line into "words", and then run a regex on each word, for example finding all words that start with "w". After a code snippet like the following, I get:

while (length(oneLine <- readLines(infile, n = 1, warn = FALSE)) > 0) {
    myVector <- (strsplit(oneLine, " ", fixed = FALSE, perl = TRUE))
    res <- grep("^w", myVector, perl = TRUE, value = TRUE)
    ...

> myVector
[[1]]
[1] "u"            "rtu"          "jgiyu"        "t6riuri-4e5-" "ee4"          "59"          
[7] "43"

My question is, what is the correct syntax to access "u", "rtu", ... ?

> myVector[1]
[[1]]
[1] "u"            "rtu"          "jgiyu"        "t6riuri-4e5-" "ee4"          "59"          
[7] "43"

That doesn't work. What will? And what's up with the [[1]]? I was under the impression that vectors are one-dimensional and that their elements are accessed like myVector[1], myVector[2], etc. Thanks for the help.

No correct solution

OTHER TIPS

strsplit returns a list with one component per input string. In this case it is a list of length 1, but if you called readLines on the whole file and then called strsplit on the result, the list would have as many components as the file has lines.
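To make the "one component per line" behaviour concrete, here is a minimal sketch. The temporary file and its contents are made up for illustration, and the names allLines and wordList are not from the question.

```r
# Hypothetical sample file, created here so the sketch is self-contained
infile <- tempfile(fileext = ".txt")
writeLines(c("u rtu jgiyu", "we went west", "43 59"), infile)

# Read the whole file at once: a character vector, one element per line
allLines <- readLines(infile, warn = FALSE)

# strsplit() is vectorised over its input and returns a list
wordList <- strsplit(allLines, " ", fixed = TRUE)

class(wordList)    # "list"
length(wordList)   # 3, the same as length(allLines)
wordList[[2]]      # the words of the second line: "we" "went" "west"

unlink(infile)     # remove the temporary file
```

With n = 1 in readLines, as in the question, the same call simply yields a list with a single component.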

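The [[1]] in the printed output is the label of the first (and only) component of that list, and myVector[1] returns a sub-list of length 1 rather than the words themselves, which is why it prints the same thing. For the way you're using it, you need to extract the first component of the list with [[1]] and then index into that character vector: myVector[[1]][1] gives "u" and myVector[[1]][2] gives "rtu". In this case, unlist(myVector)[1] and unlist(myVector)[2] would also work.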
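A short sketch of the indexing described above, applied to the original goal of finding words that start with "w". The sample line is made up, and words is an illustrative name. Note that passing the list itself to grep, as in the question, would not search the individual words: grep coerces a non-character argument with as.character, which turns the one-component list into a single deparsed string.

```r
# Hypothetical line of input, standing in for one line read from the file
oneLine <- "u rtu word jgiyu ee4 wall 43"

myVector <- strsplit(oneLine, " ", fixed = TRUE)

myVector[1]        # still a list of length 1
myVector[[1]]      # the character vector of words
myVector[[1]][2]   # "rtu"

# Extract the character vector first, then run the regex on it
words <- myVector[[1]]
res <- grep("^w", words, value = TRUE)
res                # "word" "wall"
```

Because each line read with n = 1 produces a list of length 1, unlist(myVector) would give the same words vector here.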

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow