Question
I have the following vector:
my.vector = c("4M1D5M15I1D10M", "3M", "4M2I3D")
And I'd like to transform it into the following vector:
my.result = c("21N", "3N", "7N")
The logic for these results is as follows. For "4M1D5M15I1D10M", I added all the numbers except the ones that precede an "I" character, i.e., 4+1+5+1+10=21 (I did not add 15 because it precedes an "I"), and then pasted an "N" right after 21, giving "21N". Same for "3M": there is no "I" character, so it just becomes "3N". And same for the last one: 4+3=7 (I did not add 2 because it precedes an "I"), giving "7N".
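To make the rule concrete, here is a minimal sketch that applies it by hand to the first string. The variable names (counts, ops, kept) are only illustrative, and it assumes every number is followed by exactly one capital letter.

```r
# Split one string into its numbers and the letters that follow them.
x <- "4M1D5M15I1D10M"
counts <- as.numeric(regmatches(x, gregexpr("[0-9]+", x))[[1]])  # 4 1 5 15 1 10
ops    <- regmatches(x, gregexpr("[A-Z]", x))[[1]]               # M D M I D M

# Keep only the numbers whose following letter is not "I", then sum them.
kept   <- counts[ops != "I"]                                     # 4 1 5 1 10
result <- paste0(sum(kept), "N")
print(result)  # "21N"
```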
Note that my.vector is extremely large so I want to use the parallel capabilities of the HPC server using mclapply. Ideally I'd run something like this to get my result:
my.result = unlist(mclapply(my.vector, my.adding.function, mc.cores = ncores))
For defining my function I tried the following:
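my.adding.function <- function(x)
{
    # remove every "<number>I" token, keeping the pieces around it
    tmp = unlist(strsplit(x, "\\d+I"))
    # split the remaining pieces on the letters, leaving only the numbers
    tmp2 = unlist(strsplit(tmp, "M|D|S|N"))
    tmp3 = sum(as.numeric(tmp2))
    return(paste(tmp3, "N", sep=""))
}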
I'm not sure how efficient this function is, though...
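As a quick sanity check, the function can be run serially on the three example strings before parallelising anything. The sketch below repeats the definition only so that it runs on its own; the name check is hypothetical.

```r
my.adding.function <- function(x) {
  tmp  <- unlist(strsplit(x, "\\d+I"))        # drop "<number>I" tokens
  tmp2 <- unlist(strsplit(tmp, "M|D|S|N"))    # keep only the numbers
  paste(sum(as.numeric(tmp2)), "N", sep = "")
}

my.vector <- c("4M1D5M15I1D10M", "3M", "4M2I3D")

# vapply guarantees one character result per input element
check <- vapply(my.vector, my.adding.function, character(1), USE.NAMES = FALSE)
print(check)  # "21N" "3N" "7N"
```

Note that this function only splits on M, D, S and N, so a string containing any other letter (besides I) would produce NA in the sum.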
Solution
Here is one solution without mclapply; please check whether it is feasible:
L <- regmatches(my.vector, gregexpr("(\\d+)(?=[A-HJ-Z])", my.vector, perl=TRUE))
sapply(L, function(x)paste0(sum(as.numeric(x)),"N"))
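If the vector is large enough that parallelism is still wanted, one possible approach (a sketch, not something benchmarked here) is to hand each core one chunk of the vector and run the same two lines on it, rather than calling mclapply once per element. The wrapper name add.except.I, the chunking scheme and ncores = 2 are all assumptions for illustration.

```r
library(parallel)

# Hypothetical wrapper around the two lines above; works on a whole vector.
add.except.I <- function(v) {
  L <- regmatches(v, gregexpr("(\\d+)(?=[A-HJ-Z])", v, perl = TRUE))
  vapply(L, function(x) paste0(sum(as.numeric(x)), "N"), character(1))
}

my.vector <- c("4M1D5M15I1D10M", "3M", "4M2I3D")
ncores <- 2L  # assumption; on Windows mclapply only supports mc.cores = 1

# Assign each element to one of ncores contiguous chunks, preserving order.
chunk.id <- ceiling(seq_along(my.vector) * ncores / length(my.vector))
chunks   <- split(my.vector, chunk.id)

# Each worker processes one chunk; unlist restores a single character vector.
my.result <- unlist(mclapply(chunks, add.except.I, mc.cores = ncores),
                    use.names = FALSE)
print(my.result)  # "21N" "3N" "7N"
```

Chunking keeps the number of forked tasks equal to the number of cores, which usually matters more than the regex itself when each element is cheap to process.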