Your data looks like JSON (apart from the single quotes), so I would use a JSON parser:
s <- "{'#JJ': 121, '#NN': 938, '#DT': 184, '#VB': 338, '#RB': 52}"
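parse.one <- function(s) {
  # JSON requires double quotes, so swap the single quotes before parsing
  v <- rjson::fromJSON(gsub("'", '"', s))
  # Strip the leading "#" from each key and flatten the values into a vector
  data.frame(id = gsub("#", "", names(v)),
             value = unlist(v, use.names = FALSE))
}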
parse.one(s)
# id value
# 1 JJ 121
# 2 NN 938
# 3 DT 184
# 4 VB 338
# 5 RB 52
For the second part of the question, I would pass a slightly modified version of the parse.one function through lapply, then let plyr's rbind.fill function align the pieces together while filling missing values with NA:
# stringsAsFactors = FALSE keeps m as character regardless of R version
df <- data.frame(m = c(
  "{'#JJ': 121, '#NN': 938, '#DT': 184, '#VB': 338, '#RB': 52}",
  "{'#NN': 168, '#DT': 59, '#VB': 71, '#RB': 5, '#JJ': 35}",
  "{'#JJ': 18, '#NN': 100, '#DT': 23, '#VB': 52, '#RB': 11}",
  "{'#JJ': 12, '#VB': 5}"
), stringsAsFactors = FALSE)
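parse.one <- function(s) {
  # JSON requires double quotes, so swap the single quotes before parsing
  y <- rjson::fromJSON(gsub("'", '"', s))
  # Drop the leading "#" so the keys become clean column names
  names(y) <- gsub("#", "", names(y))
  # One-row data frame with one column per key
  as.data.frame(y)
}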
library(plyr)
rbind.fill(lapply(df$m, parse.one))
# JJ NN DT VB RB
# 1 121 938 184 338 52
# 2 35 168 59 71 5
# 3 18 100 23 52 11
# 4 12 NA NA 5 NA
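If you would rather not depend on plyr, the alignment step can be approximated in base R. The following is only a rough sketch of the idea, not how rbind.fill is actually implemented; `pieces`, `all.cols` and `fill.one` are hypothetical names, and the data frames in `pieces` are hand-written stand-ins for what parse.one returns for each string.

```r
# Hypothetical stand-ins for the one-row data frames produced by parse.one
pieces <- list(
  data.frame(JJ = 121, NN = 938, DT = 184, VB = 338, RB = 52),
  data.frame(NN = 168, DT = 59, VB = 71, RB = 5, JJ = 35),
  data.frame(JJ = 12, VB = 5)
)

# Union of all column names, in order of first appearance
all.cols <- unique(unlist(lapply(pieces, names)))

# Add each missing column as NA, then reorder so every piece has the same layout
fill.one <- function(d) {
  for (nm in setdiff(all.cols, names(d))) {
    d[[nm]] <- NA
  }
  d[all.cols]
}

# With identical columns everywhere, a plain rbind can stack the pieces
result <- do.call(rbind, lapply(pieces, fill.one))
print(result)
```

Here the third piece only has JJ and VB, so its NN, DT and RB cells come out as NA, which mirrors what rbind.fill does for the short row above.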