The toJSONArray2 function in rCharts is slow mainly because it uses RJSONIO. I am in the process of updating it to a faster implementation based on rjson. Here is what I have so far. I have borrowed the idea of the orient argument from pandas.
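# Convert a data frame to a list (or a JSON string) in one of three layouts,
# selected by orient, as in pandas.
to_json = function(df, orient = "columns", json = TRUE){
  dl = as.list(df)
  dl = switch(orient,
    columns = dl,                                         # one named vector per column
    records = do.call('zip_vectors_', dl),                # one named list per row
    values = do.call('zip_vectors_', setNames(dl, NULL))  # one unnamed list per row
  )
  if (json){
    dl = rjson::toJSON(dl)
  }
  return(dl)
}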
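# Transpose column vectors into a list of rows; names = TRUE labels rows by index.
zip_vectors_ = function(..., names = FALSE){
  x = list(...)
  y = lapply(seq_along(x[[1]]), function(i) lapply(x, pluck_(i)))
  if (names) names(y) = seq_along(y)
  return(y)
}
# Return a function that extracts the given element from its argument.
pluck_ = function (element){
  function(x) x[[element]]
}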
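To make the three orient values concrete, here is a minimal base-R sketch that builds each layout by hand for a two-row data frame. The names df_small, cols, records and values are made up for this illustration, and the loop only mirrors what zip_vectors_ does rather than calling it.

```r
# A tiny data frame; stringsAsFactors = FALSE keeps y as character.
df_small <- data.frame(x = 1:2, y = c("a", "b"), stringsAsFactors = FALSE)

# orient = "columns": one named vector per column.
cols <- as.list(df_small)

# orient = "records": one named list per row.
records <- lapply(seq_len(nrow(df_small)), function(i) lapply(cols, `[[`, i))

# orient = "values": one unnamed list per row.
values <- lapply(records, unname)

str(cols)
str(records)
str(values)

# If rjson is installed, print the JSON for each layout.
if (requireNamespace("rjson", quietly = TRUE)) {
  cat(rjson::toJSON(cols), "\n")
  cat(rjson::toJSON(records), "\n")
  cat(rjson::toJSON(values), "\n")
}
```

The "values" layout drops the column names, which is why it yields the most compact JSON of the three.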
The example below shows that to_json is about 20x faster than toJSONArray2, and most of that gain comes from using rjson rather than RJSONIO.
N = 10^3
df <- data.frame(
  x = rpois(N, 10),
  y = sample(LETTERS, N, replace = TRUE),
  z = rpois(N, 5)
)
library(rCharts)         # provides toJSONArray2
library(ggplot2)         # provides the autoplot generic
library(microbenchmark)
autoplot(microbenchmark(
  to_json(df, orient = "values", json = TRUE),
  toJSONArray2(df, names = FALSE),
  times = 5
))
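One way to check how much of the difference comes from the serializer itself is to time rjson and RJSONIO on the same pre-built list, leaving rCharts out of the picture. This is only a rough sketch: payload is a name made up here, the timings depend on your machine and package versions, and the comparison runs only if both packages are installed.

```r
set.seed(1)
N <- 10^3
df <- data.frame(
  x = rpois(N, 10),
  y = sample(LETTERS, N, replace = TRUE),
  z = rpois(N, 5),
  stringsAsFactors = FALSE
)

# Build the row-wise ("values") list once so only serialization is timed.
cols <- unname(as.list(df))
payload <- lapply(seq_len(N), function(i) lapply(cols, `[[`, i))

# Time five serializations with each package, if both are available.
if (requireNamespace("rjson", quietly = TRUE) &&
    requireNamespace("RJSONIO", quietly = TRUE)) {
  print(system.time(for (k in 1:5) rjson::toJSON(payload)))
  print(system.time(for (k in 1:5) RJSONIO::toJSON(payload)))
}
```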
UPDATE: On reading your question more carefully, I realized that we could speed it up further by combining dplyr with to_json.
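library(dplyr)
# %>% replaces the %.% operator, which has been removed from dplyr;
# do() now takes an expression in which . stands for the current group.
dfl = df %>%
  group_by(z) %>%
  do(values = to_json(.[-3], orient = 'values', json = FALSE))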
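If you would rather not depend on dplyr, the same grouping can be sketched in base R with split() and lapply(). This is an alternative to the approach above, not what rCharts does internally; rows_as_values and df_small are hypothetical names used only for this illustration.

```r
df_small <- data.frame(
  x = c(1, 2, 3),
  y = c("a", "b", "c"),
  z = c(5, 5, 7),
  stringsAsFactors = FALSE
)

# Turn a data frame into a list of unnamed rows (the "values" layout).
rows_as_values <- function(d) {
  cols <- unname(as.list(d))
  lapply(seq_len(nrow(d)), function(i) lapply(cols, `[[`, i))
}

# Split the x and y columns by z, then convert each group to rows.
groups <- lapply(split(df_small[c("x", "y")], df_small$z), rows_as_values)
str(groups)
```

The result is a list keyed by the values of z, which can be passed to rjson::toJSON as is.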