I haven't heard of something like that, but here's an idea for estimating csv sizes, at least.
csvSizeEst <- function(obj, frac = 0.01) {
    ## Write a fraction of the rows to a temporary file,
    ## then scale the resulting file size back up by 1/frac
    tf <- tempfile()
    on.exit(unlink(tf))
    n <- ceiling(nrow(obj) * frac)
    write.csv(obj[seq_len(n), ], file = tf)
    file.info(tf)$size / frac
}
x <- data.frame(replicate(5, rnorm(500)))
## Estimated file size, based on a 1% sample (the default sample size)
csvSizeEst(x)
# [1] 50700
## Set frac to 1 (i.e. write out the whole data.frame) to get the actual file size
csvSizeEst(x, frac=1)
# [1] 48904
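One caveat: sampling the *first* `n` rows can bias the estimate if row widths drift through the file (say, a character column whose entries get longer further down). Here is a sketch of a variant that samples rows at random instead; the name `csvSizeEstRand` is my own, not from any package:

```r
## Variant of csvSizeEst that samples rows at random rather than taking
## the head, which should be more robust when row widths vary by position.
csvSizeEstRand <- function(obj, frac = 0.01) {
    tf <- tempfile()
    on.exit(unlink(tf))
    n <- ceiling(nrow(obj) * frac)
    idx <- sample(nrow(obj), n)      # random rows instead of seq_len(n)
    write.csv(obj[idx, ], file = tf)
    file.info(tf)$size / frac
}
```

For a uniformly distributed data.frame like the `x` above the two versions give essentially the same answer; the difference only matters for heterogeneous real-world data.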
Also, to get an order-of-magnitude sense of the observed relationship between a data.frame's size in R (as reported by object.size()) and its size when written out as a .csv file, try the following. (As a more or less representative sample, I here examine all of the data.frames shipped with the datasets package.)
oo <- ls("package:datasets")
## Keep just the objects that are data.frames
dfs <- oo[sapply(oo, function(X) is.data.frame(get(X)))]
## Ratio of actual on-disk .csv size to in-memory object.size for each
r <- sapply(dfs, function(X) {
    X <- get(X)
    csvSizeEst(X, 1) / as.numeric(object.size(X))
})
hist(r, breaks = 20, col = "lightgrey", xlim = c(0, 1.5),
     main = "Ratio of size-on-disk to object.size in R")