There are advantages to this if your data is large and you have to modify it by passing it through functions. When you pass data.frames or vectors to functions that modify them, R makes a copy of the data before changing it. You then return the modified data from the function and overwrite the old data to complete the modification step.
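As a minimal sketch of that return-and-overwrite pattern (the names set_first and d are hypothetical, chosen only for illustration):

```r
# Hypothetical helper: changes its local copy of the data.frame and returns it.
set_first <- function(df) {
  df[1, 1] <- 10   # modifies only the copy that lives inside the function
  df               # return the modified copy to the caller
}

d <- data.frame(a = 1:3)

ignored <- set_first(d)   # result not assigned back to d, so d is untouched
unchanged <- d$a[1]

d <- set_first(d)         # overwrite the original with the returned copy
changed <- d$a[1]
```

Only the second call changes d, because the result is assigned back over the original.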
If your data is large, copying it for each function call can add an undesirable amount of overhead. Environments provide a way around this, because functions handle them differently. If you pass an environment to a function and modify its contents, R operates directly on the environment without making a copy of it. So by putting your data in an environment and passing the environment to the function, instead of passing the data directly, you can avoid copying the large dataset.
# here I create a data.frame inside an environment and pass the environment
# to a function that modifies the data.
e <- new.env()
e$k <- data.frame(a=1:3)
f <- function(e) {e$k[1,1] <- 10}
f(e)
# you can see that the original data was changed.
e$k
#    a
# 1 10
# 2  2
# 3  3
# alternatively, if I pass just the data.frame, the manipulations do not affect the
# original data.
k <- data.frame(a=1:3)
f2 <- function(k) {k[1,1] <- 10}
f2(k)
k
#   a
# 1 1
# 2 2
# 3 3
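The same behavior shows up with plain assignment, not only with function calls. The following is a small sketch, assuming nothing beyond base R (the names e2, same_object, and seen_via_e are made up for this example): a second name bound to an environment refers to the same object, so a change made through one name is visible through the other.

```r
e <- new.env()
e$k <- data.frame(a = 1:3)

# Binding the environment to a second name does not copy it;
# e and e2 now refer to the same environment.
e2 <- e
e2$k[1, 1] <- 10

same_object <- identical(e, e2)  # environments are compared by reference
seen_via_e <- e$k$a[1]           # the change made through e2 is visible via e
```

This is what makes the function example above work: the function's argument is just another name for the caller's environment.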