ff package in R: how to move data from one drive to another, and change filenames

StackOverflow https://stackoverflow.com/questions/17464731

  •  02-06-2022

Question

I am working intensively with the amazing ff and ffbase packages. For technical reasons, my R session has to work on my C: drive. When I am done, I move the generated files to my P: drive (using cut/paste in Windows, NOT using ff).

The problem is that when I load the ffdf object:

load.ffdf("data") 

I get the error:

Error: file.access(filename, 0) == 0 is not TRUE

This is expected, because nobody told the ffdf object that its files were moved. But trying:

filename(data$x) <- "path/data_ff/x.ff"

or

pattern(data) <- "./data_ff/"

does not help, giving the error:

Error in `filename<-.ff`(`*tmp*`, value = filename) : 
ff file rename from 'C:/DATA/data_ff/id.ff' to 'P:/DATA_C/data_ff/e84282d4fb8.ff' failed. 

Is there any way to update the paths stored in the ffdf object so that they point to the files' new location? Thank you!


Solution

If you want to correct the filenames after the files have already been moved, you can use:

physical(x)$filename <- "newfilename"

For example:

> a <- ff(1:20, vmode="integer", filename="./a.ff")
> saveRDS(a, "a.RDS")
> rm(a)
> file.rename("./a.ff", "./b.ff")
[1] TRUE
> b <- readRDS("a.RDS")
> b
ff (deleted) integer length=20 (20)
> physical(b)$filename <- "./b.ff"
> b[]
opening ff ./b.ff
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20

Assigning the new name with filename(a) <- "./b.ff" in the first session would of course have been easier, because that moves the file and updates the ff object in one step. You could also have a look at the save.ffdf and corresponding load.ffdf functions in the ffbase package, which make this even simpler.
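As a minimal sketch of that first-session approach: the temporary directory and the src/dst names below are only illustrative assumptions, and the snippet assumes the ff package is installed. The move is done through ff itself, so the object never loses track of its file.

```r
library(ff)

# Illustrative paths in a temporary directory (assumed, not from the question).
src <- file.path(tempdir(), "a.ff")
dst <- file.path(tempdir(), "b.ff")

# Create an ff vector backed by the file at 'src'.
a <- ff(1:20, vmode = "integer", filename = src)

# Assigning a new filename asks ff to move the backing file
# and to record the new location in the object.
filename(a) <- dst

# The object keeps working, now reading from the new file.
first_values <- a[1:3]

moved_ok <- !file.exists(src) && file.exists(dst)

# Clean up the backing file created for this example.
delete(a)
rm(a)
```

The same assignment should work when src and dst are on different drives, as long as the file is still at the old location when filename<- is called.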

Addition

To update the filenames of all columns in an ffdf, you can use the following function:

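redir <- function(ff, newdir) {
  # physical(ff) is the list of ff vectors that hold the ffdf's columns
  for (x in physical(ff)) {
    # keep the file's name, replace only its directory
    fn <- basename(filename(x))
    physical(x)$filename <- file.path(newdir, fn)
  }
  return (ff)
}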
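The path arithmetic inside the loop can be checked on its own in base R, without ff. This is only a sketch: remap_paths is a hypothetical helper name, and the paths are the ones from the error message in the question.

```r
# Hypothetical helper: keep each file's name, swap only its directory.
remap_paths <- function(old_paths, newdir) {
  file.path(newdir, basename(old_paths))
}

old <- c("C:/DATA/data_ff/id.ff", "C:/DATA/data_ff/x.ff")
new <- remap_paths(old, "P:/DATA_C/data_ff")
print(new)
# [1] "P:/DATA_C/data_ff/id.ff" "P:/DATA_C/data_ff/x.ff"
```

Because only the directory part changes, this relies on the moved files keeping their original names.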

OTHER TIPS

You can also use clone() from the ff package, which copies the data into a new file at the location given by pattern:

R> foo <- ff(1:20, vmode = "integer")
R> foo
ff (open) integer length=20 (20)
 [1]  [2]  [3]  [4]  [5]  [6]  [7]  [8]      [13] [14] [15] [16] [17] [18] [19]
   1    2    3    4    5    6    7    8    :   13   14   15   16   17   18   19
[20]
  20
R> physical(foo)$filename
[1] "/vol/fftmp/ff69be3e90e728.ff"
R> bar <- clone(foo, pattern = "~/")
R> bar
ff (open) integer length=20 (20)
 [1]  [2]  [3]  [4]  [5]  [6]  [7]  [8]      [13] [14] [15] [16] [17] [18] [19]
   1    2    3    4    5    6    7    8    :   13   14   15   16   17   18   19
[20]
  20
R> physical(bar)$filename
[1] "/home/ubuntu/69be5ec0cf98.ff"

From briefly skimming the code of save.ffdf and load.ffdf, I understand that those functions do this for you when you save/load.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow