ff package in R: how to move data from one drive to another, and change filenames

StackOverflow https://stackoverflow.com/questions/17464731

  •  02-06-2022

Question

I am working intensively with the amazing ff and ffbase packages. For technical reasons, my R session has to work on my C: drive. When I am done, I move the generated files to my P: drive (using cut/paste in Windows, NOT using ff).

The problem is that when I load the ffdf object:

load.ffdf("data") 

I get the error:

Error: file.access(filename, 0) == 0 is not TRUE

That is expected, because nobody told the ffdf object that its files were moved. But trying:

filename(data$x) <- "path/data_ff/x.ff"

or

pattern(data) <- "./data_ff/"

does not help, giving the error:

Error in `filename<-.ff`(`*tmp*`, value = filename) : 
ff file rename from 'C:/DATA/data_ff/id.ff' to 'P:/DATA_C/data_ff/e84282d4fb8.ff' failed. 

Is there any way to change the paths stored in the ffdf object so that they point to the files' new location? Thank you!

Solution

If you want to 'correct' your filenames afterwards you can use:

physical(x)$filename <- "newfilename"

For example:

> a <- ff(1:20, vmode="integer", filename="./a.ff")
> saveRDS(a, "a.RDS")
> rm(a)
> file.rename("./a.ff", "./b.ff")
[1] TRUE
> b <- readRDS("a.RDS")
> b
ff (deleted) integer length=20 (20)
> physical(b)$filename <- "./b.ff"
> b[]
opening ff ./b.ff
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20

Using filename(a) <- "./b.ff" in the first session, before saving, would of course have been easier, since that moves the file and updates the object in one step. You could also have a look at the save.ffdf and corresponding load.ffdf functions in the ffbase package, which make this even simpler.
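As a rough sketch of the save.ffdf / load.ffdf route: the directory names below are hypothetical stand-ins for the C: and P: locations, and the example ffdf is made up. It relies on the documented default relativepath = TRUE of save.ffdf, which stores the file paths relative to the save directory, so that the directory can be moved as a whole.

```r
library(ff)
library(ffbase)

# Hypothetical locations; substitute your own old and new directories.
old_dir <- file.path(tempdir(), "data_ff_old")
new_dir <- file.path(tempdir(), "data_ff_new")

# A small example ffdf (made-up data).
data <- as.ffdf(data.frame(id = 1:20, x = rnorm(20)))

# Moves the ff files into old_dir and stores the R metadata next to them.
save.ffdf(data, dir = old_dir)

# Release the file handles before touching the files from outside ff.
close(data)
rm(data)

# Move the whole directory without ff (stands in for cut/paste in Windows).
file.rename(old_dir, new_dir)

# Restores an object named `data` whose columns point into new_dir.
load.ffdf(dir = new_dir)
data[1:5, ]
```

Note that load.ffdf takes the directory the files were saved in, not the name of the object.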

Addition

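To redirect the filenames of all columns in an ffdf to a new directory you can use the following function:

redir <- function(ff, newdir) {
  # Loop over the ff vectors behind the columns of the ffdf passed in
  for (x in physical(ff)) {
    # Keep the file name, replace only the directory part
    fn <- basename(filename(x))
    physical(x)$filename <- file.path(newdir, fn)
  }
  return (ff)
}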
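A possible way to use this helper, as a sketch only: the directories and the example ffdf are hypothetical, and the helper is repeated here (looping over its own argument) so that the block runs on its own. It assumes, as the function above does, that updating the physical attributes of a column inside the loop also updates the ffdf, because ff shares physical attributes between copies of an object.

```r
library(ff)

# The helper from this answer, looping over its own argument.
redir <- function(ff, newdir) {
  for (x in physical(ff)) {
    fn <- basename(filename(x))
    physical(x)$filename <- file.path(newdir, fn)
  }
  ff
}

# Hypothetical old and new locations of the ff files.
old_dir <- file.path(tempdir(), "redir_old")
new_dir <- file.path(tempdir(), "redir_new")
dir.create(old_dir)

# Example ffdf with explicitly named backing files (made-up data).
data <- ffdf(
  id = ff(1:20, vmode = "integer", filename = file.path(old_dir, "id.ff")),
  x  = ff(as.double(20:1), vmode = "double", filename = file.path(old_dir, "x.ff"))
)

# Close the files, store the R-side object, and drop it from the session.
rds <- file.path(tempdir(), "data.RDS")
close(data)
saveRDS(data, rds)
rm(data)

# Move the files without telling ff.
file.rename(old_dir, new_dir)

# Reload the object and point every column at the new directory.
data <- readRDS(rds)
data <- redir(data, new_dir)
data$id[]
```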

Other tips

You can also use clone() from ff, which copies the data into a new file at the location given by pattern:

R> foo <- ff(1:20, vmode = "integer")
R> foo
ff (open) integer length=20 (20)
 [1]  [2]  [3]  [4]  [5]  [6]  [7]  [8]      [13] [14] [15] [16] [17] [18] [19]
   1    2    3    4    5    6    7    8    :   13   14   15   16   17   18   19
[20]
  20
R> physical(foo)$filename
[1] "/vol/fftmp/ff69be3e90e728.ff"
R> bar <- clone(foo, pattern = "~/")
R> bar
ff (open) integer length=20 (20)
 [1]  [2]  [3]  [4]  [5]  [6]  [7]  [8]      [13] [14] [15] [16] [17] [18] [19]
   1    2    3    4    5    6    7    8    :   13   14   15   16   17   18   19
[20]
  20
R> physical(bar)$filename
[1] "/home/ubuntu/69be5ec0cf98.ff"

From what I understand from briefly skimming the code of save.ffdf and load.ffdf, those functions do this for you when you save/load.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow