FF in R: No Applicable Method for 'recodeLevels'

https://stackoverflow.com/questions/22357396

13-06-2023

Question

I'm trying to load a huge (~5GB) .csv file into R using read.csv.ffdf from the ff package. The command is:

npi <- read.csv.ffdf(file="C:/Users/DSA/Dropbox/Team Shared Files/People/Ross/NPI_Parse/Zips/npi_full.csv", VERBOSE=TRUE, first.rows=10000,next.rows=100000,colClasses=NA)

The command runs for a while and then throws the following error: no applicable method for 'recodeLevels' applied to an object of class "c('double', 'numeric')". Some searching suggests I need to use the transFUN option, but I have no idea how to apply it. The data contains both text and numbers, and I think that may be causing the issue. I can upload a screenshot of the csv if it helps, but it takes ages to open in LibreOffice.

Anyone know any tricks?

Solution

From the documentation of read.csv.ffdf:

transFUN: NULL or a function that is called on each data.frame chunk after reading with FUN and before further processing (for filtering, transformations etc.)

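When column types are guessed per chunk (colClasses=NA), a column that is read as a factor in the first chunk can come back as numeric in a later chunk, or vice versa. The error appears when such a numeric chunk reaches the level recoding that ff applies to factor columns. Use transFUN to make sure the column is a factor in every chunk.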
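To illustrate how a column's type can differ between chunks, here is a minimal sketch in base R (it does not use ff). The temporary file, the column names id and code, and the helper fix_code are made up for the demonstration, and base read.csv with stringsAsFactors=TRUE stands in for the chunked reader.

```r
# Hypothetical demo file: 'code' holds text in the early rows
# and plain numbers in the later rows.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,code", "1,A30", "2,B40", "3,100.5", "4,200.5"), tmp)

# Chunk 1: the header plus the first two data rows.
chunk1 <- read.csv(tmp, nrows = 2, stringsAsFactors = TRUE)

# Chunk 2: skip the header and the first two data rows,
# then reuse the column names from chunk 1.
chunk2 <- read.csv(tmp, skip = 3, header = FALSE,
                   col.names = names(chunk1), stringsAsFactors = TRUE)

class(chunk1$code)  # "factor"
class(chunk2$code)  # "numeric"

# Coercing through character makes the column a factor in every chunk.
fix_code <- function(x) {
  x$code <- factor(as.character(x$code))
  x
}
class(fix_code(chunk2)$code)  # "factor"
```

The same coercion, wrapped in transFUN, is what the call to read.csv.ffdf needs: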

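npi <- read.csv.ffdf(
  file = "C:/Users/DSA/Dropbox/Team Shared Files/People/Ross/NPI_Parse/Zips/npi_full.csv",
  VERBOSE = TRUE, first.rows = 10000, next.rows = 100000,
  # transFUN is applied to every chunk before it is appended to the ffdf.
  transFUN = function(x) {
    # Replace yourcolumnwiththeerror with the name of the affected column.
    x$yourcolumnwiththeerror <- factor(as.character(x$yourcolumnwiththeerror))
    x
  })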
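If more than one column is affected, the coercion can be generalized. The following is only a sketch: make_factor_fun, the column names code and zip, and the sample data frame are hypothetical and not part of ff. It builds a function that could be passed as transFUN, and it is tried here on an ordinary data.frame so it runs without the original file.

```r
# Build a function that coerces the named columns to factor.
# Columns missing from a chunk are skipped silently.
make_factor_fun <- function(cols) {
  function(x) {
    for (col in intersect(cols, names(x))) {
      x[[col]] <- factor(as.character(x[[col]]))
    }
    x
  }
}

# Hypothetical column names; substitute the columns that trigger the error.
to_factor <- make_factor_fun(c("code", "zip"))

# Try it on a small data.frame that stands in for one chunk.
chunk <- data.frame(id = 1:3,
                    code = c(100.5, 200.5, 300),
                    zip = c(2139, 10001, 94105))
fixed <- to_factor(chunk)
sapply(fixed, class)
```

Under these assumptions, passing transFUN = to_factor to read.csv.ffdf would apply the same conversion to each chunk as it is read.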

Licensed under: CC-BY-SA with attribution
