read.xlsx takes a very long time and tons of memory
Problem
I'm trying to load an .xlsx file into R that has one sheet and is about 31 MB in size.
I run the following:
options( java.parameters = "-Xmx6g" )
require(xlsx)
yt = read.xlsx("big_spreadsheet.xlsx",1)
and I get nothing. My system monitor shows that the allotted memory slowly fills up and then just stays full. I haven't let it run for hours, but ten minutes should be sufficient, especially when I could have loaded the file into Numbers (I'm on Mavericks) and saved it as a CSV in that time.
Yes, I have much more than 6 GB of memory. 2 GB doesn't seem to be enough and yields the error:
Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, :
java.lang.OutOfMemoryError: Java heap space
I did, however, make the mistake of letting the rJava package install its own version of Java. I downloaded JDK 8 after the fact, but I have no idea how to check whether it is being used.
So why does it take 6 GB of RAM to (fail to) load a 31 MB file? Can I fix this somehow?
Solution
I never got this to work. Lately I've been using the readxl package to read Excel spreadsheets; it has no Java dependency and seems to work just fine.
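For reference, a minimal sketch of what the readxl route might look like for the file in the question. The file name comes from the question; the fallback to the sample workbook bundled with readxl is only there so the snippet runs without that file, and is an assumption of this example rather than part of the original workflow.

```r
library(readxl)

# File name from the question. If it is not present, fall back to a sample
# workbook that ships with readxl (illustration only).
path <- "big_spreadsheet.xlsx"
if (!file.exists(path)) path <- readxl_example("datasets.xlsx")

# Read the first sheet; the result is a tibble (a data frame subclass).
yt <- read_excel(path, sheet = 1)
dim(yt)
```

Because readxl does not start a JVM, there is no java.parameters option to set beforehand.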
Other tips
Yes, use the readxl package. The xlsx package requires Java, which takes a long time to load, and it is quite likely to return an error even when reading a small file (as small as 2 MB).
It's very simple to use; just write:
read_excel("path")
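On the side question of how to check which Java installation rJava is actually using: one possible approach is to start the JVM from R and ask it for its own system properties. This is a sketch that assumes rJava is installed and can find a JVM; the variable names are illustrative.

```r
# The heap option must be set before the JVM starts, i.e. before rJava
# (or xlsx, which loads it) is initialised in this R session.
options(java.parameters = "-Xmx6g")
library(rJava)
.jinit()  # start the JVM if it is not already running

# Ask the running JVM which version and installation directory it uses.
java_version <- .jcall("java/lang/System", "S", "getProperty", "java.version")
java_home    <- .jcall("java/lang/System", "S", "getProperty", "java.home")

# Maximum heap the JVM will use, in MB, to confirm -Xmx took effect.
runtime     <- .jcall("java/lang/Runtime", "Ljava/lang/Runtime;", "getRuntime")
max_heap_mb <- .jcall(runtime, "J", "maxMemory") / 1024^2

cat("Java version:", java_version, "\n")
cat("Java home:   ", java_home, "\n")
cat("Max heap MB: ", round(max_heap_mb), "\n")
```

If the reported version or home directory is not the JDK 8 install, rJava is still bound to the other Java. If the heap figure is far below 6 GB, the JVM was probably already running when the option was set.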