When I read a large text file (672MB on disk) into R, system memory usage jumped from 0.98 GB to 3.6 GB (I'm on a desktop with 4 GB of RAM). In other words, the data occupies several times the file's size once in memory, and afterwards I can't do any computation because I've run out of memory. Is that normal? The code I used:

    a <- read.table(file.choose(), header = TRUE, colClasses = "integer",
                    nrows = 16777777, comment.char = "", sep = "\t")

The file contains 167772XX lines.

gc() output before and after the read (screenshot of the gc() tables, not reproduced here).

I'm not sure what this means.
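For reference, one minimal way to watch where the memory goes during such a read (a sketch only; "data.tsv" is a placeholder path, not the asker's actual file):

    gc(reset = TRUE)                       # reset the "max used" counters before the read
    a <- read.table("data.tsv", header = TRUE, colClasses = "integer",
                    comment.char = "", sep = "\t")
    print(object.size(a), units = "MB")    # size of the resulting data frame
    gc()                                   # the "max used" (Mb) column shows the peak hit while parsing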


Solution

Your text file is 672MB. Assuming all your integers are one digit long, it is perfectly reasonable that your R object ends up around 2 × 672MB.

Each character in a text file is 1 byte. R stores integers in 4 bytes (see ?integer). That means your file contains ~336MB of "\t" and ~336MB of integers stored as 1-byte characters.
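You can verify the 4-bytes-per-integer figure directly in R (a small sketch, not part of the original answer):

    x <- integer(1e6)          # one million integers
    object.size(x)             # about 4,000,048 bytes: 4 bytes each plus a small header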

R reads those 1-byte characters, stores them as 4-byte integers and... 336*4 = 1344MB. The second row and second column of your gc output reads 1345.6, which equals 1344MB + the original 1.6MB.
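Spelled out with the figures assumed above:

    digit_mb   <- 336           # MB of digit characters in the 672MB file (the other half is tabs)
    integer_mb <- digit_mb * 4  # each 1-byte digit becomes a 4-byte integer
    integer_mb                  # 1344, matching the ~1345.6 Mb reported by gc()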
