So the story is: I have a 30 GB text file that needs to be read into R. It contains two columns and about 2 billion rows of integers. I don't want to load the whole thing in one go; sizeable chunks will suffice.

I've tried using read.table with arguments like nrows = 10000000 and skip set to some stupidly large number.
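
Roughly what I'm running looks like this (the file name and both numbers are placeholders, not my real values):

    ## illustrative sketch only; "big_file.txt" and the offsets are placeholders
    chunk <- read.table("big_file.txt", nrows = 10000000, skip = 1500000000)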

But I get the following error once I'm far through the file:

    Error in readLines(file, skip):
        cannot allocate vector of length 1800000000

Please help me get at the data and thanks in advance!

Solution

It seems to me that you need to split the text file into manageable chunks before trying to process them. The Unix split command should do the trick, but I don't know whether you're on a platform where that command exists.
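
A minimal sketch of that approach, assuming split is available and using placeholder names (big_file.txt, the chunk_ prefix, and the 100-million-line chunk size are all illustrative):

    split -l 100000000 big_file.txt chunk_

That writes files named chunk_aa, chunk_ab, and so on, each small enough for a single read.table call. Reading them back in R might then look like this (the two-integer-column assumption comes from the question):

    ## hypothetical loop over the chunk files produced by split above
    for (f in Sys.glob("chunk_*")) {
        chunk <- read.table(f, colClasses = c("integer", "integer"))
        ## ... process `chunk` here before moving on to the next file ...
    }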
