Problem

So the story is: I have a 30 GB .txt file that needs to be read into R. It contains two columns and about 2 billion rows of integers! I don't want to load the whole thing in one go; sizeable chunks will suffice.

I've tried using read.table with arguments like nrows = 10000000 and skip set to some stupidly large number (a rough sketch of what I'm running is below).
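Here is roughly the call; the file name is made up, and the skip value is about where it falls over:

    chunk <- read.table("bigfile.txt",        # placeholder name for the 30 GB file
                        nrows = 10000000,     # pull 10 million rows per call
                        skip  = 1800000000,   # skip everything already processed
                        colClasses = c("integer", "integer"))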

But once I get far through the file, I hit the following error:

Error in readLines(file, skip):
    cannot allocate vector of length 1800000000

Please help me get at the data, and thanks in advance!


Solution

It seems to me that you may need to split the text file into manageable chunks before trying to process them. The Unix split command should do the trick, but I don't know whether you're on a platform where that command exists. Something like the sketch below should work.
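A minimal sketch, assuming a Unix-like system and that the file is called bigfile.txt (the name and the chunk size are placeholders; adjust to taste):

    ## Split the 30 GB file into pieces of 100 million lines each,
    ## named chunk_aa, chunk_ab, ... (run in the shell, or via system()):
    system("split -l 100000000 bigfile.txt chunk_")

    ## Then read and process each piece in turn, so only one
    ## chunk is ever in memory at a time:
    for (f in list.files(pattern = "^chunk_")) {
      chunk <- read.table(f, colClasses = c("integer", "integer"))
      ## ... process `chunk` here ...
      rm(chunk); gc()   # free the memory before the next chunk
    }

This sidesteps the huge skip= values entirely, which is what seems to be blowing up readLines in your attempt.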
