Question

I have a huge file that looks like this:

7

bla1
blala
blabla
blab
blals
blable
bla

more here..

The first number tells how many values follow. The thing is, I just want to jump directly to line 11 (the text "more here..") without having to read all those values first. In my case there are a great many values, so it has to be efficient.

What would you recommend?


Solution

You can make something file-like that will skip past the first N lines:

SkipFile.open("/tmp/frarees") do |ln|
  puts ln                                   # "more here.." and so on
end

puts SkipFile.new("/tmp/frarees").readline  # "more here.."

Like so:

class SkipFile
  def self.open(fn, &block)
    sf = SkipFile.new(fn)
    return sf unless block
    sf.each(&block)
  end

  def initialize(fn)
    @f = File.open(fn)
    skip = @f.readline.to_i     # the first line holds N, the number of lines to skip
    skip.times { @f.readline }  # skip them now (could be done lazily); assumes no blank lines around the values
  end

  def each(&block)
    @f.each(&block)
  end

  def readline
    @f.readline
  end
end

This is easy if you only want to iterate forward through the lines of a file. It becomes arduous, however, if you want to mimic the File or IO interface exactly (but see Ruby's delegate library), and especially if you want to support rewinding to the fake start of the file.
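
For what it's worth, here is a rough sketch of the delegate route, with rewinding to the fake start. The class name RewindableSkipFile and the demo file contents are made up for illustration, and it assumes the N values directly follow the count line. SimpleDelegator forwards every method the wrapper does not define to the underlying File, so only rewind needs overriding:

```ruby
require 'delegate'
require 'tempfile'

# Hypothetical wrapper: behaves like the underlying File, except that
# "the start" is the first line after the N counted lines.
class RewindableSkipFile < SimpleDelegator
  def initialize(fn)
    file = ::File.open(fn)
    super(file)                    # forward everything else to the File
    skip = file.readline.to_i      # the first line holds the count N
    skip.times { file.readline }   # skip the N values
    @start = file.pos              # byte offset of the fake start
  end

  # Go back to the fake start rather than to byte 0.
  def rewind
    __getobj__.seek(@start)
    0
  end
end

# Demo on a throwaway file (assumed layout: count line, then the values).
demo = Tempfile.new('skipfile')
demo.write("3\na\nb\nc\nmore here..\nlast\n")
demo.close

sf = RewindableSkipFile.new(demo.path)
first_pass = sf.readline    # first line after the skipped values
sf.readline                 # consume one more line
sf.rewind
second_pass = sf.readline   # the same line as first_pass again
sf.close
```

This still reads the skipped lines once when the file is opened; what it buys you is the rest of the File interface for free. Other position-related methods (pos, lineno and so on) still report values relative to the real file, so a faithful imitation would need more overrides.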

OTHER TIPS

You could probably use File#seek to jump to an arbitrary position in the file.

The problem with that approach is that seek moves to a byte offset, not a line offset, so you cannot ask it for "line 11" directly. If the file stored, at its start, the byte offset at which the list ends, you could seek straight to that position.
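
To make the byte-offset idea concrete, here is a sketch under one big assumption: that you control how the file is written and can put the offset in the first line. The helper names write_with_offset and read_after_values, and the zero-padded fixed-width header, are inventions for this example and not part of the asker's format:

```ruby
require 'tempfile'

# Hypothetical layout: line 1 is the byte offset at which the data after
# the values begins, zero-padded to a fixed width so that the header size
# is known before the values are written.
def write_with_offset(path, values, rest, width: 10)
  body   = values.map { |v| "#{v}\n" }.join
  offset = width + 1 + body.bytesize          # header digits + "\n" + values
  File.open(path, 'wb') do |f|
    f.write(format("%0#{width}d\n", offset))
    f.write(body)
    f.write(rest)
  end
end

def read_after_values(path)
  File.open(path, 'rb') do |f|
    offset = f.readline.to_i   # only the header line is read
    f.seek(offset)             # jump over the values without scanning them
    f.readline
  end
end

demo = Tempfile.new('offset')
demo.close
write_with_offset(demo.path, %w[bla1 blala blabla], "more here..\n")
line = read_after_values(demo.path)
```

With this layout the cost of reaching the data no longer depends on how many values precede it. If you cannot change the file format, this does not apply and you are back to reading line by line.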

Here's an elegant way to do it, though probably not a very efficient one, since it loads the whole file into memory at once.

File.readlines(file_path)[10..-1] # indexing starts from 0
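
If memory is the concern, a streaming variant of the same idea might look like the sketch below. It uses File.foreach with a lazy enumerator so that lines are read one at a time and the first ten are discarded as they go by. The twelve-line demo file is made up for illustration, and the skipped lines are still read from disk, just never held in memory together:

```ruby
require 'tempfile'

# Throwaway demo file with twelve numbered lines.
demo = Tempfile.new('lines')
demo.write((1..12).map { |i| "line #{i}\n" }.join)
demo.close

# Without a block, File.foreach returns an enumerator; `lazy` keeps
# `drop` from building an intermediate array of every remaining line.
rest = File.foreach(demo.path).lazy.drop(10)

eleventh  = rest.first                # stops reading as soon as line 11 is found
remaining = rest.map(&:chomp).to_a    # re-reads the file: lines 11 and 12
```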

I don't think you're going to get any more efficient than this, since you still have to read the bytes in the file to figure out where each "line" ends.

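File.open('./data') do |f|                    # the block form closes the file afterwards
  # Skip the N values plus the 2 blank lines around them in the sample file.
  (f.readline.to_i + 2).times { f.readline }
  p f.readline                                # "more here..\n"
end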
Licensed under: CC-BY-SA with attribution