How do I get Ruby to search for a pattern on the tail of a local file?

Question 1

Open the file with File.open (plain open also works) and use Enumerable#grep to search for the desired text with a regular expression, as in the following code:

def get_last_blah(filename)
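  # File.open avoids Kernel#open's special handling of names starting with "|";
  # \d+ captures numbers with more than one digit.
  File.open(filename) { |f| f.grep(/.*blah(\d+)/) { $1 }.last.to_i }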
end

puts get_last_blah('var/test/log')
# => 3

The method returns the number that follows the last "blah" in the file. It reads the entire file, but the result is the same as if it were done with tail.

If you want to use a proper tail, take a look at the File::Tail gem.
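
If pulling in a gem is not an option, a rough stdlib-only approximation is to read just the last few bytes and search those. The sketch below is not the File::Tail API; tail_bytes and last_blah_in_tail are hypothetical helper names, and the 4096-byte default window is an arbitrary assumption. It returns nil when the window contains no match, even if one exists earlier in the file.

```ruby
# Hypothetical helper (not part of File::Tail): return the last n bytes of a file.
def tail_bytes(filename, n)
  File.open(filename, "rb") do |f|
    n = [n, f.size].min          # never seek before the start of the file
    f.seek(-n, IO::SEEK_END)     # position n bytes before the end
    f.read(n)
  end
end

# Hypothetical helper: number following the last "blah" within the final n bytes.
def last_blah_in_tail(filename, n = 4096)
  found = tail_bytes(filename, n).scan(/blah(\d+)/).last
  found && found.first.to_i      # nil if nothing matched in the window
end
```

For a log whose lines are much shorter than the window, last_blah_in_tail('var/test/log') should give the same answer as the full scan while reading only the tail.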

Question 2

I presume you wish to avoid reading the entire file each time; rather, you want to start at the end and work backward until you find the last string of interest. Here's a way to do that.

Code

BLOCK_SIZE = 30
MAX_BLAH_NBR = 123

def doit(fname, blah_text)
  @f = File.new(fname)
  @blah_text = blah_text
  @chars_to_read = BLOCK_SIZE + @blah_text.size + MAX_BLAH_NBR.to_s.size
  ptr = @f.size
  block_size = BLOCK_SIZE
  loop do
    (@f.close; return nil) if ptr.zero?
    ptr -= block_size
    if ptr < 0
      block_size += ptr
      ptr = 0
    end
    blah_nbr = read_block(ptr)
    (@f.close; return blah_nbr.to_i) if blah_nbr
  end
end

def read_block(ptr)
  @f.seek(ptr)
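  # The m flag lets .* cross newlines, so the last match in the block wins
  # even when the block spans several lines.
  @f.read(@chars_to_read)[/.*#{@blah_text}(\d+)/m,1]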
end

Demo

Let's first write something interesting to a file.

MY_FILE = 'my_file.txt'

text =<<_
Now is the time
for all blah2 to
come to the aid of
their blah3, blah4 enemy or
perhaps do blagh5 something
else like wash the dishes.
_

File.write(MY_FILE, text)

Now run the program:

p doit(MY_FILE, "blah") #=> 4

We expected it to return 4 and it did.

Explanation

doit first instructs read_block to read up to 37 characters, beginning BLOCK_SIZE (30) characters from the end of the file. That's at the beginning of the string

"ng\nelse like wash the dishes.\n"

which is 30 characters long. (I'll explain the "37" in a moment.) read_block finds no text matching the regex (text such as "blah3"), so it returns nil.

As nil was returned, doit makes the same request of read_block, but this time starting BLOCK_SIZE characters closer to the beginning of the file. This time read_block reads the 37-character string:

"y or\nperhaps do blagh5 something\nelse"

but, again, nothing matches the regex, so it returns nil to doit. Notice that it reread the seven characters "ng\nelse" that it read previously. This overlap is necessary in case one 30-character block ends with "...bla" and the next one begins with "h3...". Hence the need to read more characters (here 37 = 30 + "blah".size + "123".size) than the block size.
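
To make the need for the overlap concrete, here is a small sketch that runs the same backward walk over an in-memory string rather than a file. scan_backward is a hypothetical helper written for this illustration, and the test string is made up so that "blah3" straddles the boundary at offset 30.

```ruby
# Hypothetical helper: walk backward through data in steps of block_size,
# reading read_size characters each time, and return the first captured number.
def scan_backward(data, block_size, read_size, pattern)
  ptr = data.size
  while ptr > 0
    ptr = [ptr - block_size, 0].max
    hit = data[ptr, read_size][pattern, 1]
    return hit if hit
  end
  nil
end

word    = "blah"
overlap = word.size + 123.to_s.size            # 4 + 3 = 7, as in the answer
data    = ("-" * 27) + "blah3" + ("-" * 28)    # "blah3" occupies offsets 27..31

p scan_backward(data, 30, 30, /#{word}(\d+)/)            # => nil (no overlap)
p scan_backward(data, 30, 30 + overlap, /#{word}(\d+)/)  # => "3" (with overlap)
```

Without the overlap, one read ends in "bla" and the other begins with "h3", so neither contains the whole token.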

read_block next reads the string:

"aid of\ntheir blah3, blah4 enemy or\npe"

and finds that "blah4" matches the regex (not "blah3", because the regex is being "greedy" with .*), so it returns "4" to doit, which converts that to the number 4, which it returns.
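
The effect of the greedy .* can be checked in isolation. This is only an illustration on one line of the demo text; the variable names are made up here.

```ruby
line = "their blah3, blah4 enemy or"

first_nbr = line[/blah(\d+)/, 1]     # leftmost match wins
last_nbr  = line[/.*blah(\d+)/, 1]   # greedy .* consumes as much as it can first

p first_nbr  # => "3"
p last_nbr   # => "4"
```

Because .* backtracks from the end of the line, the capture group ends up on the rightmost "blah" followed by digits.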

doit would return nil if the regex did not match any text in the file.