Single-Pass File Scanning

https://stackoverflow.com/questions/18914013

29-06-2022

Question

In my file-scanning D program I'm implementing logic for finding all hits of a set of key strings, together with line and column context, similar to grep.

My current algorithm works by calling find until the end of the file. When a hit is found, I search backwards and forwards to detect the byte offsets of the beginning and end of the hit line. Then I search backwards again to count the newlines between the beginning of the file and my hit start offset. This is of course neither an efficient nor an elegant solution, but it currently works and has helped me understand how to operate on slices.

I now want to refactor this code to make use of some combination of state machines (monads) that only needs to go through the file once and that updates and operates on an array of the line starts found so far (size_t[]). What in std.algorithm should I base such a solution upon? The algorithm should output an array of tuples, where each tuple contains a hit slice, a bol/eol slice, and a line number.

Solution

It is much simpler and easier to just iterate over all lines and keep track of the current line number:

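import std.stdio;   // File, lines, writeln
import std.string;  // indexOf

// file is an open std.stdio.File and needle is a string (both assumed to be defined elsewhere)
foreach (ulong n, string line; lines(file))  // loop variable types given explicitly, as in the std.stdio docs for lines
{
    auto index = indexOf(line, needle);  // offset of the first hit in this line, or -1 if none
    if (index >= 0)
    {
        writeln(n, ", ", index);
    }
}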
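
The loop above reports only the first hit per line for a single needle. A minimal sketch of how the same line-by-line idea could be extended to several key strings, with the tuples the question asks for, might look like the following. The names Hit and findHits are hypothetical, and the sketch swaps the array of line starts for a simple running line counter, so it is one possible approach, not the only one.

```d
import std.stdio : writefln;
import std.string : indexOf;
import std.typecons : Tuple;

// Hypothetical result type: the matched slice, the line containing it,
// and a 1-based line number.
alias Hit = Tuple!(string, "hit", string, "line", size_t, "lineNumber");

// Collects every non-overlapping occurrence of every needle while
// visiting each line exactly once. `lines` is any foreach-able
// sequence of strings.
Hit[] findHits(Lines)(Lines lines, string[] needles)
{
    Hit[] hits;
    size_t lineNumber = 0;
    foreach (line; lines)
    {
        ++lineNumber;
        foreach (needle; needles)
        {
            if (needle.length == 0)
                continue; // skip empty needles, which would never advance the search
            size_t start = 0;
            while (true)
            {
                auto index = indexOf(line[start .. $], needle);
                if (index < 0)
                    break;
                auto pos = start + cast(size_t) index;
                // The hit is a slice of the line, so no text is copied here.
                hits ~= Hit(line[pos .. pos + needle.length], line, lineNumber);
                start = pos + needle.length;
            }
        }
    }
    return hits;
}

void main()
{
    // In-memory sample input, used here instead of a real file.
    auto sample = ["foo bar", "bar baz bar"];
    foreach (h; findHits(sample, ["bar", "baz"]))
    {
        // The column is the offset of the hit slice within its line.
        auto column = h.hit.ptr - h.line.ptr;
        writefln("%s:%s: %s", h.lineNumber, column, h.hit);
    }
}
```

To scan a real file, File("path").byLineCopy could be passed as the first argument instead of the sample array. byLineCopy is suggested here rather than byLine because byLine reuses its buffer between iterations, which would invalidate the stored slices. Note that the slices then refer to per-line copies, not to one buffer holding the whole file.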

Licensed under: CC-BY-SA with attribution
