Question

I have a log file that contains a line of five asterisks (*****) in two places. The file can be big.

Log record
*****
Log record
Log record
*****
Log record

I would like to get everything between the two lines of five asterisks. I could read the file line by line, but perhaps there is a better solution, such as parsing it with regular expressions in Groovy?

Thank you.

Solution

You could also write a custom Reader that only returns the lines between the delimiters, like this:

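class DelimitedReader extends BufferedReader {
    String delimiterLine
    // Set once the closing delimiter (or end of file) has been reached
    private boolean finished = false

    DelimitedReader( String delimiterLine, Reader reader ) {
        super( reader )
        this.delimiterLine = delimiterLine
        scanUntilDelimiter()
    }

    // Skip everything up to and including the opening delimiter line
    private void scanUntilDelimiter() {
        String line = super.readLine()
        while( line != null && line != delimiterLine ) {
            line = super.readLine()
        }
    }

    // Return lines until the closing delimiter, then keep reporting end of stream
    @Override
    String readLine() {
        if( finished ) return null
        String line = super.readLine()
        if( line == null || line == delimiterLine ) {
            finished = true
            return null
        }
        line
    }
}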

You can then iterate over the lines between the delimiters like this:

new File( '/tmp/test.txt' ).withReader { r ->
    new DelimitedReader( '*****', r ).eachLine { line ->
        println line
    }
}

This avoids loading the whole file into a single (potentially huge) string.
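
The same streaming idea can also be sketched without a subclass, by tracking whether the opening delimiter has been seen yet. This is only a rough sketch: linesBetween is a hypothetical helper name, the sample text is made up, and it assumes the delimiter always sits on a line of its own. It collects the matching lines into a list, so only the delimited section is held in memory, not the whole file.

```groovy
// Hypothetical helper: collects the lines between the first two delimiter lines.
// Assumes the delimiter is a complete line by itself.
List<String> linesBetween( Reader reader, String delimiterLine ) {
    List<String> result = []
    boolean inside = false
    String line
    BufferedReader br = new BufferedReader( reader )
    while( ( line = br.readLine() ) != null ) {
        if( line == delimiterLine ) {
            if( inside ) break      // closing delimiter: stop reading
            inside = true           // opening delimiter: start collecting
        } else if( inside ) {
            result << line
        }
    }
    result
}

// Usage with an in-memory reader; for a real file, pass the reader from File.withReader
def sample = 'Log record 1\n*****\nLog record 2\nLog record 3\n*****\nLog record 4\n'
def found = linesBetween( new StringReader( sample ), '*****' )
found.each { println it }
```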

OTHER TIPS

Try this regex:

(?s)(?<=[*]{5}).+(?=[*]{5})

Demo

http://groovyconsole.appspot.com/script/2405001
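
In case the demo link is unavailable, here is a minimal sketch of how the pattern above might be applied in Groovy. The sample text is made up; for a real file the text would typically come from something like new File( path ).text, which does load the entire file into memory.

```groovy
// Sample text standing in for the log file contents (hypothetical data)
def logText = 'Log record 1\n*****\nLog record 2\nLog record 3\n*****\nLog record 4\n'

// (?s) lets . match newlines; the lookarounds keep the delimiters out of the match
def matcher = logText =~ /(?s)(?<=[*]{5}).+(?=[*]{5})/

// The match still includes the newlines next to the delimiters, so trim them
def between = matcher.find() ? matcher.group().trim() : null
println between
```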

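This regex matches everything between the first ***** and the last one (with only two delimiters, as in the question, that is the section between them; use the lazy [\s\S]*? to stop at the next one when there are more):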

(?<=\*{5})[\s\S]*(?=\*{5})
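
The quantifier in this pattern is greedy, which only matters when the text contains more than two delimiters. The sketch below, using made-up sample text with three delimiter lines, compares it with a lazy variant ([\s\S]*?) that stops at the next delimiter. The lazy variant is an addition for illustration, not part of the pattern above.

```groovy
// Hypothetical sample with three delimiter lines
def multiText = 'a\n*****\nb\n*****\nc\n*****\nd\n'

// Greedy: runs from the first delimiter to the last one
def greedy = ( multiText =~ /(?<=\*{5})[\s\S]*(?=\*{5})/ )
// Lazy: stops at the first delimiter that follows
def lazy = ( multiText =~ /(?<=\*{5})[\s\S]*?(?=\*{5})/ )

def greedyResult = greedy.find() ? greedy.group().trim() : null
def lazyResult = lazy.find() ? lazy.group().trim() : null

println "greedy: ${greedyResult.inspect()}"
println "lazy:   ${lazyResult.inspect()}"
```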
Licensed under: CC-BY-SA with attribution