Question

I am trying to clean up log files so that I can categorize them in Splunk. This is the regex I have so far:

(?i)^(?:[^ ]* ){8}(?P<FIELDNAME>.+)((?=\d{8}_\d{8}_\d{10}\.)|(?=\d{8}\.?))

The two lookaheads at the end (shown in bold in the original post) need to be combined like an if/else. Everything before them already matches correctly.

I want the match to STOP just before either exactly 8 digits and a dot (dddddddd.) OR 8 digits, 8 digits and 10 digits joined by underscores and followed by a dot (8xd_8xd_10xd.).

My task is to get rid of all the unique numbers in the log file so that I can categorize it better.

Please help.


Solution

You could just make the previous .+ lazy:

(?i)^(?:[^ ]* ){8}(?P<FIELDNAME>.+?)((?=\d{8}_\d{8}_\d{10}\.)|(?=\d{8}\.))
                                  ^

Being greedy, .+ extends as far as possible, so the capture ends at the last position where a lookahead succeeds instead of the first. I removed the ? after the final \. since an optional dot would make the match stop as soon as there are any 8 digits ahead. Also, you can actually combine these lookaheads:

(?i)^(?:[^ ]* ){8}(?P<FIELDNAME>.+?)(?=\d{8}(?:_\d{8}_\d{10})?\.)
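To see the difference between the greedy and lazy versions, here is a minimal sketch in Python, whose re module accepts the same (?P<name>...) and lookahead syntax used above. The log lines, the 8-field prefix and the extract helper are made up for illustration and are not from the original post; Splunk itself uses PCRE, so treat this as a way to check the pattern rather than as Splunk configuration.

```python
import re

# Combined pattern from the answer: lazy capture, then one lookahead
# whose middle part (_8digits_10digits) is optional.
LAZY = re.compile(
    r"(?i)^(?:[^ ]* ){8}(?P<FIELDNAME>.+?)(?=\d{8}(?:_\d{8}_\d{10})?\.)"
)

# Greedy variant with the two separate lookaheads, kept only for comparison.
GREEDY = re.compile(
    r"(?i)^(?:[^ ]* ){8}(?P<FIELDNAME>.+)((?=\d{8}_\d{8}_\d{10}\.)|(?=\d{8}\.))"
)


def extract(pattern, line):
    """Return the FIELDNAME capture, or None if the line does not match."""
    match = pattern.search(line)
    return match.group("FIELDNAME") if match else None


# Hypothetical log lines: 8 space-separated fields, then the message.
PREFIX = "2013-05-01 10:00:00 host1 app INFO pid 1234 thread-1 "
short_line = PREFIX + "Processing file report_20130501.csv"
long_line = PREFIX + "Processing file batch_20130501_20130502_1234567890.log"

print(repr(extract(LAZY, short_line)))    # 'Processing file report_'
print(repr(extract(LAZY, long_line)))     # 'Processing file batch_'
print(repr(extract(GREEDY, long_line)))   # 'Processing file batch_20130501_20130502_12'
```

On the long line the greedy version runs on until the last spot where 8 digits are followed by a dot (34567890.), which is why part of the number ends up inside the capture. The lazy version stops at the first position where the lookahead succeeds.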
Licensed under: CC-BY-SA with attribution