Question

I'm having some trouble with my regex.

I have some lines like this:

SomeText#"C:\\","Shadow Copy Components:\\","E:\\",""
SomeText#"D:\\"
SomeText#"E:\\","Shadow Copy Components:\\"
SomeText#"SET SNAP_ID=serv.a.x.com_1380312019","BACKUP H:\\ USING \\\\?\\GLOBALROOT\\Device\\HarddiskVolumeShadowCopy47\\ OPTIONS:ALT_PATH_PREFIX=c:\\VERITAS\\NetBackup\\temp\\_vrts_frzn_img_3200\"
SomeText#"SET SNAP_ID=serv.a.x.com_1380312019","BACKUP Y:\\Libs USING \\\\?\\GLOBALROOT\\Device\\HarddiskVolumeShadowCopy47\\ OPTIONS:ALT_PATH_PREFIX=c:\\VERITAS\\NetBackup\\temp\\_vrts_frzn_img_3200\"

What I would like is a group named jobFileList that contains, for each line:

"C:\\","Shadow Copy Components:\\","E:\\",""
"D:\\"
"E:\\","Shadow Copy Components:\\"
H:\\
Y:\\Libs

As you can see, I only want the file list. Sometimes it is the full text after the # mark, and sometimes it is surrounded by a lot of other text that I need to remove. I can't use a script in this case, so I need to do this with only ONE regexp; I can't do a string replace or other cleanup after the regex.

What I did is:

SomeText(#.*BACKUP (?P<jobFileList>.*?) .*)?(#(?P<jobFileList>.*))?

But it seems I can't use the same group name twice :( If I replace the second jobFileList with another name it works perfectly, but that is not what I need.

Thanks for your help,

EDIT: I can also have some lines like:

SomeText#/ahol5d72_1_2
SomeText#/p7ol4a1p_1_2
SomeText#Gvadag04SANDsk_Daily
SomeText#/bck_reco_a9ol5765_1_2_827497669

In all these cases I need all the text after the # mark.


Solution

A version which doesn't rely on the double quotes after the double backslash:

SomeText#(?:(.*?BACKUP) )?(?P<jobFileList>(?(1)[^ ]*|.*$))

This part, (?(1)[^ ]*|.*$), is a conditional group. I tested it on Python 2.7.5; the re documentation lists the construct as new in Python 2.4, so later versions should work too. If group 1 matched (that is, there's a BACKUP), it grabs all the non-space characters that follow; if there's no BACKUP, it grabs everything until the end of the string.

regex101 demo
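To make the conditional group concrete, here is a minimal Python sketch that runs the regex above against a few of the sample lines from the question. The helper name extract_job_file_list is hypothetical, and the sketch assumes the backslashes in the log lines are literally doubled, as they appear in the question.

```python
import re

# The regex from this answer, unchanged. Group 1 captures everything up to
# and including BACKUP when that keyword is present.
PATTERN = re.compile(r'SomeText#(?:(.*?BACKUP) )?(?P<jobFileList>(?(1)[^ ]*|.*$))')


def extract_job_file_list(line):
    """Return the jobFileList group for one line, or None if nothing matches.

    Hypothetical helper for illustration only.
    """
    m = PATTERN.search(line)
    return m.group('jobFileList') if m else None


# Sample lines copied from the question (raw strings keep the doubled backslashes).
samples = [
    r'SomeText#"D:\\"',
    r'SomeText#"E:\\","Shadow Copy Components:\\"',
    r'SomeText#"SET SNAP_ID=serv.a.x.com_1380312019","BACKUP Y:\\Libs USING \\\\?\\GLOBALROOT\\Device\\HarddiskVolumeShadowCopy47\\ OPTIONS:ALT_PATH_PREFIX=c:\\VERITAS\\NetBackup\\temp\\_vrts_frzn_img_3200\"',
    r'SomeText#/ahol5d72_1_2',
]

for line in samples:
    print(extract_job_file_list(line))
```

When BACKUP is absent, group 1 does not participate in the match, so the conditional falls through to the .*$ branch and the whole remainder of the line is captured.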

EDIT: As per comment, the regex that worked after @timmalos' modifications:

\#(?P<G>.*?[^E]BACKUP\s)?(?P<G2>f:\\\\Mailbox\\\)?(?P<jobFileList>(?(G)(?(G2)[^\]|\S)*|.*))

OTHER TIPS

It is possible to match this with a single regular expression; however, I know nothing about Splunk. Maybe this will help:

("?[A-Z]:\\\\(?:".+|\S+)?)

Live demonstration here
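As a rough check of the drive-letter regex above, the following Python sketch applies it to sample lines from the question. The helper name first_drive_list is hypothetical, and the sketch again assumes the backslashes are literally doubled in the input. Note that this pattern requires an uppercase drive letter, so it seems unlikely to cover the lines from the question's EDIT, which contain no drive letter.

```python
import re

# The regex from this tip, unchanged: an optional quote, an uppercase drive
# letter, a colon, two literal backslashes, then either a quoted tail or a
# run of non-space characters.
DRIVE_PATTERN = re.compile(r'("?[A-Z]:\\\\(?:".+|\S+)?)')


def first_drive_list(line):
    """Return the first match of the drive-letter regex, or None.

    Hypothetical helper for illustration only.
    """
    m = DRIVE_PATTERN.search(line)
    return m.group(1) if m else None


print(first_drive_list(r'SomeText#"C:\\","Shadow Copy Components:\\","E:\\",""'))
print(first_drive_list(r'SomeText#"SET SNAP_ID=serv.a.x.com_1380312019","BACKUP H:\\ USING \\\\?\\GLOBALROOT\\Device\\HarddiskVolumeShadowCopy47\\'))
print(first_drive_list(r'SomeText#/ahol5d72_1_2'))  # no drive letter, so None
```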

Licensed under: CC-BY-SA with attribution