Question

I am a Python newbie and want to learn it the hard way. I am writing a function to extract content between patterns. The log file is structured as follows:

<Time-stamp>[Begin cache] <...Some content>
<Time-stamp>. 
<Time-stamp>.
<Time-stamp>.
<Time-stamp>.
<Time-stamp>[ERROR] <..Some content>
<Time-stamp>.
<Time-stamp>.
<Time-stamp>.
<Time-stamp>.
<Time-stamp>[End cache] <....some content>
<Time-stamp>.
<Time-stamp>.
<Time-stamp>.
<Time-stamp>[Begin cache] <... Some content>
<Time-stamp>.
<Time-stamp>.
<Time-stamp>.
<Time-stamp>[End cache] <... Some content>

I am interested in extracting the part between Begin cache and End cache only if there is an ERROR pattern between them. The code I have written so far gets me nowhere near that goal. My logic is to find the positions of the Begin cache and End cache patterns, check whether an ERROR tag is present between them, and print the part of the file between those positions. Any help would be greatly appreciated.

import re
import os
import mmap
File="\\\\XXXXX\c$\EGO\soam\work\XXXX_20140307.03\dal_XXXX_YYYY_20140320_110536_21_6508.log"
with open(File,"r") as file:
    m=mmap.mmap(file.fileno(),0,access=mmap.ACCESS_READ)
    mpattern="\[ERROR\]"
    spattern="Begin cache"
    epattern="End cache"
    mregexp=re.compile(mpattern)
    sregexp=re.compile(spattern)
    eregexp=re.compile(epattern)
    for match in sregexp.finditer(m):
        epos=eregexp.match(m,match.end())
        if mregexp.match(m,match.end(),epos):
            print("%s"%(m,match.start(),epos))

I would also like some good tutorials for a fast start with this incredibly simple yet confusing language.

Solution

You can simply scan your log file for [ERROR] and pull out the text you need; the regex is used only to split the data read from the log file. I suggest the following method:

Edit after the data format changed:

Use the regex \[[^R]\w+\s\w+\] to split the data on the [Begin cache] and [End cache] tags, then print the pieces that contain [ERROR], as in this example:

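import re

# Read the whole log file into memory
with open('logfile', 'r') as f:
    data = f.read()

# Split on two-word bracketed tags such as [Begin cache] and [End cache];
# the single-word [ERROR] tag does not match, so it stays inside its piece
mylist = re.split(r'\[[^R]\w+\s\w+\]', data)
for item in mylist:
    if '[ERROR]' in item:
        print(item)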
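
If whole blocks are wanted, from the Begin cache line through the End cache line, one alternative is to read the log line by line and keep a block only when an [ERROR] line shows up inside it. This is a minimal sketch, assuming the tags appear literally as [Begin cache], [End cache] and [ERROR] and that blocks are never nested; error_blocks and the sample text are made-up names for illustration, not part of the original log or code:

```python
BEGIN = "[Begin cache]"
END = "[End cache]"
ERROR = "[ERROR]"


def error_blocks(lines):
    """Yield each Begin cache..End cache block that contains an [ERROR] line."""
    block = None        # None means we are currently outside a cache block
    has_error = False
    for line in lines:
        if BEGIN in line:
            # Start collecting a new block
            block = [line]
            has_error = False
        elif block is not None:
            block.append(line)
            if ERROR in line:
                has_error = True
            if END in line:
                # Block is complete; keep it only if an error was seen
                if has_error:
                    yield "".join(block)
                block = None


# Hypothetical sample data standing in for the real log file
sample = """\
t1 [Begin cache] a
t2 .
t3 [ERROR] b
t4 [End cache] c
t5 .
t6 [Begin cache] d
t7 .
t8 [End cache] e
"""

for found in error_blocks(sample.splitlines(keepends=True)):
    print(found, end="")
```

For a real log, an open file object can be passed in place of the sample lines, since iterating over a file yields one line at a time and avoids reading the whole file into memory.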

Edit:

Some places to help you learn more Python:

learnpython.org

python.org/tutorial

sthurlow.com/python

Hope this helped.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow