import io
import re
import sys
file = io.StringIO('''
title|Head1|Head2|Head3|head4
----|------|-----|-----|
1|1150976|0|25300992|bfa92720/bfa924f8|su|(None, None)
2|1150976|0|25300992|bfa92720/bfa924f8|su|(None, None)
3|1150976|0|25300992|bfa92720/bfa924f8|su|(None, None)
4|1150976|0|25300992|bfa92720/bfa924f8|su|(None, None)
5|1150976|0|25300992|bfa92720/bfa924f8|su|(None, None)
All|processes:|MemAlloc|=|408125440|(None, None)|0.0.0.0
|(None, None)
0.0.0.0 ,text
''')
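# re.match anchors at the start of each line, so only lines beginning with digits followed by "|" pass
sys.stdout.writelines(line for line in file if re.match(r'\d+\|', line))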
Python 3: Regex matching 2 separate conditions
01-06-2022
Question
I'm trying to work out the best way to print only the numbered lines. The code is only partially complete, and since I'm still new to regex in general I may not be using the right method or syntax. Individually, the re.match calls work fine; it's when I combine them that I get unwanted results:
Sample string:
file = '''
title|Head1|Head2|Head3|head4
----|------|-----|-----|
1|1150976|0|25300992|bfa92720/bfa924f8|su|(None, None)
2|1150976|0|25300992|bfa92720/bfa924f8|su|(None, None)
3|1150976|0|25300992|bfa92720/bfa924f8|su|(None, None)
4|1150976|0|25300992|bfa92720/bfa924f8|su|(None, None)
5|1150976|0|25300992|bfa92720/bfa924f8|su|(None, None)
All|processes:|MemAlloc|=|408125440|(None, None)|0.0.0.0
|(None, None)
0.0.0.0 ,text
'''
import re
for line in file:
    pat= re.match('(^[A-Z][a-z])|(^--.+)',line) # or use re.match('^[0-9]',line) and match pat != None
    patIP = re.match ('^{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}',line)#
    if patIP == None or pat == None:
        print(line)
I'm stuck on the logic for printing only the numbered lines, and I may be completely off. Keep in mind that I don't want to print the 0.0.0.0 (IP address) line.
Desired output:
1|1150976|0|25300992|bfa92720/bfa924f8|su|(None, None)
2|1150976|0|25300992|bfa92720/bfa924f8|su|(None, None)
3|1150976|0|25300992|bfa92720/bfa924f8|su|(None, None)
4|1150976|0|25300992|bfa92720/bfa924f8|su|(None, None)
5|1150976|0|25300992|bfa92720/bfa924f8|su|(None, None)
Solution
OTHER TIPS
You can try this:
import re
file = '''
title|Head1|Head2|Head3|head4
----|------|-----|-----|
1|1150976|0|25300992|bfa92720/bfa924f8|su|(None, None)
2|1150976|0|25300992|bfa92720/bfa924f8|su|(None, None)
3|1150976|0|25300992|bfa92720/bfa924f8|su|(None, None)
4|1150976|0|25300992|bfa92720/bfa924f8|su|(None, None)
5|1150976|0|25300992|bfa92720/bfa924f8|su|(None, None)
All|processes:|MemAlloc|=|408125440|(None, None)|10.93.103.73|(None, None)
0.0.0.0 ,text
'''
matches = re.findall(r'^\d+\|.*$', file, re.MULTILINE)
for match in matches:
    print(match)  # print is a function in Python 3
When you use multiline mode (re.MULTILINE), ^ and $ match at the beginning and end of each line, rather than only at the beginning and end of the whole string.
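To see what the flag changes, here is a minimal sketch that runs the same pattern with and without re.MULTILINE. The sample text is a shortened stand-in for the data in the question, and the variable names (text, pattern, without_flag, with_flag) are only for illustration.

```python
import re

# Shortened sample in the same shape as the question's data (illustrative only).
text = """title|Head1|Head2
----|------|-----
1|1150976|0
2|1150976|0
0.0.0.0 ,text
"""

pattern = r'^\d+\|.*$'

# Without re.MULTILINE, ^ can only match at the very start of the string.
# The string starts with "title", so nothing is found.
without_flag = re.findall(pattern, text)

# With re.MULTILINE, ^ and $ also match at the start and end of every line.
# The IP line is still skipped because "0" is followed by "." and not "|".
with_flag = re.findall(pattern, text, re.MULTILINE)

print(without_flag)  # []
print(with_flag)     # ['1|1150976|0', '2|1150976|0']
```

Requiring the pipe right after the digits is what keeps the 0.0.0.0 line out, so no separate IP address check should be needed for data shaped like this.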
Licensed under: CC-BY-SA with attribution