Multi-line regular expression and outputting to a file in Windows
Question
I have a log file that I need to extract specific patterns from. I need to find them and then write them, reformatted, into a new file. grep on Linux would usually do the trick, but the regular expression spans multiple lines, which I understand grep does not handle.
Here is an example from my log/debug file:
Da:
1.328 0.5045
Db:
0.6415 0.1192
Lambda:
0.4429 -0.35
-0.0461 -0.02421
seps:
0.714272
I'm looking for /Lambda:\n([-\d\.]+)\s+([\-\d\.]+)\s+\n([\-\d\.]+)\s+([\-\d\.]+)/
I then want to write the matches to a new file, dropping the Lambda header and putting the four numbers on a single line, so the output is \1\s\2\s\3\s\4\n
So I actually have two questions:
- Is there an easy utility to accomplish this, on any system?
- Is there a way to do this specifically on Windows?
I'm hoping there is a simple solution to this that has escaped me. I would rather stay on Windows, but if I have to go to Linux to get this done, I will.
Solution 3
Thanks for all the answers. I like the Perl and awk answers you gave me. I'm one of those weird programmers who doesn't know Perl, so I took the Ruby route. Here is my solution:
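# Arguments: LABEL input_file output_file
label = Regexp.escape(ARGV[0])
text = File.read(ARGV[1])
File.open(ARGV[2], "w") do |out|
  # Match the "LABEL:" header followed by one or more lines of numbers.
  text.scan(/#{label}:\s*(?:(?:[\d.\-]*\t*)+\n)+/) do |entry|
    puts entry   # echo the raw match to the console
    # Drop the header, turn every whitespace run into a tab, trim the ends.
    out.puts entry.sub(/#{label}:\s*/, '').gsub(/\s+/, "\t").strip
  end
end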
I can use this as a utility from my editor, Notepad++, through NppExec, which does not support redirection and piping, as far as I know. It also lets me collect any of the output that I need to diagnose my program. Thanks again, y'all.
OTHER TIPS
This is a good candidate for awk, perl and similar tools that support stateful parsing (these will run in Windows' CMD.EXE, provided you have perl and/or awk/sed in your PATH, as well as, of course, on Linux and other unices):
awk "/^Lambda/ { in_lambda=1 ; next } in_lambda && /^ *$/ { in_lambda=0 ; printf \"\n\" ; next } in_lambda { printf \"%s \", $0 }" input_file >output_file
or
perl -ne "chomp; if (/^Lambda/) { $in_lambda = 1 } elsif ($in_lambda && /^ *$/) { $in_lambda=0 ; printf \"\n\" } elsif ($in_lambda) { printf \"%s \", $_ }" input_file >output_file
You can perform a second pass to normalize whitespace (and trim whitespace at the end of the lines) if needed.
awk "/^Lambda/ { in_lambda=1 ; next } in_lambda && /^ *$/ { in_lambda=0 ; printf \"\n\" ; next } in_lambda { printf \"%s \", $0 }" input_file | sed -e "s:  *: :g" -e "s: *$::" >output_file
or
perl -ne "chomp; if (/^Lambda/) { $in_lambda = 1 } elsif ($in_lambda && /^ *$/) { $in_lambda=0 ; printf \"\n\" } elsif ($in_lambda) { printf \"%s \", $_ }" input_file | perl -ne "s/ +/ /g; s/ +$//g; print" >output_file
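For comparison, the same stateful idea can be written so that collecting the block and normalizing whitespace happen in one pass. This is only a rough Ruby sketch, not a drop-in replacement for the one-liners above. It assumes a Lambda block ends at the first line that is not purely numeric (the one-liners end it at a blank line), it does not handle scientific notation, and the names `NUMERIC_LINE` and `extract_lambda` are made up for illustration.

```ruby
# Sketch: collect the numeric lines that follow each "Lambda:" header
# and return them as one space-separated string per block.
# Assumption: a block ends at the first line that is not purely numeric.
NUMERIC_LINE = /\A\s*-?\d*\.?\d+(\s+-?\d*\.?\d+)*\s*\z/

def extract_lambda(text)
  rows = []
  current = nil                       # nil means "not inside a Lambda block"
  text.each_line do |line|
    if line.start_with?("Lambda:")
      # A new header closes any block that is still open.
      rows << current.join(" ") if current && !current.empty?
      current = []
    elsif current && line =~ NUMERIC_LINE
      current.concat(line.split)      # split also normalizes whitespace
    elsif current
      # First non-numeric line: the block is finished.
      rows << current.join(" ") unless current.empty?
      current = nil
    end
  end
  # Flush a block that runs to the end of the input.
  rows << current.join(" ") if current && !current.empty?
  rows
end
```

To write the result to a file, something like `File.write("output_file", extract_lambda(File.read("input_file")).join("\n") + "\n")` should do.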
You could install Perl or Python or Ruby or PHP and write the script fairly easily.
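As a rough idea of what such a script might look like, here is a minimal Ruby sketch that reads the whole input into memory and applies a multi-line pattern to it. The pattern is adapted from the one in the question rather than copied: it allows optional trailing spaces and Windows line endings, both of which are assumptions about the log format. `NUM`, `LAMBDA_BLOCK` and `lambda_rows` are hypothetical names.

```ruby
# Sketch: match every two-line "Lambda:" block in the text and return
# its four numbers joined by single spaces.
NUM = /-?[\d.]+/
# [ \t]* tolerates trailing blanks; \r?\n accepts both LF and CRLF.
LAMBDA_BLOCK = /^Lambda:[ \t]*\r?\n(#{NUM})\s+(#{NUM})[ \t]*\r?\n(#{NUM})\s+(#{NUM})/

def lambda_rows(text)
  # With capture groups, scan returns one array of captures per match.
  text.scan(LAMBDA_BLOCK).map { |captures| captures.join(" ") }
end
```

Because the whole file is read as one string, the regular expression is free to span lines, which is the part grep does not do by default. Writing the rows out could be as simple as `File.write("output_file", lambda_rows(File.read("input_file")).join("\n") + "\n")`.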