Regex Parsing Beginning of Line
21-12-2019
Question
I have a string that I would like to parse with a regular expression. The prefix .. marks a category name, and everything after the : is the content for that category.
Below is the full string I'm trying to parse:
..NAME: JOHN
..BDAY: 1/1/2010
..NOTE: 1. some note 1
2. some note 2
3. some note 3
..DATE: 6/3/2014
I'm trying to parse it so that
(group 1)
..NAME: JOHN
(group 2)
..BDAY: 1/1/2010
(group 3)
..NOTE: 1. some note 1
2. some note 2
3. some note 3
(group 4)
..DATE: 6/3/2014 //a.k.a update date
The regular expression pattern I use is
\.\.[A-Z0-9]{2,4}:.*
which makes (group 3) just ..NOTE: 1. some note 1, missing the content on the second and third lines.
How can I modify my pattern so I can get the correct grouping?
Solution
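By default, . matches any character except a newline, so your .* stops at the end of the first line. To let . match newlines as well, use RegexOptions.Singleline in C# (or the s modifier in PCRE; Ruby calls the equivalent option m).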
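To illustrate, here is a minimal sketch in Python, assuming its re module, where re.DOTALL is the counterpart of Singleline. The variable names are made up for this example. It shows the original pattern stopping at each line end, and the same pattern swallowing the whole string once . is allowed to match newlines.

```python
import re

# The sample string from the question.
text = (
    "..NAME: JOHN\n"
    "..BDAY: 1/1/2010\n"
    "..NOTE: 1. some note 1\n"
    "2. some note 2\n"
    "3. some note 3\n"
    "..DATE: 6/3/2014"
)

# The pattern from the question, unchanged.
pattern = r"\.\.[A-Z0-9]{2,4}:.*"

# Default mode: '.' stops at each newline, so NOTE keeps only its first line.
default_matches = re.findall(pattern, text)

# re.DOTALL: '.' also matches newlines, so the greedy '.*' runs to the
# end of the string and everything ends up in a single match.
dotall_matches = re.findall(pattern, text, flags=re.DOTALL)

print(default_matches)
print(len(dotall_matches))
```

So switching the option on is only half of the fix: the greedy quantifier still has to be stopped at the next category.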
You will also need to make your .* lazy, so that it matches only up to the next .. or the end of the string ($) instead of consuming everything on the first match. Also, . has no special meaning inside a character class, so your expression may end up looking cleaner like this:
[.]{2}[A-Z0-9]{2,4}:.*?(?=[.]{2}|$)
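A minimal sketch of this expression in Python, assuming the re module with re.DOTALL standing in for Singleline (the names text, pattern and groups are only for illustration):

```python
import re

# The sample string from the question.
text = (
    "..NAME: JOHN\n"
    "..BDAY: 1/1/2010\n"
    "..NOTE: 1. some note 1\n"
    "2. some note 2\n"
    "3. some note 3\n"
    "..DATE: 6/3/2014"
)

# Lazy '.*?' expands only until the lookahead sees the next '..'
# or the end of the string. re.DOTALL lets '.' cross newlines.
pattern = re.compile(r"[.]{2}[A-Z0-9]{2,4}:.*?(?=[.]{2}|$)", re.DOTALL)

# Each match includes the newline just before the next '..',
# so strip it from the end.
groups = [m.group(0).rstrip("\n") for m in pattern.finditer(text)]

for g in groups:
    print(repr(g))
```

Note that the lookahead stops at any two consecutive dots, so this sketch assumes the content itself never contains "..".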
Other suggestions
I managed to achieve it with a negative lookahead for [.]{2}:
[.]{2}[A-Z0-9]{2,4}:(.*\n?(?![.]{2}))*
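For comparison, a minimal sketch of this variant in Python, assuming the re module (variable names are illustrative). No Singleline/DOTALL option is needed here, because the pattern consumes each newline explicitly with \n? and only continues while the next line does not start with two dots.

```python
import re

# The sample string from the question.
text = (
    "..NAME: JOHN\n"
    "..BDAY: 1/1/2010\n"
    "..NOTE: 1. some note 1\n"
    "2. some note 2\n"
    "3. some note 3\n"
    "..DATE: 6/3/2014"
)

# Default flags: '.' does not match newlines. The repeated group takes
# one line at a time and refuses to step onto a line starting with '..'.
pattern = re.compile(r"[.]{2}[A-Z0-9]{2,4}:(.*\n?(?![.]{2}))*")

# group(0) is the whole category entry; the inner group only holds
# the last repetition, so it is not used here.
groups = [m.group(0) for m in pattern.finditer(text)]

for g in groups:
    print(repr(g))
```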