Question
I'm sure this one is easy, but I've tried a ton of variations and still can't match what I need. The pattern is being too greedy and I can't get it to stop being greedy.
Given the text:
test=this=that=more text follows
I want to just select:
test=
I've tried the following regex
(\S+)=(\S.*)
(\S+)?=
[^=]{1}
...
Thanks all.
Solution
here:
// matches "test=" and captures "test"
(\S+?)=
or
// also matches "test=" and captures "test"
(\S[^=]+)=
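You should consider using the second version over the first. Given your string "test=this=that=more text follows", version 1 starts with a single character, "t", and checks whether the next character is "=". It is not, so the lazy quantifier extends the match by one character and checks again, repeating this step until it reaches "test=". A greedy (\S+)= does the opposite: it first consumes "test=this=that=more", then backtracks one character at a time until an "=" can follow, and settles on "test=this=that=" as its final answer.
Version 2 consumes "test" in a single pass, because [^=] cannot run past an "=", then matches the "=" and stops. You can see the efficiency gains in larger searches like multi-line or whole-document matches. Note that [^=] also matches whitespace, so on other inputs the two versions can capture different text.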
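To check how the greedy, lazy, and negated-class patterns behave on the string from the question, here is a minimal sketch using Python's re module. The choice of Python and the variable names are assumptions; the question does not say which regex engine is in use.

```python
import re

text = "test=this=that=more text follows"

# Greedy: \S+ grabs as much as it can, then backtracks to the last "=".
greedy = re.search(r"(\S+)=", text)

# Lazy: \S+? grows one character at a time and stops at the first "=".
lazy = re.search(r"(\S+?)=", text)

# Negated class: [^=]+ cannot cross an "=", so it also stops at the first one.
negated = re.search(r"(\S[^=]+)=", text)

print(greedy.group(0))   # test=this=that=
print(lazy.group(0))     # test=
print(negated.group(0))  # test=
```

Both the lazy and the negated-class pattern capture "test" in group 1 here; they differ only in how the engine gets there.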
OTHER TIPS
You probably want something like
^(\S+?=)
The caret ^ anchors the regex to the beginning of the string. The ? after the + makes the + non-greedy.
You might be looking for the lazy quantifiers *?, +?, ??, and {n,m}?
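As a rough illustration of what each lazy quantifier does, the sketch below compares it with its greedy counterpart on a made-up sample string "aaaa" (not from the question), again assuming Python's re module.

```python
import re

sample = "aaaa"  # hypothetical input, chosen only to show the difference

# Each pair is (greedy pattern, lazy pattern).
pairs = [
    (r"a*", r"a*?"),
    (r"a+", r"a+?"),
    (r"a?", r"a??"),
    (r"a{2,4}", r"a{2,4}?"),
]

results = {}
for greedy_pat, lazy_pat in pairs:
    greedy_match = re.match(greedy_pat, sample).group(0)
    lazy_match = re.match(lazy_pat, sample).group(0)
    results[lazy_pat] = (greedy_match, lazy_match)
    print(f"{greedy_pat!r} -> {greedy_match!r}   {lazy_pat!r} -> {lazy_match!r}")
```

The lazy forms take the fewest repetitions that still let the overall pattern succeed, so on their own they match as little as their lower bound allows.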
You should be able to use this:
(\S+?)=(\S.*)
Lazy quantifiers work, but they also can be a performance hit because of backtracking.
Consider that what you really want is "a bunch of non-equals, an equals, and a bunch more non-equals."
([^=]+)=([^=]+)
Your example [^=]{1} matches only a single non-equals character.
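The "non-equals, equals, non-equals" idea can be tried out as follows. This is a sketch assuming Python's re module; the last pattern, [^=]+=, is a shortened variant added here for illustration and is not part of the answer above.

```python
import re

text = "test=this=that=more text follows"

# Two runs of non-equals characters separated by a single "=".
pair = re.search(r"([^=]+)=([^=]+)", text)
print(pair.group(1))  # test
print(pair.group(2))  # this

# [^=]{1} is the same as [^=]: exactly one non-equals character.
single = re.search(r"[^=]{1}", text)
print(single.group(0))  # t

# Illustrative variant: keep only the first key and its "=".
key_only = re.match(r"[^=]+=", text)
print(key_only.group(0))  # test=
```

Because [^=]+ stops at every "=", no backtracking is needed to find the end of each field.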
If you want only "test=", I think a simple:
^(\w+=)
should be fine, as long as you are sure that "test=" will always start the line.
The real problem is when the string is like this:
this=that= more test= text follows
If you use the regex above, the result is "this=", and if you add a repetition quantifier at the end, like this:
^(\w+=)*
you get the much longer "this=that=", so the only thing I can think of is the trivial:
[th\w+=]*test=
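The three patterns from this answer can be compared on the second sample string with a short sketch, assuming Python's re module. Note that the square brackets in the last pattern form a character class, so in this engine it ends up behaving like a plain search for the literal "test=".

```python
import re

line = "this=that= more test= text follows"

# Anchored, single key: stops after the first "word=".
first = re.search(r"^(\w+=)", line)
print(first.group(0))  # this=

# Anchored, repeated: swallows every consecutive "word=" at the start.
repeated = re.search(r"^(\w+=)*", line)
print(repeated.group(0))  # this=that=

# Character class followed by the literal "test=".
literal = re.search(r"[th\w+=]*test=", line)
print(literal.group(0))  # test=
print(literal.span())    # (16, 21)
```

If the key name is known in advance, searching for it directly may be the simplest option.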
Bye.