Question

I want to match a string that may begin with a prefix ending in a semicolon. The prefix is optional and should not be captured, since it is not needed, but the rest of the string that follows it should be captured.

Text1: transfer from source not possible; snapmirror may be misconfigured, the source volume may be busy or unavailable.

Text2: snapmirror may be modified, the destination volume is unavailable.

Desired OUTPUT:

snapmirror may be misconfigured, the source volume may be busy or unavailable

snapmirror may be modified, the destination volume is unavailable

I want my regex to look for the 'transfer from source not possible' or any string that occurs in that way before a semi-colon and I want my regex not to capture this as a group.

Also, I want to capture everything that occurs after a semi-colon till the end.

Regex tried: (?:.*;)? (.+)\..*

The above regex works for Text1 but not for Text2. Can anyone help me fix this, please?


Solution

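Your regex is close. The problem is the literal space after the optional group: it is required even when there is no prefix, so Text2, which does not start with a space, cannot match from the beginning of the string. Before fixing it, a few questions are worth settling: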

  • Should the match always start at the beginning of the string? (That's a trick question; if you can't make that assumption, the question makes no sense at all.)

  • Will there ever be more than one semicolon? If so, do you want the non-capturing part to extend only up to the first one, or to the last one?

  • Should it always end at the end of the string, or do you only want to match up to the period? Can there ever be more than one period?
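The semicolon question matters because a lazy prefix (.*?;) stops at the first semicolon, while a greedy one (.*;) runs to the last. A minimal sketch in Python, assuming the standard re module; the sample string and variable names are made up for illustration and do not come from the question:

```python
import re

# Hypothetical input with two semicolons, invented for this illustration.
sample = "step one failed; step two failed; check the source volume."

# Lazy prefix: the non-capturing group stops at the FIRST semicolon.
first = re.match(r"^(?:.*?;\s*)?(.+)\..*$", sample)

# Greedy prefix: the non-capturing group extends to the LAST semicolon.
last = re.match(r"^(?:.*;\s*)?(.+)\..*$", sample)

print(first.group(1))  # step two failed; check the source volume
print(last.group(1))   # check the source volume
```

Which behavior is right depends on the data; with at most one semicolon per message, the two patterns capture the same text.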

This minimally changed regex seems to do what you want. It moves the whitespace after the semicolon into the optional group as \s*, and the added anchors will probably improve performance enough to meet your needs:

^(?:.*?;\s*)?(.+)\..*$
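A quick way to check the pattern against both sample texts, sketched in Python under the assumption that the standard re module behaves like your flavor for these constructs. The helper name capture is hypothetical; the question's original pattern is included only for comparison:

```python
import re

# The pattern from the question: the space after the optional group is always required.
ORIGINAL = re.compile(r"(?:.*;)? (.+)\..*")
# The revised pattern: anchored, with the whitespace inside the optional group.
REVISED = re.compile(r"^(?:.*?;\s*)?(.+)\..*$")

TEXT1 = ("transfer from source not possible; snapmirror may be misconfigured, "
         "the source volume may be busy or unavailable.")
TEXT2 = "snapmirror may be modified, the destination volume is unavailable."


def capture(pattern, text):
    """Return group 1 of a match starting at the beginning of text, or None."""
    m = pattern.match(text)
    return m.group(1) if m else None


for text in (TEXT1, TEXT2):
    print("original:", capture(ORIGINAL, text))
    print("revised: ", capture(REVISED, text))
```

With the original pattern, Text2 yields None because the required space cannot match at the start of the string; the revised pattern returns the desired output for both texts.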

This one should be considerably faster; replacing .* with negated character classes ([^;]* and [^.]+) almost completely eliminates backtracking:

^(?:[^;]*;\s*)?([^.]+)
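One trade-off of the negated-class version is that the capture now ends at the first period rather than the last. A small Python sketch, again assuming the standard re module; the "retry in 1.5 seconds." string is a hypothetical input, not one from the question:

```python
import re

# Negated character classes: [^;]* cannot run past a semicolon, [^.]+ cannot run past a period.
FAST = re.compile(r"^(?:[^;]*;\s*)?([^.]+)")

TEXT1 = ("transfer from source not possible; snapmirror may be misconfigured, "
         "the source volume may be busy or unavailable.")
TEXT2 = "snapmirror may be modified, the destination volume is unavailable."

# Both sample texts contain a single trailing period, so the capture is the desired output.
results = [FAST.match(text).group(1) for text in (TEXT1, TEXT2)]
for captured in results:
    print(captured)

# Hypothetical input with a period in the middle: the capture stops at the first period.
dotted = FAST.match("retry in 1.5 seconds.").group(1)
print(dotted)  # retry in 1
```

If messages can contain periods before the final one, the earlier pattern with (.+)\..*$ may be the safer choice despite the extra backtracking.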

If you're using a regex flavor that supports them (for example Java, PCRE, or Python 3.11 and later), atomic groups and possessive quantifiers can make it even faster:

^(?>[^;]*+;\s*+)?+([^.]++)
Licensed under: CC-BY-SA with attribution