Question

I have this string (which is just the cut out part of a larger string):

00777: 50.000 bit/s

and want to capture the 50.000 bit/s part. I've created a positive look-behind regex like this:

(?<=\d{5}: )\S+\s+\S+

This works, but, as expected, it fails when there is more than one space between the : and the number.

So I did this:

(?<=\d{5}:\s+)\S+\s+\S+

But that doesn't work?! Why? Even this expression doesn't match any string:

(?<=\d{0,5}).*

What is it that I'm missing here?

Solution

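This is because many regex engines don't support unbounded quantifiers (+, *) in a lookbehind; they need the lookbehind to have a fixed, or at least bounded, length.

Examples: Java has traditionally rejected a lookbehind without an obvious maximum length, and JavaScript had no lookbehind support at all before ES2018.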

EDIT

Since you are using Java, you can use a capturing group instead of a lookbehind:

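// Needs java.util.regex.Matcher and java.util.regex.Pattern; input is the string to search.
Matcher m = Pattern.compile("\\d{5}:\\s+(\\S+\\s+\\S+)").matcher(input);
if (m.find()) {
    value = m.group(1); // group 1 holds only the part after the prefix
}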
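
To make the capturing-group approach easier to try out, here is a minimal self-contained sketch. The class name CaptureGroupDemo and the helper extractRate are made up for illustration and are not part of the original answer; the pattern is the same one as above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name and helper method are illustrative only.
class CaptureGroupDemo {
    // Five digits, a colon, one or more whitespace characters, then the
    // two whitespace-separated tokens we actually want (captured as group 1).
    private static final Pattern RATE =
            Pattern.compile("\\d{5}:\\s+(\\S+\\s+\\S+)");

    // Returns the captured "number unit" text, or null if nothing matches.
    static String extractRate(String input) {
        Matcher m = RATE.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // One space and several spaces after the colon are both handled.
        System.out.println(extractRate("00777: 50.000 bit/s"));
        System.out.println(extractRate("00777:     50.000 bit/s"));
    }
}
```

Because the prefix is matched normally rather than inside a lookbehind, \s+ can be unbounded here; the price is that you read group 1 instead of the whole match.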

Other tips

In the first one you can allow a variable number of spaces with (?<=\d{5}: +), but, as the other answer says, your regex engine might not support it.
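
If you do want to keep the lookbehind in Java, one possible workaround is to give the quantifier an explicit upper bound, since Java accepts a lookbehind whose maximum length is known. The sketch below assumes at most 10 whitespace characters after the colon; that limit, the class name BoundedLookbehindDemo and the helper extractRate are all assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the upper bound of 10 is an arbitrary assumption.
class BoundedLookbehindDemo {
    // \s{1,10} gives the lookbehind a known maximum length, which Java's
    // lookbehind support relies on; \s+ has no such upper bound.
    private static final Pattern RATE =
            Pattern.compile("(?<=\\d{5}:\\s{1,10})\\S+\\s+\\S+");

    // Returns the whole match (the lookbehind itself consumes nothing),
    // or null if the pattern is not found.
    static String extractRate(String input) {
        Matcher m = RATE.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(extractRate("00777: 50.000 bit/s"));
        System.out.println(extractRate("00777:    50.000 bit/s"));
    }
}
```

Input with more whitespace than the chosen bound will not match, so pick a limit that comfortably covers your data.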

The last expression may be tripping over the . in the data: it is not part of the \d character class. You could use [\d\.] instead.

As a rule of thumb, I always start writing the simplest regex that will do it and I rely on data patterns that I believe will stay.

If you expect the unit to always be after the number you're after, and it will always be bit/s, there's no reason not to include it as a literal in your regex:

[\d\.]+ bit/s$

Then you can start to turn it into a more complex expression if you find exceptions in your data, like a unit with kbit/s:

(?<value>[\d\.]+) *(?<unit>\w+)/s$

Named capture groups make the parts easier and more readable to reference later, so you can multiply the value by the unit, etc.
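
As a rough sketch of how the named groups could be read back in Java (which supports the (?<name>...) syntax since Java 7): the class name NamedGroupDemo and the helper parseRate are invented for this example, and the value is kept as a string because the data does not say whether the . in 50.000 is a decimal point or a thousands separator.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class NamedGroupDemo {
    // (?<value>...) and (?<unit>...) are named capturing groups.
    private static final Pattern RATE =
            Pattern.compile("(?<value>[\\d.]+) *(?<unit>\\w+)/s$");

    // Returns {value, unit} as strings, or null if the input does not end
    // with a "<number> <unit>/s" pattern.
    static String[] parseRate(String input) {
        Matcher m = RATE.matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group("value"), m.group("unit") };
    }

    public static void main(String[] args) {
        String[] parts = parseRate("00777: 50.000 kbit/s");
        System.out.println(parts[0] + " / " + parts[1]);
    }
}
```

From here, a lookup from unit name to multiplier would be the natural next step, once you know how the number itself should be interpreted.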

In summary: don't use fancier features unless you really need them.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow