Question

I have the following string:

23434 5465434

58495 / 46949345

58495 - 46949345

58495 / 55643

d 44444 ssdfsdf

64784

45643 dfgh

58495/55643

48593/48309596

675643235

34565435 34545

I only want to extract the bold ones. Each is a five-digit number (German). It should not match telephone numbers such as 43564 366334 or 45433 / 45663, as in my example above.

I tried something like ^\b\d{5}, but that's not a good start.

Any hints on how to get this working?

Thanks for any hints.

Solution

You could add a negative look-ahead assertion to avoid matching the phone numbers:

\b[0124678][0-9]{4}\b(?!\s?[ \/-]\s?[0-9]+)
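
A quick way to check this pattern is to run it over the sample lines from the question. The Ruby sketch below does that with `String#scan`; the method name `codes_without_phone_suffix` is made up for illustration, and the regex is the one above, unchanged.

```ruby
# Returns every five-digit number that is NOT followed by a
# separator (space, slash, or hyphen) and more digits.
def codes_without_phone_suffix(text)
  text.scan(/\b[0124678][0-9]{4}\b(?!\s?[ \/-]\s?[0-9]+)/)
end

# Sample lines from the question, joined into one string.
sample = [
  "23434 5465434",
  "58495 / 46949345",
  "58495 - 46949345",
  "58495 / 55643",
  "d 44444 ssdfsdf",
  "64784",
  "45643 dfgh",
  "58495/55643",
  "48593/48309596",
  "675643235",
  "34565435 34545"
].join("\n")

p codes_without_phone_suffix(sample)  # => ["44444", "64784", "45643"]
```

Note that several of the rejected numbers (for example 55643 and 34545) are filtered out only because the leading class `[0124678]` excludes their first digit, not because of the look-ahead.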

If you're using Ruby 1.9, you can add a negative look-behind assertion as well.
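
The look-ahead alone rejects only the first half of a pair such as `45433 / 45663`; the second number has nothing after it, so it still matches. The sketch below is one possible way to add a look-behind, not the answerer's own pattern. As far as I know, Ruby's look-behind does not accept variable-length pieces such as `\s?`, so it uses two separate assertions instead. The method name `standalone_codes` is hypothetical.

```ruby
# Same pattern as above, plus two negative look-behinds that reject a
# number preceded by "digit + separator" (as in "45433/45663") or by
# "digit, space, slash-or-hyphen, space" (as in "45433 / 45663").
def standalone_codes(text)
  text.scan(/(?<![0-9][ \/-])(?<![0-9] [\/-] )\b[0124678][0-9]{4}\b(?!\s?[ \/-]\s?[0-9]+)/)
end

p standalone_codes("45433 / 45663")    # => []
p standalone_codes("45433/45663")      # => []
p standalone_codes("d 44444 ssdfsdf")  # => ["44444"]
```

Without the look-behinds, the first call would return `["45663"]`.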

Altri suggerimenti

You haven't specified what distinguishes the number you're trying to search for.

Based on the example string you gave, it looks like you just want: ^(\d{5})\n

This matches lines that start with 5 digits and contain nothing else.

You might want to permit some spaces after the first 5 digits (but nothing else): ^(\d{5})\s*\n
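
A minimal Ruby sketch of this whole-line variant, assuming the input is one multi-line string. The method name `bare_five_digit_lines` is invented for the example.

```ruby
# Collects the codes from lines that hold five digits, optional
# trailing whitespace, and nothing else.
def bare_five_digit_lines(text)
  text.scan(/^(\d{5})\s*\n/).flatten
end

p bare_five_digit_lines("64784\n45643 dfgh\n12345  \n")  # => ["64784", "12345"]
```

Because the pattern ends in a literal `\n`, a final line without a trailing newline is not matched; replacing `\n` with `$` is one way around that.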

Here's a regex for German postal codes (Source):

^([0124678][0-9]{4})$
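
For illustration, here is how that anchored pattern behaves when applied line by line in Ruby. The helper name `postal_code_lines` is an assumption of this sketch.

```ruby
# Keeps only the lines that consist of exactly five digits whose
# first digit is one of 0, 1, 2, 4, 6, 7, 8.
def postal_code_lines(text)
  text.each_line.map(&:chomp).grep(/^([0124678][0-9]{4})$/)
end

p postal_code_lines("64784\n45643 dfgh\n50667\n")  # => ["64784"]
```

Note that the leading class `[0124678]` leaves out 3, 5, and 9, so real codes such as 50667 (Cologne) are rejected; swapping it for `[0-9]` may be closer to what is wanted.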

I'm not completely sure about the specified rules. But if you want lines that start with 5 digits and do not contain additional digits, this may work:

^(\d{5})[^\d]*$

If leading white space is okay, then:

^\s*(\d{5})[^\d]*$
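
A small Ruby sketch of the leading-whitespace variant, applied to one line at a time. The method name `leading_code` is hypothetical.

```ruby
# Returns the five digits at the start of the line (after optional
# whitespace) if no other digit follows; otherwise returns nil.
def leading_code(line)
  line[/^\s*(\d{5})[^\d]*$/, 1]
end

p leading_code("  45643 dfgh")     # => "45643"
p leading_code("23434 5465434")    # => nil
p leading_code("d 44444 ssdfsdf")  # => nil
```

The last call returns nil because the line starts with a letter, which this pattern does not allow before the digits.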

Here is the Rubular link that shows the result.

^\D*(\d{5})(\s(\D)*$|()$)

This should (it's untested) match:

  • line starting with five digits (or some non-digits and then five digits), then a space, and ending with some non-numbers
  • line starting and ending with five digits (or some non-digits and then five digits)

\1 would be the five digits

\2 would be the whole second half, if any

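\3 would be the last non-digit character after the digits, if any, because a repeated group such as (\D)* keeps only its final iteration; writing (\D*) instead would capture the whole word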
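
To see what the groups actually hold, the pattern can be tried in Ruby. This is only a sketch, and `code_groups` is a name made up here. One thing it shows is that group 3 ends up with a single character, since a repeated capture group keeps only its last iteration.

```ruby
# Returns [group 1, group 2, group 3] for a matching line, or nil.
def code_groups(line)
  m = line.match(/^\D*(\d{5})(\s(\D)*$|()$)/)
  m && [m[1], m[2], m[3]]
end

p code_groups("45643 dfgh")     # => ["45643", " dfgh", "h"]
p code_groups("64784")          # => ["64784", "", nil]
p code_groups("23434 5465434")  # => nil
```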

Edited to fit the asker's edited question.

Edit again: I came up with a much more elegant solution:

^\D*(\d{5})\D*$
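
As a rough check, this last pattern can be applied to each line of the question's sample in Ruby. The method name `extract_codes` is an assumption of the sketch. It works line by line because `\D` also matches a newline, so on one big string `\D*` could run across line boundaries.

```ruby
# Returns the five-digit number from every line that contains
# exactly one run of digits, and that run is five digits long.
def extract_codes(text)
  text.each_line.map { |line| line.chomp[/^\D*(\d{5})\D*$/, 1] }.compact
end

sample = [
  "23434 5465434",
  "58495 / 46949345",
  "58495 - 46949345",
  "58495 / 55643",
  "d 44444 ssdfsdf",
  "64784",
  "45643 dfgh",
  "58495/55643",
  "48593/48309596",
  "675643235",
  "34565435 34545"
].join("\n")

p extract_codes(sample)  # => ["44444", "64784", "45643"]
```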
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow