Question

new Regex(@"^[a-zA-Z]+\b +\b[a-zA-Z]?\b +\b[a-zA-Z]+$")

This matches:

John Smith
John B Goode

I am trying to modify this regex for the following cases:

some text before 12359 (John B? Goode) 10249?

That is, sometimes the name comes after a number near the end of the string, and it is optionally followed by a final number at the very end.

I have tried

new Regex(@"^|[0-9]+([a-zA-Z]+\b +\b[a-zA-Z]?\b +\b[a-zA-Z]+) *[0-9]*?$")

but that does not work because:

  1. the ^|[0-9]+ no longer matches the beginning of the line, only numbers
  2. the group is always an empty string, and the regex still matches something like sometext 12354 (the first number must not be at the end of the line).

Update

This is all water under the bridge, because I found more names at the end of the lines of data, so this approach will not work.

However, the cause of my problem was that I had not put the OR (the alternation) in a group.


Solution

You need parentheses around the alternation:

(^|[0-9]+)

Your expression is equivalent to this:

new Regex(@"^|()")

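Alternation has the lowest precedence in a regex, so without the parentheses the | splits the entire pattern into two alternatives: ^ on its own, or everything after the |. The first alternative succeeds immediately at the start of the string, so the regex always returns an empty match there and the capture group stays empty.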
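
To see the precedence issue in isolation, here is a minimal sketch. The two patterns are simplified stand-ins (a single word takes the place of the full name pattern), and the class and method names are made up for illustration; they are not from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AlternationDemo
{
    // Simplified, hypothetical patterns: one word stands in for the name.
    // Ungrouped: parsed as "^" OR "[0-9]+ ([a-zA-Z]+)$".
    public static readonly Regex Ungrouped = new Regex(@"^|[0-9]+ ([a-zA-Z]+)$");

    // Grouped: the alternation is limited to "start of string OR number + space".
    public static readonly Regex Grouped = new Regex(@"(?:^|[0-9]+ )([a-zA-Z]+)$");

    // Returns "index:length:group1" for the first match, or "no match".
    public static string Describe(Regex regex, string input)
    {
        Match m = regex.Match(input);
        return m.Success ? $"{m.Index}:{m.Length}:{m.Groups[1].Value}" : "no match";
    }

    public static void Main()
    {
        const string input = "abc 123 John";
        Console.WriteLine(Describe(Ungrouped, input)); // 0:0:      (empty match at the start, empty group)
        Console.WriteLine(Describe(Grouped, input));   // 4:8:John  (matches "123 John", captures "John")
    }
}
```

The ungrouped version reports success with a zero-length match at index 0, which is why the group in the question always came back empty.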

OTHER TIPS

Edit (re Alan Moore's info)

Another try. The problem statement is unclear as to whether you're going for a FULL validation or just trying to extract the name, validating only the text immediately around it.

If you are attempting a 100% validating extraction, then you need to be concerned about the beginning of the line (BOL).
Otherwise, you only need to worry about the end of the line (EOL).

For 100% validation:

(?:^|[0-9]+\ +)([a-zA-Z]+\ +(?:[a-zA-Z]\ +)?[a-zA-Z]+)(?:\ +[0-9]+)?$

Expanded:

(?:  ^              # BOL
   | [0-9]+ \ +     # or, leading numbers + space
)
(                      # Capt 1
   [a-zA-Z]+               # first name
   \ +                     # space
   (?: [a-zA-Z] \ + )?     # optional middle initial + space
   [a-zA-Z]+               # last name
)                      # End Capt 1
(?: \ + [0-9]+ )?      # optional space + trailing numbers
$                   # EOL
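
The expanded form can be used directly in .NET if the regex is built with RegexOptions.IgnorePatternWhitespace, which is what makes the layout whitespace and # comments legal (and why the literal spaces are escaped as "\ "). The following is only a sketch: NameValidator and ExtractName are hypothetical names, and the sample inputs are taken from or modeled on the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NameValidator
{
    // Same pattern as the expanded form above, laid out for IgnorePatternWhitespace.
    private static readonly Regex Pattern = new Regex(@"
        (?: ^ | [0-9]+ \ + )        # BOL, or leading numbers + space
        (                           # capture 1: the name
            [a-zA-Z]+ \ +           #   first name + space
            (?: [a-zA-Z] \ + )?     #   optional middle initial + space
            [a-zA-Z]+               #   last name
        )
        (?: \ + [0-9]+ )?           # optional space + trailing numbers
        $                           # EOL",
        RegexOptions.IgnorePatternWhitespace);

    // Returns the captured name, or null when the line does not validate.
    public static string ExtractName(string line)
    {
        Match m = Pattern.Match(line);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        Console.WriteLine(ExtractName("John Smith"));                                 // John Smith
        Console.WriteLine(ExtractName("some text before 12359 John B Goode 10249")); // John B Goode
        Console.WriteLine(ExtractName("sometext 12354") ?? "(no match)");             // (no match)
    }
}
```

The last input is the case from the question that should be rejected: the only number sits at the end of the line, so there is no name after it to capture.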

Or, if you just want to extract the text, only the EOL anchor is needed and some restrictions can be loosened:

\b([a-zA-Z]+(?:\s+[a-zA-Z.]+)*)[\s\d]*$
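
A rough sketch of the extraction-only variant follows. It assumes the first word is meant to be one or more letters ([a-zA-Z]+), since a single-letter first word would not match a name such as John Smith; NameExtractor and Extract are hypothetical names used only for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NameExtractor
{
    // Trailing run of words, optionally followed by spaces and digits, anchored at EOL only.
    // Assumption: the first word is [a-zA-Z]+ (one or more letters).
    private static readonly Regex Pattern =
        new Regex(@"\b([a-zA-Z]+(?:\s+[a-zA-Z.]+)*)[\s\d]*$");

    // Returns the trailing run of words, or null if the line has none.
    public static string Extract(string line)
    {
        Match m = Pattern.Match(line);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        Console.WriteLine(Extract("some text before 12359 John B Goode 10249")); // John B Goode
        Console.WriteLine(Extract("John Smith"));                                 // John Smith
        Console.WriteLine(Extract("hello world"));                                // hello world
    }
}
```

Because nothing anchors the start of the name, a line with no numbers at all is captured in full, as the last call shows. That is the price of the loosened restrictions.
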
Licensed under: CC-BY-SA with attribution