Is there a way to extract the parts of a name from a string, using a regular expression or other logic?
I would like to split names on spaces, but when part of a name has a prefix, I would like to keep the prefix together with the word that follows it, e.g.
Osama bin Laden bin Mohammed => Osama, bin Laden, bin Mohammed
Jorge do Pinto da Silva => Jorge, do Pinto, da Silva
John Andrew Smith => John, Andrew, Smith
José Mário dos Santos Mourinho Félix => José, Mário, dos Santos, Mourinho, Félix
Working code based on Tim's suggestion:
$str = 'Manuel D\'Souza do Pinto bin Laden Al-saud el Mecca de la Vere Na Sokakah van Der Reidejin del Monte du Pont ter Johannes';
preg_match_all( '~\b(von der|van de|van den|del la|de la|van der|vande|vanden|vander|st|der|des|dela|della|bin|dos|ur|ibn|bint|da|do|le|la|del|du|de|di|el|al|van|von|ter|na|san|los)\s+[^\s]+\b|\b[^\s]+~i', $str, $mat );
print_r( $mat );
Result:
Array
(
    [0] => Array
        (
            [0] => Manuel
            [1] => D'Souza
            [2] => do Pinto
            [3] => bin Laden
            [4] => Al-saud
            [5] => el Mecca
            [6] => de la Vere
            [7] => Na Sokakah
            [8] => van Der Reidejin
            [9] => del Monte
            [10] => du Pont
            [11] => ter Johannes
        )

    [1] => Array
        (
            [0] =>
            [1] =>
            [2] => do
            [3] => bin
            [4] =>
            [5] => el
            [6] => de la
            [7] => Na
            [8] => van Der
            [9] => del
            [10] => du
            [11] => ter
        )

)
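For anyone who wants to reuse this, here is a rough sketch of the same idea wrapped in a function that builds the pattern from a list of prefixes. The function name split_name and the short prefix list are my own placeholders, not part of the code above. It sorts longer prefixes first so that "van der" is tried before "van", and it uses the u modifier so the subject is read as UTF-8. Unlike the pattern above, it does not strip trailing punctuation from a part.

```php
<?php
// Hypothetical helper: split a full name into parts, keeping a known
// prefix attached to the word that follows it.
function split_name(string $name, array $prefixes): array
{
    // Longer prefixes first, so "van der" wins over "van".
    usort($prefixes, function ($a, $b) {
        return strlen($b) - strlen($a);
    });

    // Escape each prefix; allow any run of whitespace inside multi-word ones.
    $alternatives = array_map(function ($p) {
        return str_replace(' ', '\s+', preg_quote($p, '~'));
    }, $prefixes);

    // Either "prefix + next word", or a single word on its own.
    $pattern = '~\b(?:' . implode('|', $alternatives) . ')\s+\S+|\S+~iu';

    preg_match_all($pattern, $name, $matches);

    // Fall back to an empty list if matching failed (e.g. invalid UTF-8).
    return $matches[0] ?? [];
}

// Illustrative subset of prefixes; extend it as needed.
$prefixes = ['van der', 'de la', 'bin', 'dos', 'da', 'do', 'de',
             'van', 'von', 'del', 'du', 'el', 'al', 'ter'];

echo implode(', ', split_name('Jorge do Pinto da Silva', $prefixes)), "\n";
// Jorge, do Pinto, da Silva
echo implode(', ', split_name('José Mário dos Santos Mourinho Félix', $prefixes)), "\n";
// José, Mário, dos Santos, Mourinho, Félix
```

Keeping the prefixes in an array rather than hard-coding them in the pattern makes it easier to add or remove entries without worrying about their order.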