Question

Let's say we have a random string like this:

$str_test = "faafŠ š čćž đš čšđ ćčš žž fa fssfa afž afžsa f";

and we run preg_replace on it:

preg_replace("/[^\da-z ]/i", "_", $str_test);

And the result I get is:

faaf__ __ ______ ____ ______ ______ ____ fa fssfa af__ af__sa f

So if we compare both, output and input:

faaf__ __ ______ ____ ______ ______ ____ fa fssfa af__ af__sa f
faafŠ š čćž đš čšđ ćčš žž fa fssfa afž afžsa f

we can see that every special character is being replaced with two "_" characters. The result should be:

faaf_ _ ___ __ ___ ___ __ fa fssfa af_ af_sa f
faafŠ š čćž đš čšđ ćčš žž fa fssfa afž afžsa f

I have already tried changing encodings, without success. I also considered writing a function that runs several preg_match calls and then replaces "_" with "", but that would be slow on large texts.

Any ideas?

Solution

$str=preg_replace("/[^0-9a-zA-Z ]/u", "_", $str_test);

Note the 'u' modifier! Explanation: http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php#107498

If the _subject_ contains utf-8 sequences the 'u' modifier should be set, otherwise a pattern such as /./ could match a utf-8 *sequence as two to four individual ASCII characters*.
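To make the difference concrete, here is a minimal sketch that runs the same character class with and without the 'u' modifier. It assumes the source file, and therefore the string literal, is saved as UTF-8; the variable names $bytewise and $charwise are only illustrative.

```php
<?php
// Assumption: this file is saved as UTF-8, so each accented letter below is 2 bytes long.
$str_test = "faafŠ š čćž";

// Without 'u' the pattern is applied byte by byte,
// so every 2-byte character produces two underscores.
$bytewise = preg_replace('/[^0-9a-zA-Z ]/', '_', $str_test);

// With 'u' both pattern and subject are treated as UTF-8,
// so every character produces exactly one underscore.
$charwise = preg_replace('/[^0-9a-zA-Z ]/u', '_', $str_test);

echo $bytewise, "\n"; // faaf__ __ ______
echo $charwise, "\n"; // faaf_ _ ___
```

One thing to keep in mind: with the 'u' modifier, preg_replace returns null if the subject is not valid UTF-8, so it is worth checking the return value when the input encoding is not guaranteed.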

Other tips

Why not use the built-in PHP multibyte functions?

mb_ereg_replace is the one to use here; see its entry in the PHP manual.
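A rough sketch of that approach, assuming the mbstring extension is loaded and the input is UTF-8. Note that mb_ereg_replace takes the pattern without delimiters; the variable name $result is only illustrative.

```php
<?php
// Assumptions: the mbstring extension is available and this file is saved as UTF-8.
mb_regex_encoding('UTF-8'); // make the regex engine read the subject as UTF-8

$str_test = "faafŠ š čćž";

// No /.../ delimiters here, unlike the preg_* functions.
$result = mb_ereg_replace('[^0-9a-zA-Z ]', '_', $str_test);

echo $result, "\n";
```

Both approaches should replace each multibyte character with a single underscore. preg_replace with 'u' has the advantage of not depending on mbstring being installed.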

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow