Question

I need to replace all Cyrillic characters with their Latin equivalents between the "[]" brackets. Here is an example:

Приметимо да
формула (\ref{ј5121}) обухвата
и случајеве а) и б).
Заиста, из (\ref{ј5121}), за $x_1=x_2$
добија се:
\[
|АБ|=\sqrt{(y_2-у_1)^2}=|y_2-п_1|,
\]
а из (\ref{ј5121}), за $y_1=y_2$:
\[
|AЦ|=\sqrt{(м_2-х_1)^2}=|н_2-x_1|.
\]

Стога се формула (\ref{ј5121}) може
применити на било које
двe тачке, без обзира
на њихов положај.

I've managed to isolate the content between the brackets with this code:

$pattern = "/[([^)]*)]/";
preg_match_all($pattern, $string, $output);

But I just can't get it to replace the Cyrillic characters with Latin ones. Any kind of help is welcome. Thanks!


Solution

You can use this:

$data = <<<'LOD'
Приметимо да
формула (\ref{ј5121}) обухвата
и случајеве а) и б).
Заиста, из (\ref{ј5121}), за $x_1=x_2$
добија се:
\[
|АБ|=\sqrt{(y_2-у_1)^2}=|y_2-п_1|,
\]
а из (\ref{ј5121}), за $y_1=y_2$:
\[
|AЦ|=\sqrt{(м_2-х_1)^2}=|н_2-x_1|.
\]
LOD;

$pattern = '~(?<=\[)[^]]++(?=])~u';

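// The callback receives each run of text found between "[" and "]".
$result = preg_replace_callback($pattern, function ($m) {
    // Uppercase letters only; every key must be a Cyrillic character
    // (the first key is Cyrillic "А", not the look-alike Latin "A").
    $cyrillic2latin = array(
        'А'=>'A', 'Б'=>'B', 'В'=>'V', 'Г'=>'G', 'Д'=>'D', 'Е'=>'E',
        'Ё'=>'YO', 'Ж'=>'ZH', 'З'=>'Z', 'И'=>'I', 'Й'=>'J', 'К'=>'K',
        'Л'=>'L', 'М'=>'M', 'Н'=>'N', 'О'=>'O', 'П'=>'P', 'Р'=>'R',
        'С'=>'S', 'Т'=>'T', 'У'=>'U', 'Ф'=>'F', 'Х'=>'H', 'Ц'=>'TS',
        'Ч'=>'CH', 'Ш'=>'SH', 'Щ'=>'SHCH', 'Ъ'=>'\'', 'Э'=>'E', 'Ю'=>'YU',
        'Я'=>'YA');
    // strtr() replaces every key found in the match with its value.
    return strtr($m[0], $cyrillic2latin);
}, $data);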
print_r($result);

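Feel free to correct the map and add the lowercase letters: as written, it only covers uppercase letters, so the lowercase ones in the sample stay unchanged.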
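
One possible way to fill in the map, as a sketch only: the sample text looks Serbian (it contains ј and њ), so the version below assumes the Serbian Cyrillic alphabet and its usual Latin counterparts, listed in both cases. The names $serbian2latin and $sample are made up for illustration, and the sample is a shortened piece of the text from the question.

```php
<?php
// Assumption: the input uses the Serbian Cyrillic alphabet (30 letters).
$serbian2latin = array(
    // Uppercase
    'А'=>'A', 'Б'=>'B', 'В'=>'V', 'Г'=>'G', 'Д'=>'D', 'Ђ'=>'Đ',
    'Е'=>'E', 'Ж'=>'Ž', 'З'=>'Z', 'И'=>'I', 'Ј'=>'J', 'К'=>'K',
    'Л'=>'L', 'Љ'=>'Lj', 'М'=>'M', 'Н'=>'N', 'Њ'=>'Nj', 'О'=>'O',
    'П'=>'P', 'Р'=>'R', 'С'=>'S', 'Т'=>'T', 'Ћ'=>'Ć', 'У'=>'U',
    'Ф'=>'F', 'Х'=>'H', 'Ц'=>'C', 'Ч'=>'Č', 'Џ'=>'Dž', 'Ш'=>'Š',
    // Lowercase
    'а'=>'a', 'б'=>'b', 'в'=>'v', 'г'=>'g', 'д'=>'d', 'ђ'=>'đ',
    'е'=>'e', 'ж'=>'ž', 'з'=>'z', 'и'=>'i', 'ј'=>'j', 'к'=>'k',
    'л'=>'l', 'љ'=>'lj', 'м'=>'m', 'н'=>'n', 'њ'=>'nj', 'о'=>'o',
    'п'=>'p', 'р'=>'r', 'с'=>'s', 'т'=>'t', 'ћ'=>'ć', 'у'=>'u',
    'ф'=>'f', 'х'=>'h', 'ц'=>'c', 'ч'=>'č', 'џ'=>'dž', 'ш'=>'š',
);

// Hypothetical sample: Cyrillic prose around a \[ ... \] block.
$sample = "формула \\[\n|АБ|=|y_2-п_1|\n\\] може";

// Same pattern as in the answer: everything between "[" and the next "]".
$converted = preg_replace_callback('~(?<=\[)[^]]++(?=])~u',
    function ($m) use ($serbian2latin) {
        return strtr($m[0], $serbian2latin);
    },
    $sample
);

echo $converted, PHP_EOL;
```

Only the text between the brackets is touched, so the surrounding Cyrillic prose stays as it is.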

If you use PHP >= 5.4.0 with the intl extension enabled, you can replace the callback with:

$result = preg_replace_callback($pattern, function ($m) {
    return transliterator_transliterate("Cyrillic-Latin", $m[0]);
}, $data);
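
Since transliterator_transliterate() is provided by the intl extension, it may be missing on some installations. The following is a rough sketch, not part of the original answer, that uses it when available and otherwise falls back to strtr() with a hand-written map. The function name latinizeBrackets and the tiny fallback map are assumptions made for illustration.

```php
<?php
// Hypothetical helper: transliterates only the text found between "[" and "]".
function latinizeBrackets($text)
{
    // Deliberately tiny fallback map; extend it for real input.
    $fallback = array('А'=>'A', 'Б'=>'B', 'п'=>'p', 'у'=>'u');

    return preg_replace_callback('~(?<=\[)[^]]++(?=])~u',
        function ($m) use ($fallback) {
            if (function_exists('transliterator_transliterate')) {
                // Available when the intl extension is loaded.
                $converted = transliterator_transliterate('Cyrillic-Latin', $m[0]);
                if ($converted !== false) {
                    return $converted;
                }
            }
            // No intl (or the call failed): use the manual map instead.
            return strtr($m[0], $fallback);
        },
        $text
    );
}

$out = latinizeBrackets("из \\[|АБ|=|y_2-п_1|\\] следи");
echo $out, PHP_EOL;
```

The exact Latin spelling produced by the intl branch depends on the ICU transliteration rules, so it can differ from a hand-written map for some letters.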
Licensed under: CC-BY-SA with attribution