Question

How can I match and replace English words interleaved with Persian words?

The Persian alphabet is not Latin. The problem is that English words interleaved with Persian text (which is written right to left) aren't displayed correctly unless they're wrapped in a span that sets the left-to-right direction.

Therefore, I need to replace English words with a <span dir="ltr">word</span>.

I think the following could match Latin words. It should also match some symbols (#, !, $, …). Also, please provide the replacement expression.

^[a-zA-Z]+( [a-zA-Z]+)*$

To give an example, this text:

من قصد دارم این English# را عوض کنم به

Should be replaced with:

من قصد دارم این <span dir="ltr">English#</span> را عوض کنم به

Solution

This solves the problem:

$pattern = "/([a-zA-Z]+[a-zA-Z?><;,{}[\]\-_+=!@#$%\^*|']*)/";
$replacement = '<span dir="ltr">${1}</span>';
$subject = preg_replace($pattern, $replacement, $subject);

It matches a run of English letters followed by any of the listed extra characters. Do not add & to the extra characters, because HTML character references (which may be how the Unicode characters in the text are encoded) begin with &.
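To check this against the example from the question, here is a minimal, self-contained sketch. The pattern and replacement are the ones above; the function name wrap_ltr is a hypothetical helper introduced only for illustration, and the input is assumed to be plain text with no existing markup.

```php
<?php
// Hypothetical helper (not part of the original answer): wraps each run of
// English letters, plus any trailing listed symbols, in <span dir="ltr">.
function wrap_ltr(string $subject): string
{
    // One or more ASCII letters, then any mix of letters and the listed symbols.
    // The leading [a-zA-Z]+ means a match must start with a letter.
    $pattern = "/([a-zA-Z]+[a-zA-Z?><;,{}[\]\-_+=!@#$%\^*|']*)/";
    $replacement = '<span dir="ltr">${1}</span>';
    return preg_replace($pattern, $replacement, $subject);
}

// The sample sentence from the question.
$input = 'من قصد دارم این English# را عوض کنم به';
echo wrap_ltr($input), PHP_EOL;
// prints: من قصد دارم این <span dir="ltr">English#</span> را عوض کنم به
```

Two things to keep in mind with this pattern. A space ends a match, so "abc def" gets two separate spans, and a symbol that comes before the first letter (as in "#tag") stays outside the span. Also, since ; and > are in the symbol set, text that already contains named entities such as &amp; or HTML tags would be altered, so it is probably safest to run this on plain text only.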

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow