
How to replace English words interleaved with non-English (UTF-8) words

How can I match and replace English words interleaved with Persian words?

The Persian alphabet is not Latin. The problem is that English words interleaved with Persian words (which are written right to left) aren't displayed correctly unless they're wrapped in a span that sets the left-to-right direction.

Therefore, I need to replace English words with a <span dir="ltr">word</span>.

I think the following could match Latin words. It should also match some symbols (#, !, $, …). Please also provide the expression for the replacement.

^[a-zA-Z]+( [a-zA-Z]+)*$

To give an example, this text:

من قصد دارم این English# را عوض کنم به

Should be replaced with:

من قصد دارم این <span dir="ltr">English#</span> را عوض کنم به
Ahmad asked Jan 21 '14 06:01


1 Answer

This solves the problem:

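// One or more Latin letters, followed by any mix of letters and the listed symbols.
$pattern = "/([a-zA-Z]+[a-zA-Z?><;,{}[\]\-_+=!@#$%\^*|']*)/";
// ${1} is the matched word; wrap it in a left-to-right span.
$replacement = '<span dir="ltr">${1}</span>';
// $subject holds the mixed Persian/English text.
$subject = preg_replace($pattern, $replacement, $subject);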

It matches a run of English letters followed by any of the listed extra characters. Note that you should not include & in the extra characters, since the HTML encoding of Unicode characters begins with &.
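As a quick check, here is a minimal sketch that applies the same pattern to the sample sentence from the question. The function name wrap_ltr is hypothetical and only used for illustration; the pattern and replacement are the ones shown above. It assumes the input is plain UTF-8 text that does not already contain HTML tags.

```php
<?php
// Hypothetical helper (not part of the original answer): wraps each run of
// Latin letters, plus any trailing listed symbols, in a left-to-right span.
function wrap_ltr(string $text): string
{
    // Same pattern as in the answer. It only matches ASCII characters, so the
    // multi-byte UTF-8 Persian letters are left untouched.
    $pattern = "/([a-zA-Z]+[a-zA-Z?><;,{}[\]\-_+=!@#$%\^*|']*)/";
    $replacement = '<span dir="ltr">${1}</span>';

    return preg_replace($pattern, $replacement, $text);
}

// Sample sentence from the question.
$subject = 'من قصد دارم این English# را عوض کنم به';
echo wrap_ltr($subject), PHP_EOL;
```

Text without Latin letters is returned unchanged, because the pattern requires at least one letter from a-z or A-Z before any symbol is matched.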

Ahmad answered Nov 17 '22 11:11