How to match an Arabic word with "tashkel"?

Question

I'm using the following function to highlight a given word, and it works fine in English:

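function highlight(str, toBeHighlightedWord) {
    // Escape regex metacharacters in the word, then wrap it in \b word boundaries.
    // Backslashes are doubled because the pattern is built from a string literal.
    toBeHighlightedWord = "(\\b" + toBeHighlightedWord.replace(/([{}()[\]\.?*+^$|=!:~-])/g, "\\$1") + "\\b)";
    var r = new RegExp(toBeHighlightedWord, "igm");
    // Only touch text between a '>' and the next '<', never the tags themselves.
    str = str.replace(/(>[^<]+<)/igm, function (a) {
        return a.replace(r, "<span color='red' class='hl'>$1</span>");
    });
    return str;
}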

but it does not work for Arabic text.

So how can I modify the regex to match Arabic words too, including Arabic words with tashkel? Tashkel are characters added between the original letters. For example, "محمد" is written without tashkel and "مُحَمَّدُ" is written with tashkel. The tashkel decorate the word, but these little marks are characters in their own right.

Casimir et Hippolyte · Accepted Answer

In JavaScript, the word boundary \b only works with these characters: [a-zA-Z0-9_]. A lookbehind assertion is no help here either, since JavaScript did not support that feature when this answer was written (it was added later, in ES2018).
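
The limitation can be checked directly. This is a minimal sketch; the sample strings and variable names are made up for illustration, and the Arabic word is written with \u escapes only to keep the source unambiguous.

```javascript
// "محمد" spelled with escapes: meem, hah, meem, dal.
const arabicWord = "\u0645\u062D\u0645\u062F";

// \b sits between a \w character and a non-\w character.
// ASCII letters are \w, so the boundary is found around "word".
const asciiBoundary = /\bword\b/.test("a word here"); // true

// Arabic letters are not \w, so there is no \b between a space and an
// Arabic letter, and the same kind of pattern never matches.
const arabicBoundary = new RegExp("\\b" + arabicWord + "\\b")
  .test("x " + arabicWord + " y"); // false
```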

The way to solve the problem is to "emulate" a kind of word boundary. For the left boundary, use a negated character class that lists the characters a word can contain (since it is negated, it matches only characters that can't be part of the word) and put it in a capturing group. For the right boundary, a negative lookahead is much simpler.

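// Backslashes are doubled because the pattern is built from a string literal.
toBeHighlightedWord = "([^\\w\\u0600-\\u06FF\\uFB50-\\uFDFF\\uFE70-\\uFEFF]|^)("
              + toBeHighlightedWord.replace(/([{}()[\]\.?*+^$|=!:~-])/g, "\\$1")
              + ")(?![\\w\\u0600-\\u06FF\\uFB50-\\uFDFF\\uFE70-\\uFEFF])";
var r = new RegExp(toBeHighlightedWord, "ig");
str = str.replace(/(>[^<]+<)/g, function(a){
    // $1 is the left boundary character (or empty at the start), $2 is the word.
    return a.replace(r, "$1<span color='red' class='hl'>$2</span>");
});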

The character ranges used here come from three blocks of the Unicode table:

0600-06FF (Arabic)
FB50-FDFF (Arabic Presentation Forms-A)
FE70-FEFF (Arabic Presentation Forms-B)
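
The first of these blocks also contains the tashkel marks themselves, so they count as word characters for the boundary test. Matching a bare word against a vocalized one is a separate step that the answer does not cover. The following is one possible sketch, assuming that tashkel means the marks U+064B to U+0652; the names TASHKEEL and tashkeelTolerantSource are hypothetical helpers, not part of the answer.

```javascript
// Assumption: tashkel are the combining marks U+064B..U+0652
// (fathatan through sukun). Other marks would need to be added here.
const TASHKEEL = "[\\u064B-\\u0652]";

// Hypothetical helper: remove tashkel from the search word, then allow
// any number of tashkel marks after each remaining letter.
function tashkeelTolerantSource(word) {
  const bare = word.replace(new RegExp(TASHKEEL, "g"), "");
  return Array.from(bare)
    .map(function (ch) {
      // Escape regex metacharacters in case the word contains any.
      return ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&") + TASHKEEL + "*";
    })
    .join("");
}

// Anchored pattern built from the bare word "محمد".
const tolerant = new RegExp(
  "^" + tashkeelTolerantSource("\u0645\u062D\u0645\u062F") + "$"
);
```

The returned source string has no boundaries of its own, so it can be dropped into the second capturing group of the pattern above in place of the escaped word.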

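Note that the extra capturing group changes the replacement pattern: the word is now in $2, and $1 (the character matched as the left boundary) has to be written back in front of the span.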
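
Put together as a complete function, the approach might look like the sketch below. The names WORD_CHARS, escapeRegExp and highlightWord are illustrative rather than taken from the answer, and the color attribute is dropped in favor of the class alone.

```javascript
// Characters treated as part of a word: ASCII \w plus the three Arabic blocks.
const WORD_CHARS = "\\w\\u0600-\\u06FF\\uFB50-\\uFDFF\\uFE70-\\uFEFF";

// Escape regex metacharacters so the word is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function highlightWord(html, word) {
  // Group 1: a non-word character (or the start) acting as the left boundary.
  // Group 2: the word itself.
  // Lookahead: the next character must not be a word character.
  const pattern =
    "([^" + WORD_CHARS + "]|^)(" + escapeRegExp(word) + ")(?![" + WORD_CHARS + "])";
  const r = new RegExp(pattern, "ig");
  // Only rewrite text between '>' and '<', never the tags themselves.
  return html.replace(/(>[^<]+<)/g, function (textNode) {
    return textNode.replace(r, "$1<span class='hl'>$2</span>");
  });
}
```

Like the original function, this only sees text that sits between a '>' and a '<', so the input has to be wrapped in at least one element.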

Tags: javascript, regex, arabic

Hager Aly
