
PHP-REGEX: making accented letters match non-accented ones, and vice versa. How to achieve this?

I want to write typical search-highlighting code. I have something like this:

$valor = preg_replace("/(".$_REQUEST['txt_search'].")/iu", "<span style='background-color:yellow; font-weight:bold;'>\\1</span>", $valor);

Now, the requested word could be something like "josé". When it is, I want "jose", "JOSÉ", "José", etc. highlighted too.

With this expression, if I search for "josé", it matches "josé" and "JOSÉ" (and all the case variants), but only the accented ones. If I search for "jose", it matches "JOSE", "jose" and "Jose", but not the accented ones. So I have part of what I want: matching is case-insensitive, but accented and non-accented letters are still treated separately.

I need the two combined, which means accent-insensitive (Unicode) matching, so that I can search for "jose" and highlight "josé", "josÉ", "José", "JOSE", "JOSÉ", "JoSé", ...

I don't want to strip the accents from the text, because when I print it on screen I need to show the real word as it comes.

Any ideas?

Thanks!

Lightworker asked Dec 21 '22 21:12


2 Answers

You can write a function that builds the regular expression from your txt_search, replacing each letter that has accented variants with a character class listing all of those variants, like this:

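function search_term($txt_search) {
    // Escape regex metacharacters, including the "/" delimiter of the final pattern.
    $search = preg_quote($txt_search, '/');

    // Replace each vowel, plain or accented, with a class matching every variant.
    // Upper and lower case are handled by the /i flag on the final pattern.
    $search = preg_replace('/[aàáâãåäæ]/iu', '[aàáâãåäæ]', $search);
    $search = preg_replace('/[eèéêë]/iu', '[eèéêë]', $search);
    $search = preg_replace('/[iìíîï]/iu', '[iìíîï]', $search);
    $search = preg_replace('/[oòóôõöø]/iu', '[oòóôõöø]', $search);
    $search = preg_replace('/[uùúûü]/iu', '[uùúûü]', $search);
    // add any other character

    return $search;
}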

Then use the returned string as the pattern in your preg_replace call, in place of $_REQUEST['txt_search'], keeping the delimiters and the /iu flags.
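
To illustrate how the pieces fit together, here is a minimal, self-contained sketch. The helper name highlight_term and the $groups list are assumptions made for this example, not part of the answer above; the span markup is the one from the question. It assumes the source file and the input are UTF-8.

```php
<?php
// Hypothetical helper: highlights $term in $text, ignoring case and accents.
function highlight_term(string $text, string $term): string
{
    // An empty term would produce a pattern that matches everywhere.
    if ($term === '') {
        return $text;
    }

    // Illustrative accent groups; extend the list for other letters.
    $groups = ['aàáâãåäæ', 'eèéêë', 'iìíîï', 'oòóôõöø', 'uùúûü'];

    // Escape regex metacharacters, including the "/" delimiter.
    $pattern = preg_quote($term, '/');

    foreach ($groups as $group) {
        // Any member of the group becomes a class matching all members.
        $pattern = preg_replace('/[' . $group . ']/iu', '[' . $group . ']', $pattern);
    }

    // $0 is the text exactly as it appears in $text, so accents are preserved.
    $wrap = "<span style='background-color:yellow; font-weight:bold;'>\$0</span>";

    return preg_replace('/' . $pattern . '/iu', $wrap, $text);
}

echo highlight_term('José, JOSE and josé', 'jose');
```

Because the replacement uses $0 rather than the search term, the highlighted text keeps whatever accents and case it had originally. If $text comes from user input, it should probably be passed through htmlspecialchars before highlighting.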

Fernando answered Dec 28 '22 08:12


You might have to parse the search string and modify the pattern so that it includes cases like [eéÉ]. Replace all instances of e/E/é/É with a catch-all [eEéÉ], and do the same for all the other letters. In your example, the search pattern would then be jos[éÉeE] instead of Jose/José/JOSÉ.
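
One possible way to do that parsing is a single pass over the characters of the search term, sketched below. The function name accent_insensitive_pattern and the $groups list are hypothetical. The sketch assumes UTF-8 input and the mbstring extension (mb_str_split needs PHP 7.4 or later). It lists lowercase letters only and leaves uppercase to the /i flag instead of spelling out É and E in the class.

```php
<?php
// Hypothetical helper: builds an accent-insensitive pattern in one pass.
function accent_insensitive_pattern(string $term): string
{
    // Illustrative lowercase groups; extend as needed.
    $groups = ['aàáâãåä', 'eèéêë', 'iìíîï', 'oòóôõö', 'uùúûü'];

    // Map every member of a group to the class for the whole group.
    $classOf = [];
    foreach ($groups as $group) {
        foreach (mb_str_split($group, 1, 'UTF-8') as $char) {
            $classOf[$char] = '[' . $group . ']';
        }
    }

    $pattern = '';
    foreach (mb_str_split($term, 1, 'UTF-8') as $char) {
        $key = mb_strtolower($char, 'UTF-8');
        // Known letters become a class; everything else is escaped literally.
        $pattern .= $classOf[$key] ?? preg_quote($char, '/');
    }

    return '/' . $pattern . '/iu';
}

$pattern = accent_insensitive_pattern('José');
var_dump(preg_match($pattern, 'JOSE')); // int(1)
```

Building the pattern character by character avoids running one regex replacement per letter group, and escaping happens per character, so the inserted brackets are never escaped by mistake.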

dda answered Dec 28 '22 07:12