
Zend Search Lucene and Accented Characters

I'm trying to find a way in Zend_Search_Lucene to pull off the following scenario:

Let's say we have a user whose name is Aïcha (note the ï, an i with a diaeresis). If I search the index for Aicha (with a plain i), I'd like Aïcha to be returned in the results.

Is there something special I need to do when indexing or searching in order to make this work? I've read solutions about normalizing the data before indexing, replacing all special characters with normalized characters, but I'd rather not go that route.

Thanks in advance, Gary

Gary M asked Jul 27 '10 19:07



1 Answer


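function normalize($string) {
    // Accented characters and, position by position, their ASCII replacements.
    $from = 'ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞ' . 'ßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿŔŕ';
    $to   = 'aaaaaaaceeeeiiiidnoooooouuuuyb' . 'saaaaaaaceeeeiiiidnoooooouuuuybyrr';
    // Split $from on UTF-8 character boundaries so multibyte characters stay intact.
    $chars = preg_split('//u', $from, -1, PREG_SPLIT_NO_EMPTY);
    // The array form of strtr() replaces whole characters, not single bytes.
    $map = array_combine($chars, str_split($to));
    return strtolower(strtr($string, $map));
}
$passToIndexer = normalize(" Aïcha "); // " aicha "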

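Try using this function's output when you build the index: index the normalized value, and store the actual value without indexing it, so the original spelling is still available for display. Run the search query through the same function before searching. Hope it helps. Frankly, I don't think there is any other way.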
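The advice above could be wired into Zend_Search_Lucene roughly as follows. This is only a sketch, assuming Zend Framework 1 is on the include path and that `$index` was opened with `Zend_Search_Lucene::create()` or `Zend_Search_Lucene::open()`. The helper names (`foldForSearch`, `indexUser`, `findUsers`) and the field names (`name`, `name_search`) are hypothetical, and `foldForSearch` is a deliberately tiny stand-in for a full normalization function.

```php
<?php
// Hypothetical stand-in for a full normalize() function: folds a handful of
// accented letters to ASCII and lowercases the result.
function foldForSearch($text) {
    $map = array('ï' => 'i', 'Ï' => 'i', 'é' => 'e', 'É' => 'e');
    return strtolower(strtr($text, $map));
}

// Adds one user to the index. Assumes Zend Framework 1 is loaded.
function indexUser($index, $name) {
    $doc = new Zend_Search_Lucene_Document();
    // Indexed but not stored: the folded form, e.g. "aicha".
    $doc->addField(Zend_Search_Lucene_Field::unStored(
        'name_search', foldForSearch($name), 'UTF-8'));
    // Stored but not indexed: the original spelling, kept for display.
    $doc->addField(Zend_Search_Lucene_Field::unIndexed('name', $name, 'UTF-8'));
    $index->addDocument($doc);
}

// Folds the query the same way, so "Aicha" and "Aïcha" look up the same term.
// A single term query only matches one word; multi-word names need more work.
function findUsers($index, $userInput) {
    $term = new Zend_Search_Lucene_Index_Term(
        foldForSearch($userInput), 'name_search');
    return $index->find(new Zend_Search_Lucene_Search_Query_Term($term));
}
```

Each hit returned by `findUsers()` should expose the stored original through `$hit->name`, so the results can still show Aïcha even though the match was made against the folded term.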

Abdullah Khan answered Sep 30 '22 11:09
