
Ignore accented characters while sorting a multidimensional array in PHP [duplicate]

I have a multidimensional array, shown below, that I want to sort by the [name] field. Accented letters should sort as though they were unaccented.

Array
(
    [chicago] => Array
        (
            [community_name] => Chicago, IL
            [areas] => Array
                (
                    [0] => Array
                        (
                            [name] => Array
                                (
                                    [0] => HELLO WORLD.
                                )
                        )

                    [1] => Array
                        (
                            [name] => Array
                                (
                                    [0] => Hello
                                )

                        )

                    [2] => Array
                        (
                            [name] => Array
                                (
                                    [0] => Administration.
                                )
                        )
                )

        )

    [chicago-and-surrounding-areas] => Array
        (
            [community_name] => Chicago (and surrounding areas), IL
            [areas] => Array
                (
                    [0] => Array
                        (
                            [name] => Array
                                (
                                    [0] => Covit Corp. 
                                )
                        )
                    [1] => Array
                        (
                            [name] => Array
                                (
                                    [0] => Câble-Axion Digital Corp. 
                                )
                        )   
                )

        )

    [cambridge-chicago] => Array
        (
            [community_name] => Cambridge (Chicago), IL
            [areas] => Array
                (
                    [0] => Array
                        (
                            [name] => Array
                                (
                                    [0] => Avocados.
                                )
                        )
                    [1] => Array
                        (
                            [name] => Array
                                (
                                    [0] => Aṕple.
                                )
                        )   
                )

        )

)

This is what I want to achieve:

Array
(
    [chicago] => Array
        (
            [community_name] => Chicago, IL
            [areas] => Array
                (
                    [0] => Array
                        (
                            [name] => Array
                                (
                                    [0] => Administration.
                                )
                        )

                    [1] => Array
                        (
                            [name] => Array
                                (
                                    [0] => HELLO WORLD. 
                                )

                        )

                    [2] => Array
                        (
                            [name] => Array
                                (
                                    [0] => Hello
                                )
                        )
                )

        )

    [chicago-and-surrounding-areas] => Array
        (
            [community_name] => Chicago (and surrounding areas), IL
            [areas] => Array
                (
                    [0] => Array
                        (
                            [name] => Array
                                (
                                    [0] => Câble-Axion Digital Corp.
                                )
                        )
                    [1] => Array
                        (
                            [name] => Array
                                (
                                    [0] => Covit Corp. 
                                )
                        )   
                )

        )

    [cambridge-chicago] => Array
        (
            [community_name] => Cambridge (Chicago), IL
            [areas] => Array
                (
                    [0] => Array
                        (
                            [name] => Array
                                (
                                    [0] => Aṕple.
                                )
                        )
                    [1] => Array
                        (
                            [name] => Array
                                (
                                    [0] => Avocados.
                                )
                        )   
                )

        )

)

This is what I have tried, but I am not sure it will work in all cases: in some cases, even after sorting, accented letters still rank lower than their non-accented counterparts.

What changes should I make to the code below so that accented letters sort as though they were unaccented?

foreach ($array as &$locality) {
    usort($locality['areas'], function ($a, $b) {
        // return $a['name'][0] <=> $b['name'][0];
        return iconv('UTF-8', 'ISO-8859-8//TRANSLIT', $a['name'][0]) <=> iconv('UTF-8', 'ISO-8859-8//TRANSLIT', $b['name'][0]);
    });
}
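
For context, a minimal sketch of why the commented-out plain comparison misplaces accented names: PHP's <=> compares non-numeric strings byte by byte, and in UTF-8 an accented letter such as â is encoded as bytes above 0x7F, which rank after every ASCII letter. The sample strings come from the array above; the variable names are only illustrative.

```php
<?php
// "\u{00E2}" is â (U+00E2), written as an escape so the bytes are unambiguous.
$accented = "C\u{00E2}ble-Axion Digital Corp.";
$plain    = "Covit Corp.";

// In UTF-8, â is the two bytes C3 A2; the ASCII letter o is the single byte 6F.
echo bin2hex("\u{00E2}"), PHP_EOL; // c3a2
echo bin2hex("o"), PHP_EOL;        // 6f

// Byte-wise comparison: after the shared "C", 0xC3 > 0x6F,
// so the accented name is ordered after "Covit Corp.".
$result = $accented <=> $plain;
var_dump($result); // int(1)
```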
user1950349 asked Dec 06 '25 14:12


2 Answers

Use intl's Collator:

$arr = [
  ['key' => 'Avocado'],
  ['key' => 'Aṕple'],
];

$c = new Collator('root');
usort(
    $arr,
    function($a, $b) use($c){
        return $c->compare($a['key'], $b['key']);
    }
);
var_dump($arr);

Output:

array(2) {
  [0]=>
  array(1) {
    ["key"]=>
    string(7) "Aṕple"
  }
  [1]=>
  array(1) {
    ["key"]=>
    string(7) "Avocado"
  }
}

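Here 'root' selects the default collation rules, which order strings by their base letters first and use accents only to break ties, so accented letters sort alongside their unaccented counterparts as desired. You can specify an actual locale instead for language-specific sort orders.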
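
A sketch of how this could be applied to the nested structure from the question, assuming the intl extension is loaded. The sample data is a shortened copy of the question's array, and $collator is just an illustrative name. Setting the strength to Collator::PRIMARY is an optional addition, not part of the answer above: it makes the comparison look at base letters only, so accents and letter case are ignored entirely rather than used as tie-breakers.

```php
<?php
// Shortened copy of the question's data (same shape, fewer keys).
$array = [
    'chicago' => [
        'community_name' => 'Chicago, IL',
        'areas' => [
            ['name' => ['HELLO WORLD.']],
            ['name' => ['Hello']],
            ['name' => ['Administration.']],
        ],
    ],
    'chicago-and-surrounding-areas' => [
        'community_name' => 'Chicago (and surrounding areas), IL',
        'areas' => [
            ['name' => ['Covit Corp.']],
            ['name' => ["C\u{00E2}ble-Axion Digital Corp."]], // Câble-Axion
        ],
    ],
    'cambridge-chicago' => [
        'community_name' => 'Cambridge (Chicago), IL',
        'areas' => [
            ['name' => ['Avocados.']],
            ['name' => ["A\u{1E55}ple."]], // Aṕple.
        ],
    ],
];

$collator = new Collator('root');
// Optional (an assumption, not in the answer): compare base letters only.
$collator->setStrength(Collator::PRIMARY);

foreach ($array as &$locality) {
    usort($locality['areas'], function ($a, $b) use ($collator) {
        return $collator->compare($a['name'][0], $b['name'][0]);
    });
}
unset($locality); // break the reference left over from the foreach

// Print the sorted names per locality.
foreach ($array as $key => $locality) {
    foreach ($locality['areas'] as $area) {
        echo $key, ': ', $area['name'][0], PHP_EOL;
    }
}
```

Note that with a collator "Hello" sorts before "HELLO WORLD." because a string that is a prefix of another sorts first. That differs from the order shown in the question, which places the upper-case name first.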

Sammitch answered Dec 08 '25 03:12


You can use Normalizer to decompose each character into its base character plus separate combining diacritics (form D), then remove the diacritics so that only the base characters remain.

function stripDiacritics(string $string): string {
    return preg_replace(
        '/[\x{0300}-\x{036f}]/u',
        '',
        Normalizer::normalize($string , Normalizer::FORM_D)
    );
}

foreach ($array as &$locality) {
    usort($locality['areas'], function ($a, $b) {
        return stripDiacritics($a['name'][0]) <=> stripDiacritics($b['name'][0]);
    });
}
unset($locality); // break the reference left over from the foreach

Working example.

Strip from here.

Next time use var_export, so we can use your array to test the code :)

List of diacritics (source of \x{0300}-\x{036f}).
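
To illustrate what the two steps do, here is a small sketch (assuming the intl extension, which provides Normalizer). Form D decomposes a precomposed letter such as ṕ (U+1E55) into p followed by the combining acute accent U+0301, and the regular expression then deletes the combining mark. The function is repeated from above so the snippet runs on its own; the sample strings are taken from the question.

```php
<?php
function stripDiacritics(string $string): string {
    return preg_replace(
        '/[\x{0300}-\x{036f}]/u',
        '',
        Normalizer::normalize($string, Normalizer::FORM_D)
    );
}

$name = "A\u{1E55}ple."; // "Aṕple." with a precomposed ṕ

// Step 1: decomposition turns ṕ into "p" + U+0301 (bytes 70 cc 81).
$decomposed = Normalizer::normalize("\u{1E55}", Normalizer::FORM_D);
echo bin2hex($decomposed), PHP_EOL; // 70cc81

// Step 2: removing the combining mark leaves the base letters.
echo stripDiacritics($name), PHP_EOL; // Apple.

// The stripped keys then compare as plain ASCII.
var_dump(stripDiacritics($name) <=> stripDiacritics("Avocados.")); // int(-1)
```

Because <=> is still a case-sensitive byte comparison, "HELLO WORLD." sorts before "Hello" with this approach, which matches the order the question asks for.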

SirPilan answered Dec 08 '25 03:12


