Case Insensitive Unique Array Elements in Perl

Question

I am using the uniq function exported by the List::MoreUtils module to find the unique elements in an array. However, I want it to find the unique elements in a case-insensitive way. How can I do that?

I have dumped the resulting array using Data::Dumper:

#! /usr/bin/perl

use strict;
use warnings;
use Data::Dumper qw(Dumper);
use List::MoreUtils qw(uniq);
use feature "say";

my @elements=<array is formed here>;

my @words=uniq @elements;

say Dumper \@words;

Output:

$VAR1 = [
          'John',
          'john',
          'JohN',
          'JOHN',
          'JoHn',
          'john john'
        ];

Expected output: john, john john

Only two elements should remain. The rest should be filtered out because they are the same word and differ only in case.

How can I remove the duplicate elements ignoring the case?

TLP · Accepted Answer

Lowercase the elements with lc in a map statement before passing them to uniq:

my @uniq_no_case = uniq map lc, @elements;
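As a quick check, here is a minimal runnable sketch of that one-liner applied to the sample data from the question. It imports uniq from the core List::Util module (version 1.45 or later) rather than List::MoreUtils so that it runs without installing anything; that substitution is an assumption made here for portability, and the call itself is unchanged.

```perl
#!/usr/bin/perl
use strict;
use warnings;
# Assumption: core List::Util (1.45+) is used here in place of List::MoreUtils.
use List::Util 1.45 qw(uniq);

# Sample data taken from the Dumper output in the question.
my @elements = ('John', 'john', 'JohN', 'JOHN', 'JoHn', 'john john');

# Lowercase every element first, then drop the duplicates.
my @uniq_no_case = uniq map lc, @elements;

print join(', ', @uniq_no_case), "\n";   # prints: john, john john
```

Note that the result contains the lowercased strings, not the original spellings, which is what the question asks for.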

The reason List::MoreUtils' uniq is case sensitive is that it relies on the deduplicating behavior of hash keys, which are also case sensitive. The code for uniq looks like this:

sub uniq {
    my %seen = ();
    grep { not $seen{$_}++ } @_;
}
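Hash keys are compared as exact strings, which is why differently cased spellings survive the dedup. A small sketch with a plain hash and the sample data from the question illustrates this (the variable names are chosen here for illustration only):

```perl
use strict;
use warnings;

# Sample data taken from the question.
my @elements = ('John', 'john', 'JohN', 'JOHN', 'JoHn', 'john john');

# Count occurrences using the original spelling as the hash key...
my %seen_exact;
$seen_exact{$_}++ for @elements;

# ...and using the lowercased spelling as the hash key.
my %seen_lc;
$seen_lc{lc $_}++ for @elements;

print scalar(keys %seen_exact), "\n";   # prints: 6
print scalar(keys %seen_lc), "\n";      # prints: 2
print $seen_lc{john}, "\n";             # prints: 5
```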

If you want to use this logic directly in your own code instead of importing uniq, you can incorporate lc into the sub:

sub uniq_no_case {
    my %seen = ();
    grep { not $seen{$_}++ } map lc, @_;
}

Explanation of how this works:

@_ contains the arguments to the subroutine, and they are fed to a grep statement. grep returns every element for which the code block evaluates to true. The code block has a few finer points:

$seen{$_}++ returns 0 the first time an element is seen. The value is still incremented to 1, but only after the old value is returned (as opposed to ++$seen{$_}, which would increment first and then return the new value).
By negating the result of the post-increment, we get true for the first occurrence of a key and false for every later occurrence. Hence, the list is deduped.
grep, as the last statement in the sub, returns a list, which in turn is returned by the sub.
map lc, @_ simply applies the lc function to every element in @_, so the returned strings are all lowercase.
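Because the elements are lowercased before grep sees them, the original spellings are lost. That matches the expected output in the question. If the first spelling encountered should be kept instead, the same %seen idiom can be keyed on lc $_ while grep returns the untouched element. The following is only a sketch of that variation; the name uniq_keep_case is hypothetical and is not part of List::MoreUtils:

```perl
use strict;
use warnings;

# Hypothetical helper: dedupe case-insensitively, keeping the first spelling seen.
sub uniq_keep_case {
    my %seen;
    # The hash key is lowercased, but grep returns the original element.
    return grep { !$seen{ lc $_ }++ } @_;
}

# Sample data taken from the question.
my @elements = ('John', 'john', 'JohN', 'JOHN', 'JoHn', 'john john');
my @words    = uniq_keep_case(@elements);

print join(', ', @words), "\n";   # prints: John, john john
```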

Tags: perl, uniq

Neon Flash
