A Perl idiom for removing duplicate values from an array:
@uniq = keys %{{map{$_=>1}@list}}
Is it cheaper, in terms of memory, to use this version instead:
@uniq = keys %{{map{$_=>undef}@list}}
I tested it with these one-liners, and it seems to be true on some versions of Perl:
perl -e 'my %x; $x{$_} = 1 for 0..1000_000; system "ps -ovsz $$"'
perl -e 'my %x; $x{$_} = undef for 0..1000_000; system "ps -ovsz $$"'
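For reference, here is a small sketch showing that the two idioms produce the same set of unique values, since only the keys are ever read back. The sample @list and the variable names are made up for illustration. It also shows one thing to watch for if the intermediate hash is kept around: with undef values, a membership check needs exists rather than a truth test.

```perl
use strict;
use warnings;

# Hypothetical sample data, not from the question.
my @list = qw(apple pear apple fig pear apple);

# Original idiom: every key maps to 1.
my @uniq_one = keys %{ { map { $_ => 1 } @list } };

# Variant from the question: every key maps to undef.
my @uniq_undef = keys %{ { map { $_ => undef } @list } };

# keys returns elements in no particular order, so sort before comparing.
my @sorted_one   = sort @uniq_one;
my @sorted_undef = sort @uniq_undef;

print "with 1:     @sorted_one\n";
print "with undef: @sorted_undef\n";

# If the hash itself is kept, undef values change how lookups must be written.
my %seen = map { $_ => undef } @list;
my $present_by_exists = exists $seen{fig} ? 1 : 0;    # 1: the key is stored
my $present_by_truth  = $seen{fig}        ? 1 : 0;    # 0: the stored value is undef

print "exists: $present_by_exists, truth test: $present_by_truth\n";
```

Neither idiom preserves the original order of @list, because the result comes from keys.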
Well, undef is supposed to be a flyweight value, meaning that all references to it point to the same datum. You don't get that for other literals. You still need the overhead of the slot that references it, though. However, I'm not seeing it save any memory for me on Perl 5.10 or 5.11 on Mac OS X. While perl may not be using more memory in the undef case, I bet it's anticipating using more memory, so it grabs it anyway. However, I'm not keen on investigating memory use in the internals right now.
Devel::Peek is pretty handy for showing these sorts of things:
#!perl
use Devel::Peek;
my $a = undef;
my $b = undef;
Dump( $a );
Dump( $b );
my %hash = map { $_, undef } 1 .. 3;
$hash{4} = 'Hello';
Dump( \%hash );
The output looks a bit scary at first, but you can see that the undef values are NULL(0x0) instead of individual string values (PV):
SV = NULL(0x0) at 0x100208708
  REFCNT = 1
  FLAGS = (PADMY)
SV = NULL(0x0) at 0x100208738
  REFCNT = 1
  FLAGS = (PADMY)
SV = RV(0x100805018) at 0x100805008
  REFCNT = 1
  FLAGS = (TEMP,ROK)
  RV = 0x100208780
  SV = PVHV(0x100809ed8) at 0x100208780
    REFCNT = 2
    FLAGS = (PADMY,SHAREKEYS)
    ARRAY = 0x100202200 (0:5, 1:2, 2:1)
    hash quality = 91.7%
    KEYS = 4
    FILL = 3
    MAX = 7
    RITER = -1
    EITER = 0x0
    Elt "4" HASH = 0xb803eff9
    SV = PV(0x100801c78) at 0x100804ed0
      REFCNT = 1
      FLAGS = (POK,pPOK)
      PV = 0x100202a30 "Hello"\0
      CUR = 5
      LEN = 8
    Elt "1" HASH = 0x806b80c9
    SV = NULL(0x0) at 0x100820db0
      REFCNT = 1
      FLAGS = ()
    Elt "3" HASH = 0xa400c7f3
    SV = NULL(0x0) at 0x100820df8
      REFCNT = 1
      FLAGS = ()
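If you'd rather check this from code than read a dump, the core B module can report the type of the scalar stored in each slot. This is only a sketch under the assumption that the class names B hands back (B::NULL for a bodiless scalar, B::PV for a string) line up with the NULL and PV labels in the dump above. The hash contents and variable names are mine, chosen to mirror the example.

```perl
use strict;
use warnings;
use B ();

my %hash;
$hash{1} = undef;      # value with no body allocated
$hash{4} = 'Hello';    # value that carries a string buffer

# svref_2object wraps the underlying SV in a B:: object;
# the class it is blessed into reflects the SV's internal type.
my $undef_class  = ref( B::svref_2object( \$hash{1} ) );
my $string_class = ref( B::svref_2object( \$hash{4} ) );

print "undef value:  $undef_class\n";
print "string value: $string_class\n";
```

Note that this only tells you the type of each value, not how much memory the hash as a whole has reserved, which is the part the ps one-liners measure.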