Is it better to check Perl hash keys for truth or for existence?

Tags: hash, perl

Is it preferable, when assigning to a hash used only for its keys (where the values aren't really needed), to say:

$hash{$new_key} = "";

Or to say:

$hash{$new_key} = 1;

The first requires that you check for a key with exists; the second allows you to say either:

if (exists $hash{$some_key})

or

if ($hash{$some_key})

I would think that assigning a 1 would be better, but are there any problems with this? Does it even matter?

Joe asked Sep 02 '09 23:09


4 Answers

It depends on whether you need the key to exist or to have a true value. Test for the thing you need. If you are using a hash merely to see if something is in a list, exists() is the way to go. If you are doing something else, checking the value might be the way to go.
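As a minimal sketch of how the two tests differ, assuming a made-up hash whose values are true, false-but-defined, and undef (the keys and values below are hypothetical, not from the question):

```perl
use strict;
use warnings;

# Hypothetical data: one true value, one false value, one undef value.
my %hash = (apple => 1, banana => 0, cherry => undef);

for my $key (qw(apple banana cherry durian)) {
    # exists asks "is the key present?"; the plain test asks "is the value true?"
    my $exists = exists $hash{$key} ? "yes" : "no";
    my $true   = $hash{$key}        ? "yes" : "no";
    print "$key: exists=$exists true=$true\n";
}
```

Here apple, banana, and cherry all pass the exists test, but only apple passes the truth test; durian passes neither.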

brian d foy answered Nov 18 '22 07:11


When the values aren't needed, you'll often see this idiom:

my %exists;
$exists{$_}++ for @list;

This sets each value to the number of times its key appears in @list (1 if the list has no duplicates), so every stored value is true.
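A small sketch of what the idiom stores when the list contains repeats; the contents of @list are an assumption made up for illustration:

```perl
use strict;
use warnings;

my @list = qw(a b a c a);   # hypothetical input with a repeated element

my %exists;
$exists{$_}++ for @list;    # each value becomes the occurrence count of its key

print "$_ => $exists{$_}\n" for sort keys %exists;
```

With this input the values are 3 for a and 1 for b and c. Every value is at least 1, so both the truth test and the exists test succeed for any key that was in the list.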

Chris Simmons answered Nov 18 '22 07:11


If you're trying to save memory (which generally only matters if you have a very large hash), you can use undef as the value and just test for its existence. Undef is implemented as a singleton, so thousands of undefs are all just pointers to the same value. Setting each value to the empty string or 1 would allocate a different scalar value for each element.

my %exists;
@exists{@list} = ();
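A short sketch of the consequence, assuming a hypothetical list of keys: after the slice assignment every key exists but every value is undef, so only the exists test can be used for lookups.

```perl
use strict;
use warnings;

my @list = qw(red green blue);   # hypothetical keys

my %exists;
@exists{@list} = ();             # every key exists, every value is undef

print "red exists\n"         if exists $exists{red};     # printed
print "red is true\n"        if $exists{red};            # not printed
print "red is not defined\n" if !defined $exists{red};   # printed
print "pink exists\n"        if exists $exists{pink};    # not printed
```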

In light of your later comment about your intended use, this is the idiom I've seen and used many times:

my %seen;
while (<>) {
    next if $seen{$_}++; # false the first time, true every successive time
    # ...process line...
}

KingPong answered Nov 18 '22 08:11


Suppose you actually needed to check for the existence of keys, but you wrote code that checks for truth, and it does so in various places throughout your program. Then it turns out that you misunderstood something: you should actually store a mapping from your keys to string values, and those strings should flow through the same code paths you've already implemented.

And the strings can be empty!

You would then have to either refactor your program or create another hash, because truth checks no longer check existence. That wouldn't happen if you had checked for existence from the very beginning.
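A minimal sketch of that failure mode, assuming a hypothetical %label hash that maps keys to strings, one of which is empty:

```perl
use strict;
use warnings;

# Hypothetical mapping from keys to strings; the value for "bar" is empty.
my %label = (foo => "Foo", bar => "");

for my $key (qw(foo bar baz)) {
    print "$key: found by truth check\n"  if $label{$key};
    print "$key: found by exists check\n" if exists $label{$key};
}
```

The truth check finds only foo, while the exists check finds both foo and bar; baz is found by neither.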

(Edited because I don't know why this got voted down.)

P Shved answered Nov 18 '22 06:11