Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Foreach on hash variables in Perl

Tags:

foreach

hash

perl

I am new to Perl scripting and have a doubt on foreach on hash variables. I want to print all values of my hash. Here's a program:

%colors = (a => 1, b=>2, c=>3, d=>4, e=>5);
foreach $colors(keys %colors)
{
    print "$colors{%colors} \n";
}

The output is:

5
3
1
2
4

Why are the values sorted randomly? Or what's the logic behind this randomness?? Please clarify my doubt.

like image 712
Pratik Kulkarni Avatar asked Nov 30 '22 03:11

Pratik Kulkarni


2 Answers

I think that your confusion lies in not knowing exactly what a Hash is. Most languages have something analogous to a key-value store, in Ruby and Perl they are called Hashes, in Java Maps, in Python dictionaries, etc...

They are all essentially the same thing, you insert a value with a unique key into some underlying data structure to gain direct access to it at the cost of memory.

So what actually happens when you add a key and a value to a hash?

Hashes are built around the idea of hash functions which take some value as input computes a unique output (ideally every input has their own unique output). If two inputs both map to the same output, this is called a collision.

Now we are at the point where we need to talk about how the Hash is implemented, the two classic examples are with a single array or an array of linked-lists. I will show the array example below.

Array

In the simple array case the data structure underlying the Hash is just an array of some size. The hashing function is used to compute an index into that array. If we assume a simple hashing algorithm

h(x) = length(x) % ARRAY_SIZE

here x is a string and ARRAY_SIZE is the size of our underlying array, this statement will make sure that all values x will fall in the range 0..ARRAY_SIZE - 1

To look at a visual example consider an array of size 5:

   0     1     2     3     4
 ------------------------------
|     |     |     |     |      |
 ------------------------------

and assume we are trying to store the value 5 using key abcd, according to our hashing algorithm

h('abcd') = length('abcd') % ARRAY_SIZE
          =      4         %      5
          =      4

So the value 5 will be stored at index 4:

   0     1     2     3     4
 ------------------------------
|     |     |     |     |   5  |
 ------------------------------

now what would happen if we were to try to store the value 3 using the key dcba, the two keys are different right? They should map to different places.

h('dcba') = length('dcba') % ARRAY_SIZE
          =      4         %      5
          =      4

Oops! This key also maps to index 4 so what are we going to do now? Well we can't just throw away the key-value pair because the programmer obviously needs/wants this pairing in their Hash so we need to decide what to do in the event of a collision. There are many algorithms that do this, but the simplest one is to look for the next open slot in the array and store 3 their. So now our array looks like:

   0     1     2     3     4
 ------------------------------
|   3  |     |     |     |  5  |
 ------------------------------

This was not an extremely in-depth explanation, but hopefully it will give some insight into why retrieving values from Hashes seems random, its because the underlying data structure constantly changes, if you were to ask for the keys from your Hash right now you would probably get back (3, 5), even though you inserted 5 first, only because 3 occurs first in the array.

Hope this was helpful.

like image 118
Hunter McMillen Avatar answered Dec 05 '22 06:12

Hunter McMillen


Quoting perldata - Perl data types:

Hashes are unordered collections of scalar values indexed by their associated string key.

You can sort the keys, or, if you want to preserve the order given in the initialization, use Tie::Hash::Indexed or Tie::IxHash.

like image 26
choroba Avatar answered Dec 05 '22 06:12

choroba