
Perl takes a long time to evaluate: keys %hash / iterate through a large hash

Tags: hashmap, hash, perl

In a Perl script, I build up a large hash (around 10 GB, with around 100 million keys), which takes about 40 minutes. Next I want to loop through the keys of the hash, like so:

foreach my $key (keys %hash) {

However, this line takes 1 hour and 20 minutes to evaluate! Once inside the loop, the code runs through the whole hash at a quick pace.

Why does entering the for loop take so long? And how can I speed the process up?

joshlk asked Dec 06 '22 02:12


2 Answers

foreach my $key (keys %hash) {

This code first builds a list containing every key in %hash. Since your %hash is huge, building that list takes a while, especially if you run out of physical memory and start swapping to disk.

You could use while (my ($key, $value) = each %hash) { to iterate over the hash instead; it does not create that huge list. If you were swapping, this will be much faster, because the extra memory for the key list is no longer needed.
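A minimal sketch of the each-based loop described above. The sample data and the helper name sum_values are made up for illustration; the real hash would be the 100-million-key one from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: walks the hash one (key, value) pair at a time
# with each(), so no list of all the keys is ever built.
sub sum_values {
    my ($hash_ref) = @_;
    my $total = 0;
    while (my ($key, $value) = each %$hash_ref) {
        $total += $value;
    }
    return $total;
}

my %hash = (apple => 3, banana => 5, cherry => 7);   # stand-in data
print sum_values(\%hash), "\n";                      # prints 15
```

The loop body sees both $key and $value directly, so there is no need for a second lookup such as $hash{$key}.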

Lee Duhem answered Jan 09 '23 19:01


There are two approaches to iterating over a hash, each with its pros and cons.

Approach 1:

foreach my $k (keys %h)
{
  print "key: $k, value: $h{$k}\n";
}

Pros:

  • It is possible to sort the output by key.

Cons:

  • It creates a temporary list to hold the keys; if your hash is very large, that list can use a lot of memory.
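
To illustrate the sorting advantage of Approach 1, here is a small sketch. The helper name sorted_pairs and the sample data are made up for illustration. Note that sort needs the full key list, which is exactly the memory cost listed under Cons.

```perl
use strict;
use warnings;

# Hypothetical helper: returns "key=value" strings ordered by key.
# keys builds the full key list, and sort orders it as strings.
sub sorted_pairs {
    my ($h) = @_;
    return map { sprintf('%s=%s', $_, $h->{$_}) } sort keys %$h;
}

my %h = (pear => 2, apple => 5, fig => 1);   # made-up sample data
print "$_\n" for sorted_pairs(\%h);
# apple=5
# fig=1
# pear=2
```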

Approach 2:

while ( my ($k, $v) = each %h )
{
  print "key: $k, value: $v\n";
}

Pros:

  • This uses very little memory, because each call to each returns only a single (key, value) pair.

Cons:

  • You can't order the output by key.
  • The iterator it uses belongs to %h. If the code inside the loop calls something that does keys %h, values %h or each %h, the loop won't work properly, because %h has only one iterator.
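
The shared-iterator caveat can be seen in a small sketch. The helper name first_key_twice is made up for illustration. The sketch relies on two documented behaviours: calling keys on a hash resets its iterator, and an unmodified hash is always traversed in the same order.

```perl
use strict;
use warnings;

# Hypothetical helper: fetches one key with each(), then calls keys
# (as unrelated code inside a loop body might), then fetches again.
sub first_key_twice {
    my ($h) = @_;
    my $count   = keys %$h;   # scalar context: returns the key count and resets the iterator
    my ($first) = each %$h;   # iterator now points past the first pair
    $count      = keys %$h;   # resets the iterator again
    my ($again) = each %$h;   # starts over, so this is the first key once more
    $count      = keys %$h;   # leave the iterator reset for later callers
    return ($first, $again);
}

my %h = (a => 1, b => 2, c => 3);   # made-up sample data
my ($first, $again) = first_key_twice(\%h);
print $first eq $again ? "iterator was reset\n" : "iterator was not reset\n";
# prints "iterator was reset"
```

Inside a while (... = each %h) loop, the same reset would send the loop back to the first pair on every pass, so it might never terminate.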
jaypal singh answered Jan 09 '23 21:01