
Where to define local temp variables in Perl subroutine?

Tags: scope, perl

I took too long to start using use warnings; and use strict; in Perl, but now that I have, I see the advantages.

One of the things I'm still not sure about is when to define a temporary variable. This may seem like a trivial thing, but I run a lot of Monte Carlo simulations where losing a bit of time adds up over 10000+ iterations. I've been lazy about using strict/warnings on quicker simulations, but they've gotten more complex, so I really need to.

So (cutting out code to calculate stuff) I am wondering if

sub doStuff
{
  my $temp;
  for my $x (1..50)
  {
    $temp = $x**2;
  }
  for my $x (1..50)
  {
    $temp = $x**3;
  }
}

Or

sub doStuff
{
  for my $x (1..50)
  {
    my $temp = $x**2;
  }
  for my $x (1..50)
  {
    my $temp = $x**3;
  }
}

is less/more efficient, or if one of them violates some Perl coding convention I don't know about yet.

aschultz asked Dec 14 '22 03:12


2 Answers

The efficiency of these two is close enough, and the difference is dwarfed by any realistic processing. So I'd decide based on the code: if $temp is indeed temporary and not needed after the loop, then it is better to keep it inside (scoped), for all the other reasons.

Since this is about optimization, I'd like to digress. Such micro-issues may have an effect. However, where you really gain is first at the level of algorithms, and then in choosing suitable data structures and techniques. Low-level tweaks are the very last thing to think about, and there are often language features and libraries that render them irrelevant. That said, one should know one's tools and not waste time needlessly.

Also, there is often a trade-off between code clarity and efficiency. If it comes to that, I suggest coding for correctness and clarity first. Then profile and optimize if needed, cautiously and gradually, and with a lot of testing in between.
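
One way to make the "testing in between" step concrete is to check that a rewritten routine still agrees with the straightforward one before timing anything. What follows is only a minimal sketch using the core module Test::More; sum_squares_clear and sum_squares_fast are hypothetical stand-ins for your own code, not anything from the question.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Straightforward reference version, written for clarity.
sub sum_squares_clear {
    my ($n) = @_;
    my $total = 0;
    for my $x (1 .. $n) {
        my $sq = $x**2;      # temporary scoped to the loop body
        $total += $sq;
    }
    return $total;
}

# Candidate "optimized" version: the closed form n(n+1)(2n+1)/6.
sub sum_squares_fast {
    my ($n) = @_;
    return $n * ($n + 1) * (2 * $n + 1) / 6;
}

# The two must agree before any timing comparison is meaningful.
for my $n (1, 50, 10_000) {
    is( sum_squares_fast($n), sum_squares_clear($n), "n = $n" );
}
```

The example also illustrates the point about algorithms: swapping the loop for a closed form changes the cost far more than moving a declaration ever could.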

Here is a comparison, as an example of basic use of the core module Benchmark. I throw in an additional operation and add other cases where there is no temporary.

use warnings 'all';
use strict;    
use Benchmark qw(cmpthese);

my $x;

sub tmp_in {
    for (1..10_000) {
        my $tmp = 2 * $_;
        $x = $tmp + $_;
    }
    return $x;
}

sub tmp_out {
    my $tmp;
    for (1..10_000) {
        $tmp = 2 * $_;
        $x = $tmp + $_;
    }
    return $x;
}

sub no_tmp {
    for (1..10_000) { $x = 2 * $_ + $_ }
    return $x;
}

sub base {
    for (1..10_000) { $x += $_ }
    return $x;
}

sub calc { 
    for (1..10_000) { $x += sin sqrt(rand()) }
    return $x;
}         

cmpthese(-10, {
    tmp_in  => sub { tmp_in  },
    tmp_out => sub { tmp_out },
    no_tmp  => sub { no_tmp  },
    base    => sub { base    },        
    calc    => sub { calc    },
});

Output (on v5.16)

          Rate    calc  tmp_in tmp_out  no_tmp    base
calc     623/s      --    -11%    -26%    -44%    -59%
tmp_in   698/s     12%      --    -17%    -37%    -54%
tmp_out  838/s     34%     20%      --    -25%    -44%
no_tmp  1117/s     79%     60%     33%      --    -26%
base    1510/s    142%    116%     80%     35%      --

So they differ, and apparently a declaration inside a loop has a cost: tmp_out runs about 20% faster than tmp_in here. But the two tmp versions sit next to each other in the ranking. Also, this benchmark measures almost nothing but overhead, so the difference is greatly exaggerated compared to real code. And there are other aspects; no_tmp runs in one statement, for example. These things may matter only if your processing consists mostly of the iteration itself. Just generating a (high-quality) pseudo-random number is expensive.

This may also differ (wildly) across different hardware and software versions. My results with v5.10 on a better machine are a bit different. Replace the sample 'calculations' with your processing, and run on the actual hardware, for a relevant measure of whether it matters at all.
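
As a sketch of that last suggestion, the same kind of harness can wrap a stand-in for your own work. Here simulate_step is a hypothetical placeholder for one iteration of a simulation, and the loop length and the run time given to cmpthese are arbitrary; only the Benchmark call itself is taken from the module.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical placeholder for one step of real processing.
sub simulate_step {
    my ($x) = @_;
    return sin( sqrt( rand() ) ) + $x;
}

sub with_tmp_in {
    my $sum = 0;
    for my $x (1 .. 10_000) {
        my $temp = simulate_step($x);   # declared inside the loop
        $sum += $temp;
    }
    return $sum;
}

sub with_tmp_out {
    my $sum = 0;
    my $temp;                           # declared once, outside the loop
    for my $x (1 .. 10_000) {
        $temp = simulate_step($x);
        $sum += $temp;
    }
    return $sum;
}

# A negative count means "run each case for at least that many CPU seconds".
cmpthese( -2, {
    tmp_in  => \&with_tmp_in,
    tmp_out => \&with_tmp_out,
} );
```

With real work inside the loop, any gap between the two cases would be expected to shrink relative to the total, but that is something to measure on your own code and hardware rather than assume.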

zdim answered Jan 11 '23 07:01


Personally, I would keep the temporary variable in the for loop, simply because that is where it is used. Declared outside the loop, at some point down the line it will come back to bite you (or the person who has to pick up your code) with an unexpected value.
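
To make the "unexpected value" concrete, here is a minimal sketch of how a temporary declared outside the loop can carry a stale value from one iteration into the next, while a loop-scoped one cannot. The subroutine names and the input data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical task: record the square of each positive number, undef otherwise.
sub squares_outer {
    my @out;
    my $temp;                        # declared once, lives across iterations
    for my $x (@_) {
        $temp = $x**2 if $x > 0;     # not reassigned when $x <= 0
        push @out, $temp;            # may push the previous iteration's value
    }
    return @out;
}

sub squares_inner {
    my @out;
    for my $x (@_) {
        my $temp;                    # fresh (undef) on every iteration
        $temp = $x**2 if $x > 0;
        push @out, $temp;
    }
    return @out;
}

my @outer = squares_outer(3, -1, 2);   # (9, 9, 4): the 9 leaks into the second slot
my @inner = squares_inner(3, -1, 2);   # (9, undef, 4)

for my $result (\@outer, \@inner) {
    print join( ", ", map { defined($_) ? $_ : "undef" } @$result ), "\n";
}
```

Both versions pass strict and warnings, so nothing flags the stale value in the first one; only the narrower scope prevents it.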

Also, premature optimization is an anti-pattern:

Optimization can reduce readability and add code that is used only to improve the performance. This may complicate programs or systems, making them harder to maintain and debug. As a result, optimization or performance tuning is often performed at the end of the development stage.

KeepCalmAndCarryOn answered Jan 11 '23 07:01