Declaring variables inside or outside loops in Perl, best practices

I've been working on some Perl libraries for data mining. The libraries are full of nested loops for gathering and processing information. I'm working with strict mode and I always declare my variables with my outside of the first loop. For instance:

# Pretty useless code for clarity purposes:

my $flag = 1;
my ($v1, $v2);

while ($flag) {
  for $v1 (1 .. 1000) {

    # Lots and lots of code...

    $v2 = $v1 * 2;
  }
}

From what I've read here, declaring them outside of the loop is better performance-wise. However, maintaining my code is becoming increasingly difficult, because the declaration of some variables ends up far away from where they are actually used.

Something like this would be easier to maintain:

my $flag = 1;

while ($flag) {
  for my $v1 (1 .. 1000) {

    # Lots and lots of code...

    my $v2 = $v1 * 2;
  }
}

I don't have much experience with Perl, since I come from working mostly with C++. At some point I would like to open-source most of my libraries, so I would like them to be as pleasing to the Perl gurus as possible.

From a professional Perl developer's point of view, which of these options is the most appropriate?

calvillo asked Jan 24 '14 22:01


2 Answers

The general rule is to declare every variable as late as possible.

If the value of a variable doesn't need to be kept across iterations of a loop then declare it inside the loop, or as the loop control variable for a for loop.

If it needs to remain static across the loop iterations (like your $flag) then declare it immediately before the loop.
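As a minimal sketch of both cases (the names $total and $doubled are made up for illustration and are not from the question), a value that has to survive across iterations is declared just before the loop, while everything else lives inside it:

```perl
use strict;
use warnings;

# $total must survive across iterations, so it is declared
# immediately before the loop that uses it.
my $total = 0;

for my $n (1 .. 5) {
    # $doubled is only needed within a single iteration, so it is
    # declared (and initialised) inside the loop body.
    my $doubled = $n * 2;
    $total += $doubled;
}

print "$total\n";    # 2 + 4 + 6 + 8 + 10 = 30
```

Under strict, neither $n nor $doubled is visible after the closing brace of the loop, so any accidental use further down is caught at compile time.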

Yes, there is a minimal speed cost to be paid if you discard and reallocate a variable every time a block is executed, but programming and maintenance costs are by far the most important measure of efficiency and should always be put first.

You shouldn't be optimising your code before it has been made to work and found to be running too slowly; and even then, moving declarations to the top of the file is a long way down the list of compromises that are likely to make a useful difference.

Borodin answered Oct 20 '22 01:10


Optimize for readability. This means declaring variables in the smallest possible scope. Ideally, I can see the variable declaration and all usages of that variable at the same time. We can only keep a very limited amount of context in our heads, so declaring variables near their use makes it easier to understand, write, and debug code.

Which variant performs better is difficult to estimate, and difficult to measure because the effect will be rather small. But if performance is roughly equivalent, we might as well use the more readable variant.
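If you do want to measure it, one possible approach is the core Benchmark module. The sketch below is only an illustration: the sub names declared_outside and declared_inside are hypothetical, the loop bodies are reduced to the arithmetic from the question plus a running sum, and the numbers you get will depend on your Perl version and machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Variant 1: variables declared once, outside the loop.
sub declared_outside {
    my ($v1, $v2);
    my $sum = 0;
    for $v1 (1 .. 1000) {
        $v2 = $v1 * 2;
        $sum += $v2;
    }
    return $sum;
}

# Variant 2: variables declared in the smallest possible scope.
sub declared_inside {
    my $sum = 0;
    for my $v1 (1 .. 1000) {
        my $v2 = $v1 * 2;
        $sum += $v2;
    }
    return $sum;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    outside => \&declared_outside,
    inside  => \&declared_inside,
});
```

A benchmark like this only times the declarations in isolation. In real code the "lots and lots of code" inside the loop would likely dominate the run time.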

I personally often try to write code in a single-assignment form, where variables aren't reassigned and mutators like push @array, $elem are avoided. This makes sure that the name of a variable and its value are always interchangeable, which makes it easier to reason about the code. It also implies that each variable declaration is an initialization, which removes a whole class of errors.
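A small sketch of that difference, with made-up variable names (@values, @doubled_mut, @doubled) chosen only for illustration:

```perl
use strict;
use warnings;

my @values = (1 .. 5);

# Mutating style: @doubled_mut is declared empty and then changed
# by push on every iteration.
my @doubled_mut;
for my $v (@values) {
    push @doubled_mut, $v * 2;
}

# Single-assignment style: @doubled is declared and given its final
# value in one statement, and is never modified afterwards.
my @doubled = map { $_ * 2 } @values;

print "@doubled\n";    # 2 4 6 8 10
```

Both arrays end up with the same contents. In the second form there is no point in the program where @doubled exists but is only partly filled.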

amon answered Oct 20 '22 01:10