Performance with Perl Strings

I've been running across a lot of Perl code that breaks long strings up this way:

my $string = "Hi, I am a very long and chatty string that just won't";
$string .= " quit.  I'm going to keep going, and going, and going,";
$string .= " kind of like the Energizer bunny.  What are you going to";
$string .= " do about it?";

From my background with Java, building a string like this would be a performance no-no. Is the same true with Perl? In my searches, I have read that using join on an array of strings is the fastest way to concatenate strings, but what about when you just want to break up a string for readability? Is it better to write:

my $string = "Hi, I am a very long and chatty string that just won't" .
    " quit.  I'm going to keep going, and going, and going," .
    " kind of like the Energizer bunny.  What are you going to" .
    " do about it?";

Or should I use join? How should it be done?

justkt asked Jun 23 '10 18:06


4 Answers

Camel book, p 598:

Prefer join("", ...) to a series of concatenated strings. Multiple concatenations may cause strings to be copied back and forth multiple times. The join operator avoids this.
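
As a minimal sketch of what that advice looks like when the pieces are only known at runtime (the `@parts` data and the variable names are made up for illustration), both forms below build the same string. The difference is that `join` receives every piece in a single call instead of appending to the string one piece at a time:

```perl
use strict;
use warnings;

# Hypothetical runtime data; in real code these might come from a file or a query.
my @parts = map { "item$_" } 1 .. 5;

# Repeated concatenation: the string grows one append at a time.
my $by_concat = '';
$by_concat .= $_ for @parts;

# join: all pieces are handed over in one call.
my $by_join = join '', @parts;

print "same result\n" if $by_concat eq $by_join;
```

Whether the difference is measurable depends on the Perl version and on how many pieces there are, so it is worth benchmarking with realistic data before restructuring code around it.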

Justin R. answered Nov 04 '22 12:11


One more thing that hasn't been mentioned yet: if you can, avoid joining/concatenating these strings at all. Many functions and methods accept a list of strings as arguments, not just a single string, so you can pass the pieces individually, e.g.:

print "this is",
    " perfectly legal",
    " because print will happily",
    " take a list and send all the",
    " strings to the output stream\n";

die "this is also",
    " perfectly acceptable";

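use Log::Log4perl qw(:easy);          # the :easy tag must be quoted
use Data::Dumper;
Log::Log4perl->easy_init($INFO);      # initialize logging so INFO() produces output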
INFO("and this is just fine",
    " as well");

INFO(sub {
    local $Data::Dumper::Maxdepth = 1;
    "also note that many libraries will",
    " accept subrefs, in which you",
    " can perform operations which",
    " return a list of strings...",
    Dumper($obj);
 });

Ether answered Nov 04 '22 12:11


I ran a benchmark! :)

#!/usr/bin/perl

use warnings;
use strict;

use Benchmark qw(cmpthese timethese);

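# Note: when the script is run with a single argument, $ARGV[1] is undefined,
# so Benchmark falls back to running each test for at least 3 CPU seconds
# (as the output below shows). Use $ARGV[0] to pass an iteration count instead.
my $bench = timethese($ARGV[1], {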

  multi_concat => sub {
    my $string = "Hi, I am a very long and chatty string that just won't";
    $string .= " quit.  I'm going to keep going, and going, and going,";
    $string .= " kind of like the Energizer bunny.  What are you going to";
    $string .= " do about it?";
  },

  one_concat => sub {
    my $string = "Hi, I am a very long and chatty string that just won't" .
    " quit.  I'm going to keep going, and going, and going," .
    " kind of like the Energizer bunny.  What are you going to" .
    " do about it?";
  },

  join => sub {
    my $string = join("", "Hi, I am a very long and chatty string that just won't",
    " quit.  I'm going to keep going, and going, and going,",
    " kind of like the Energizer bunny.  What are you going to",
    " do about it?"
    );
  },

} );

cmpthese $bench;

1;

The results (on my iMac with Perl 5.8.9):

imac:Benchmarks seb$ ./strings.pl 1000
Benchmark: running join, multi_concat, one_concat for at least 3 CPU seconds...
      join:  2 wallclock secs ( 3.13 usr +  0.01 sys =  3.14 CPU) @ 3235869.43/s (n=10160630)
multi_concat:  3 wallclock secs ( 3.20 usr + -0.01 sys =  3.19 CPU) @ 3094491.85/s (n=9871429)
one_concat:  2 wallclock secs ( 3.43 usr +  0.01 sys =  3.44 CPU) @ 12602343.60/s (n=43352062)
                   Rate multi_concat         join   one_concat
multi_concat  3094492/s           --          -4%         -75%
join          3235869/s           5%           --         -74%
one_concat   12602344/s         307%         289%           --

sebthebert answered Nov 04 '22 12:11


The main performance difference between your two examples is that in the first, the concatenation happens each time the code is called, whereas in the second, the constant strings will be folded together by the compiler.

So if either of these examples will be in a loop or function called many times, the second example will be faster.
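
One way to check the folding for yourself is to have Perl print back what it compiled. The sketch below uses the core B::Deparse module; the two subs and their strings are stand-ins for the examples in the question (apostrophes are avoided so the deparsed text is easy to read), and the exact formatting of the output may vary between Perl versions:

```perl
use strict;
use warnings;
use B::Deparse;

# Constant pieces joined with "." in a single expression.
my $one_expr = sub {
    my $s = "Hi, I am a very long" . " and chatty string" . " that keeps going.";
    return $s;
};

# The same pieces appended with separate ".=" statements.
my $multi_stmt = sub {
    my $s = "Hi, I am a very long";
    $s .= " and chatty string";
    $s .= " that keeps going.";
    return $s;
};

# Turn the compiled subs back into source text.
my $deparse    = B::Deparse->new;
my $one_text   = $deparse->coderef2text($one_expr);
my $multi_text = $deparse->coderef2text($multi_stmt);

print "single expression:\n$one_text\n\n";
print "separate statements:\n$multi_text\n";
```

In the first listing the three literals should appear as one already-joined string, while the second should still contain the individual `.=` statements, which run every time the sub is called.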

This assumes the strings are known at compile time. If you are building up the strings at runtime, as fatcat1111 mentions, the join operator will be faster than repeated concatenation.
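
For the runtime case, a small harness like the following lets you measure the difference on your own Perl rather than take it on faith. It is only a sketch: the `@pieces` data and the test names are invented for illustration, and no particular outcome is implied, since results depend on the Perl version and the size of the data:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical runtime data: 1,000 short strings not known at compile time.
my @pieces = map { "piece $_ " } 1 .. 1000;

my %tests = (
    # Append each piece to a growing string.
    repeated_concat => sub {
        my $s = '';
        $s .= $_ for @pieces;
        return $s;
    },
    # Hand all pieces to join in one call.
    join => sub {
        my $s = join '', @pieces;
        return $s;
    },
);

# A negative count tells Benchmark to run each test for at least that many CPU seconds.
cmpthese(-1, \%tests);
```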

Eric Strom answered Nov 04 '22 12:11