Why is the sub eins (with the else) slower here than the sub zwei (with the elsif)?
#!/usr/bin/env perl
use warnings;
use 5.012;
use Benchmark qw(:all);

my $d = 0;
my $c = 2;

sub eins {
    if ( $c == 1) {
        $d = 1;
    }
    else {
        $d = 2;
    }
}

sub zwei {
    if ( $c == 1) {
        $d = 1;
    }
    elsif ( $c == 2 ) {
        $d = 2;
    }
}

sub drei {
    $d = 1;
    $d = 2 if $c == 2;
}

cmpthese( -5, {
    eins => sub{ eins() },
    zwei => sub{ zwei() },
    drei => sub{ drei() },
} );
          Rate eins drei zwei
eins 4167007/s   --  -1% -16%
drei 4207631/s   1%   -- -15%
zwei 4972740/s  19%  18%   --

          Rate eins drei zwei
eins 4074356/s   --  -8% -16%
drei 4428649/s   9%   --  -9%
zwei 4854964/s  19%  10%   --

          Rate eins drei zwei
eins 3455697/s   --  -6% -19%
drei 3672628/s   6%   -- -14%
zwei 4250826/s  23%  16%   --

          Rate eins drei zwei
eins 2832634/s   --  -8% -19%
drei 3088931/s   9%   -- -12%
zwei 3503197/s  24%  13%   --

          Rate eins zwei drei
eins 3053821/s   -- -17% -26%
zwei 3701601/s  21%   -- -10%
drei 4131128/s  35%  12%   --

          Rate eins drei zwei
eins 3033041/s   --  -2% -12%
drei 3092511/s   2%   -- -10%
zwei 3430837/s  13%  11%   --
Summary of my perl5 (revision 5 version 16 subversion 0) configuration:
Platform:
osname=linux, osvers=3.1.10-1.9-desktop, archname=x86_64-linux
uname='linux linux1 3.1.10-1.9-desktop #1 smp preempt thu apr 5 18:48:38 utc 2012 (4a97ec8) x86_64 x86_64 x86_64 gnulinux '
config_args='-de'
hint=recommended, useposix=true, d_sigaction=define
useithreads=undef, usemultiplicity=undef
useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
use64bitint=define, use64bitall=define, uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='cc', ccflags ='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
optimize='-O2',
cppflags='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
ccversion='', gccversion='4.6.2', gccosandvers=''
intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
alignbytes=8, prototype=define
Linker and Libraries:
ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
libpth=/usr/local/lib /lib/../lib64 /usr/lib/../lib64 /lib /usr/lib /lib64 /usr/lib64 /usr/local/lib64
libs=-lnsl -lndbm -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc -lgdbm_compat
perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
libc=/lib/libc-2.14.1.so, so=so, useshrplib=false, libperl=libperl.a
gnulibc_version='2.14.1'
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector'
Characteristics of this binary (from libperl):
Compile-time options: HAS_TIMES PERLIO_LAYERS PERL_DONT_CREATE_GVSV
PERL_MALLOC_WRAP PERL_PRESERVE_IVUV USE_64_BIT_ALL
USE_64_BIT_INT USE_LARGE_FILES USE_LOCALE
USE_LOCALE_COLLATE USE_LOCALE_CTYPE
USE_LOCALE_NUMERIC USE_PERLIO USE_PERL_ATOF
Built under linux
Compiled at May 24 2012 20:53:15
%ENV:
PERL_HTML_DISPLAY_COMMAND="/usr/bin/firefox -new-window %s"
@INC:
/usr/local/lib/perl5/site_perl/5.16.0/x86_64-linux
/usr/local/lib/perl5/site_perl/5.16.0
/usr/local/lib/perl5/5.16.0/x86_64-linux
/usr/local/lib/perl5/5.16.0
.
[ This isn't an answer per se, but it is useful information that doesn't fit in a comment. ]
First, let's look at the compiled form side by side (eins on the left, zwei on the right). If $c == 2, the execution path of "zwei" is a pure superset of the execution path of "eins". (The ops that are executed are marked with "*".)
*1 <0> enter                            *1 <0> enter
*2 <;> nextstate(main 4 -e:2) v:{       *2 <;> nextstate(main 4 -e:2) v:{
*3 <#> gvsv[*c] s                       *3 <#> gvsv[*c] s
*4 <$> const[IV 1] s                    *4 <$> const[IV 1] s
*5 <2> eq sK/2                          *5 <2> eq sK/2
*6 <|> cond_expr(other->7) vK/1         *6 <|> cond_expr(other->7) vK/1
 7 <0> enter v                           7 <0> enter v
 8 <;> nextstate(main 1 -e:3) v:{        8 <;> nextstate(main 1 -e:3) v:{
 9 <$> const[IV 1] s                     9 <$> const[IV 1] s
 a <#> gvsv[*d] s                        a <#> gvsv[*d] s
 b <2> sassign vKS/2                     b <2> sassign vKS/2
 c <@> leave vKP                         c <@> leave vKP
       goto d                                  goto d
                                        *e <#> gvsv[*c] s
                                        *f <$> const[IV 2] s
                                        *g <2> eq sK/2
                                        *h <|> and(other->i) vK/1
*e <0> enter v                          *i <0> enter v
*f <;> nextstate(main 2 -e:6) v:{       *j <;> nextstate(main 2 -e:6) v:{
*g <$> const[IV 2] s                    *k <$> const[IV 2] s
*h <#> gvsv[*d] s                       *l <#> gvsv[*d] s
*i <2> sassign vKS/2                    *m <2> sassign vKS/2
*j <@> leave vKP                        *n <@> leave vKP
*d <@> leave[1 ref] vKP/REFC            *d <@> leave[1 ref] vKP/REFC
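For anyone who wants to produce a comparable listing themselves, here is a minimal sketch using the core B::Concise module. It is not necessarily how the listing above was generated: it assumes package variables (declared with our, which is what yields gvsv ops) and dumps the two subs rather than a -e one-liner, so the outer enter/leave ops and the line labels will differ. The helper name concise_exec is made up for this example.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use B::Concise ();

# Package variables, so the listing shows gvsv ops (an assumption made
# to match the listing above; the question's script uses lexicals).
our $c = 2;
our $d = 0;

sub eins {
    if ( $c == 1 ) { $d = 1 }
    else           { $d = 2 }
}

sub zwei {
    if    ( $c == 1 ) { $d = 1 }
    elsif ( $c == 2 ) { $d = 2 }
}

# Hypothetical helper: render one sub's ops in execution order into a string.
sub concise_exec {
    my ($code) = @_;
    my $out = '';
    B::Concise::walk_output( \$out );             # capture output in $out
    B::Concise::compile( '-exec', $code )->();    # walk the sub's op tree
    return $out;
}

my %listing = (
    eins => concise_exec( \&eins ),
    zwei => concise_exec( \&zwei ),
);

for my $name ( sort keys %listing ) {
    print "== $name ==\n$listing{$name}\n";
}
```

The extra gvsv, const, eq and and ops for the elsif test should appear only in the zwei listing.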
The thing is, I can reproduce your results! (v5.16.0 built for x86_64-linux-thread-multi)
           Rate drei eins zwei
drei  8974033/s   --  -3% -19%
eins  9263260/s   3%   -- -16%
zwei 11034175/s  23%  19%   --

           Rate drei eins zwei
drei  8971868/s   --  -1% -21%
eins  9031677/s   1%   -- -20%
zwei 11333871/s  26%  25%   --
This isn't a small difference (which could be the result of CPU caching), and it's reproducible between different runs (so it's not another application affecting the benchmark). I'm stumped.
Per iteration, it's taking 22 ns (1/9031677 s - 1/11333871 s) more to do 4 fewer ops. I would expect it to take roughly 100 ns less.
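To make the arithmetic behind that 22 ns figure easy to check, here is a small sketch that converts the two rates from the second run quoted above into time per iteration. The helper name ns_per_iteration is made up for this example.

```perl
use strict;
use warnings;

# Rates copied from the second benchmark run above (iterations per second).
my %rate = ( eins => 9_031_677, zwei => 11_333_871 );

# Hypothetical helper: convert a rate into nanoseconds per iteration.
sub ns_per_iteration {
    my ($per_second) = @_;
    return 1e9 / $per_second;
}

my %ns    = map { $_ => ns_per_iteration( $rate{$_} ) } keys %rate;
my $extra = $ns{eins} - $ns{zwei};    # how much longer eins takes per call

printf "eins: %.1f ns, zwei: %.1f ns, difference: %.1f ns\n",
    $ns{eins}, $ns{zwei}, $extra;
```

Each figure includes the cost of the anonymous wrapper sub and the sub call itself, so only the difference says anything about else versus elsif.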