
Why is substr-lvalue faster than four-arg substr?

From this question, we benchmark these two variants,

substr( $foo, 0, 0 ) = "Hello ";
substr( $foo, 0, 0, "Hello " );

In it we discover that substr-lvalue is faster, to which ikegami said,

How is 4-arg substr slower than lvalue substr (which must create a magical scalar, and requires extra operations)??? – ikegami

Truth be told, I also assumed that it would be massively slower and just mentioned it because it was brought up by someone else. Purely for curiosity,

Why is substr-lvalue faster than four-arg substr in the above use case?

NO WAR WITH RUSSIA asked Jan 26 '23 02:01


2 Answers

It was simply a bad benchmark result.

When I replicated your results, I was using perl on Ubuntu on Windows Subsystem for Linux. Let's just say that performance is sensitive to external factors on that system.

Even when using a native build for Windows (Strawberry Perl) on the same computer, I get wild differences in the results:

                  Rate substr_valute        substr   multiconcat
substr_valute 6997958/s            --           -0%          -27%
substr        7007667/s            0%            --          -26%
multiconcat   9533733/s           36%           36%            --

                   Rate        substr substr_valute   multiconcat
substr        6795650/s            --           -0%          -10%
substr_valute 6805545/s            0%            --          -10%
multiconcat   7526593/s           11%           11%            --

                    Rate        substr substr_valute   multiconcat
substr         7513339/s            --          -22%          -28%
substr_valute  9693997/s           29%            --           -6%
multiconcat   10367639/s           38%            7%            --

                    Rate        substr   multiconcat substr_valute
substr         8791152/s            --          -13%          -14%
multiconcat   10139954/s           15%            --           -1%
substr_valute 10240638/s           16%            1%            --

The times are just so small, and the machine is just too busy to get accurate readings.

(There's a point to be made about micro-optimizations in there somewhere...)

I hate running benchmarks on my shared Linux web host, but it normally produces far more consistent results. Today was no exception.

                   Rate        substr substr_valute   multiconcat
substr        4293130/s            --           -3%          -13%
substr_valute 4407446/s            3%            --          -11%
multiconcat   4938717/s           15%           12%            --

                   Rate substr_valute        substr   multiconcat
substr_valute 4289732/s            --           -2%          -16%
substr        4356113/s            2%            --          -15%
multiconcat   5096889/s           19%           17%            --

(I used a count of -3, meaning at least 3 CPU seconds per test, instead of 100_000_000 iterations.)

The differences between the two substr variants are 3% or less, which isn't significant. As far as I can tell, one isn't slower than the other.
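For anyone who wants to repeat the measurement, here is a minimal sketch of such a benchmark. The original benchmark code isn't reproduced in this thread, so the three test bodies (and the use of string concatenation for the "multiconcat" case) are assumptions chosen to match the labels in the tables above. A negative count tells Benchmark to run each test for at least that many CPU seconds; the sketch uses -1 to keep it quick, whereas the runs above used -3.

```perl
use strict;
use warnings;
use Benchmark qw( cmpthese );

# Assumed test bodies: each one prepends "Hello " to a fresh copy of "world".
my %tests = (
    substr        => sub { my $foo = "world"; substr( $foo, 0, 0, "Hello " );   $foo },
    substr_lvalue => sub { my $foo = "world"; substr( $foo, 0, 0 ) = "Hello ";  $foo },
    multiconcat   => sub { my $foo = "world"; $foo = "Hello " . $foo;           $foo },
);

# Negative count: run each test for at least 1 CPU second.
cmpthese( -1, \%tests );
```

Running it several times in a row and comparing the tables is a quick way to see how much of the reported difference is just noise on a given machine.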

In fact, one shouldn't expect any difference. As pointed out by Dave Mitchell, substr( $foo, 0, 0 ) = "Hello "; is optimized into something virtually equivalent to substr( $foo, 0, 0, "Hello " ); since 5.16 (with an improvement in 5.20).

$ perl -MO=Concise,-exec -e'substr( $foo, 0, 0, "Hello " );'
1  <0> enter
2  <;> nextstate(main 1 -e:1) v:{
3  <#> gvsv[*foo] s
4  <$> const[IV 0] s
5  <$> const[IV 0] s
6  <$> const[PV "Hello "] s
7  <@> substr[t2] vK/4
8  <@> leave[1 ref] vKP/REFC
-e syntax OK

$ perl -MO=Concise,-exec -e'substr( $foo, 0, 0 ) = "Hello ";'
1  <0> enter
2  <;> nextstate(main 1 -e:1) v:{
3  <$> const[PV "Hello "] s
4  <#> gvsv[*foo] s
5  <$> const[IV 0] s
6  <$> const[IV 0] s
7  <@> substr[t2] vKS/REPL1ST,3
8  <@> leave[1 ref] vKP/REFC
-e syntax OK

(The only difference is the order in which the operands are passed, which is signaled using the REPL1ST flag.)
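As a quick sanity check that the two forms also do the same thing to the target string, here is a small sketch. The variable names are just for illustration.

```perl
use strict;
use warnings;

# Four-argument form: replace the zero-length substring at offset 0.
my $four_arg = "world";
substr( $four_arg, 0, 0, "Hello " );

# Lvalue form: assign to the same zero-length substring.
my $lvalue = "world";
substr( $lvalue, 0, 0 ) = "Hello ";

print "$four_arg\n";    # Hello world
print "$lvalue\n";      # Hello world
print $four_arg eq $lvalue ? "same\n" : "different\n";    # same
```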

ikegami answered Jan 27 '23 14:01


Since 5.16.0, the lvalue+assign variant has been optimised into the 4-arg variant (although the nulled-out NOOP assignment op was still in the execution path until 5.20.0, which slowed it down slightly).

Dave Mitchell answered Jan 27 '23 14:01