
Perl Overloading Weirdness

Long story short: we want to mark strings so that later we can do something with them, even if they get embedded in other strings.

So we figured, hey, let's try overloading. It is pretty neat. I can do something like:

my $str = str::new('<encode this later>');
my $html = "<html>$str</html>";
print $html; # <html><encode this later></html>
print $html->encode; # <html>&lt;encode this later&gt;</html>

It does this by overloading the concatenation operator so that it builds a new array-backed object holding the plain string "<html>", the object wrapping "<encode this later>", and the plain string "</html>". These objects can nest arbitrarily. On encode, the plain strings are left alone and only the object strings are encoded. If you stringify the object instead, it just joins everything and returns a plain string.
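The encode method itself is not included in the script below, so here is a minimal sketch of how it might look. The HTML escaping rules (only &, < and >) and the recursive walk are assumptions for illustration, not the actual implementation; new, stringify and concat are copied from the posted class.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package str;
use Scalar::Util qw(blessed reftype);

use overload (
    '""' => \&stringify,
    '.'  => \&concat,
);

# Wrap a plain string (as a scalar ref) or an array of parts.
sub new {
    my ($value) = @_;
    return bless((ref $value ? $value : \$value), __PACKAGE__);
}

# Plain stringification: join all parts, encoding nothing.
sub stringify {
    my ($str) = @_;
    return reftype($str) eq 'ARRAY' ? join('', @$str) : $$str;
}

# Concatenation keeps both operands as parts of a new object.
sub concat {
    my ($s1, $s2, $inverted) = @_;
    return new($inverted ? [$s2, $s1] : [$s1, $s2]);
}

# Hypothetical encode (not in the original post): escape only the
# text held in str objects and pass plain strings through untouched.
sub encode {
    my ($str) = @_;
    if (reftype($str) eq 'ARRAY') {
        return join '', map {
            blessed($_) && $_->isa(__PACKAGE__) ? $_->encode : $_
        } @$str;
    }
    my $text = $$str;
    $text =~ s/&/&amp;/g;    # assumed escaping rules
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

package main;

my $str  = str::new('<encode this later>');
my $html = "<html>$str</html>";    # declared and assigned in one statement
print "$html\n";
print $html->encode, "\n";
```

This uses the declare-and-assign form, so $html should stay an object and the two print calls should produce the two lines shown in the example above.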

This works well, except that in some cases it stringifies for no apparent reason. The script below shows the behavior, which I've reproduced in Perl 5.10 through 5.22.

#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
use Data::Dumper; $Data::Dumper::Sortkeys=1;

my $str1 = str::new('foo');
my $str2 = str::new('bar');

my $good1 = "$str1 $str2";
my $good2;
$good2 = $good1;
my($good3, $good4);
$good3 = "$str1 a";
$good4 = "a $str1";

my($bad1, $bad2, $bad3);
$bad1 = "a $str1 a";
$bad2 = "$str1 $str2";
$bad3 = "a $str1 a $str2 a";

say Dumper { GOOD => [$good1, $good2, $good3], BAD => [$bad1, $bad2, $bad3] };

$bad1 = ''."a $str1 a";
$bad2 = ''."$str1 $str2";
$bad3 = ''."a $str1 a $str2 a";
say Dumper { BAD_GOOD => [$bad1, $bad2, $bad3] };


package str;
use Data::Dumper; $Data::Dumper::Sortkeys=1;

use strict;
use warnings;
use 5.010;

use Scalar::Util 'reftype';

use overload (
    '""'        => \&stringify,
    '.'         => \&concat,
);

sub new {
    my($value) = @_;
    bless((ref $value ? $value : \$value), __PACKAGE__);
} 

sub stringify {
    my($str) = @_;
    #say Dumper { stringify => \@_ };
    if (reftype($str) eq 'ARRAY') {
        return join '', @$str;
    }
    else {
        $$str;
    }
}

sub concat {
    my($s1, $s2, $inverted) = @_;
    #say Dumper { concat => \@_ };
    return new( $inverted ? [$s2, $s1] : [$s1, $s2] );
}

1;

I want all of these to be dumped as objects, not strings, but the "BAD" examples are all stringified. Every "BAD" example assigns the result of an interpolated concatenation to a variable that was declared in an earlier statement. If I declare and assign in the same statement, or assign an object that was concatenated earlier, or add an extra explicit concatenation (beyond the one done by interpolation), then it works fine.

This is nuts.

The result of the script:

$VAR1 = {
    'BAD' => [
        'a foo a',
        'foo bar',
        'a foo a bar a'
    ],
    'GOOD' => [
        bless( [
            bless( [
                bless( do{\(my $o = 'foo')}, 'str' ),
                ' '
            ], 'str' ),
            bless( do{\(my $o = 'bar')}, 'str' )
        ], 'str' ),
        $VAR1->{'GOOD'}[0],
        bless( [
            $VAR1->{'GOOD'}[0][0][0],
            ' a'
        ], 'str' )
    ]
};

$VAR1 = {
    'BAD_GOOD' => [
        bless( [
            '',
            bless( [
                bless( [
                    'a ',
                    bless( do{\(my $o = 'foo')}, 'str' )
                ], 'str' ),
                ' a'
            ], 'str' )
        ], 'str' ),
        bless( [
            '',
            bless( [
                bless( [
                    $VAR1->{'BAD_GOOD'}[0][1][0][1],
                    ' '
                ], 'str' ),
                bless( do{\(my $o = 'bar')}, 'str' )
            ], 'str' )
        ], 'str' ),
        bless( [
            '',
            bless( [
                bless( [
                    bless( [
                        bless( [
                            'a ',
                            $VAR1->{'BAD_GOOD'}[0][1][0][1]
                        ], 'str' ),
                        ' a '
                    ], 'str' ),
                    $VAR1->{'BAD_GOOD'}[1][1][1]
                ], 'str' ),
                ' a'
            ], 'str' )
        ], 'str' )
    ]
};

The behavior makes no sense to me. I'd like to understand why it works this way, and I'd like to find a workaround.

pudge asked Jun 23 '18 05:06


1 Answer

Well, it isn't a great solution and it doesn't answer why Perl does this, but I got something. I left a few debugging print statements in there.

For whatever reason, Perl decides to convert the reference to your object into a plain string. You can trick it into not doing this by taking a reference to the result of the concatenation and then immediately dereferencing it.

#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
use Data::Dumper; $Data::Dumper::Sortkeys=1;
use Scalar::Util 'reftype';

my $str1 = str::new('foo');
my $str2 = str::new('bar');

say 'good1';
my $good1 = "$str1 $str2";
say 'g1 ', reftype($good1);
say Dumper $good1;

say 'bad1';
my $bad1;
say 'b1 ', reftype($bad1) // 'undef';   # reftype returns undef for non-references
$bad1 = "$str1 $str2";
say 'b2 ', reftype($bad1) // 'undef';
say Dumper $bad1;

say 'workaround';
my $workaround;
say 'w1 ', reftype($workaround) // 'undef';   # not assigned yet
$workaround = ${\"$str1 $str2"};
say 'w2 ', reftype($workaround);
say Dumper $workaround;


package str;
use Data::Dumper; $Data::Dumper::Sortkeys=1;

use strict;
use warnings;
use 5.010;

use Scalar::Util 'reftype';

use overload (
    '""'        => \&stringify,
    '.'         => \&concat,
);

sub new {
    my ($value) = @_;
    bless((ref $value ? $value : \$value), __PACKAGE__);
} 

sub stringify {
    my ($str) = @_;

    say "stringify";
    say reftype($str);

    if (reftype($str) eq 'ARRAY') {
        say scalar @$str;
        return join '', @$str;
    }
    else {
        $$str;
    }
}

sub concat {
    my ($s1, $s2, $inverted) = @_;

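    say "concat";
    # Either operand may be a plain string, for which reftype returns undef.
    say reftype($s1) // 'plain string';
    say reftype($s2) // 'plain string';
    say $inverted ? 'inverted' : 'not inverted';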

    return new( $inverted ? [$s2, $s1] : [$s1, $s2] );
}

1;

Dumping $workaround gives the following:

$VAR1 = bless( [
                 bless( [
                          bless( do{\(my $o = 'foo')}, 'str' ),
                          ' '
                        ], 'str' ),
                 bless( do{\(my $o = 'bar')}, 'str' )
               ], 'str' );
Nathan Loyer answered Nov 04 '22 05:11