I don't understand why join changes the output of JSON's to_json in the following example:
#!/usr/bin/perl
use v5.26;
use Data::Dumper;
use JSON;
my @version = (1, 2, 3, 4);
say "version: ", join ".", @version; # comment this line out
$Data::Dumper::Terse = 1;
$Data::Dumper::Indent = 0;
say Dumper(\@version);
say to_json(\@version);
The output with the line containing join:
version: 1.2.3.4
[1,2,3,4]
["1","2","3","4"]
But after commenting out the line with join, the output of to_json suddenly shows integers instead of strings, although the output of Data::Dumper is still the same:
[1,2,3,4]
[1,2,3,4]
When you stringify a number, the stringification is stored in the scalar along with the original number. (You can see a demonstration at the bottom of my answer.)
When you numify a string, the numification is stored in the scalar along with the original string.
This is an optimization, since one often stringifies or numifies a scalar more than once.
This isn't a problem for Perl since Perl has coercing operators rather than polymorphic operators. But it puts the authors of JSON serializers in the difficult position of either requiring additional information or guessing which of the values a scalar contains should be used.
You can force a number using $x = 0 + $x;.
You can force a string using $x = "$x";.
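As a minimal sketch of those two idioms applied to the array from the question: the example below builds fresh scalars with map so that each one holds a single representation before serialization. It assumes the core JSON::PP module and its encode_json function in place of JSON's to_json, and it uses $_ . "" instead of "$_" inside the map block purely to keep the block unambiguous to the parser. The variable names are illustrative.

```perl
use strict;
use warnings;
use JSON::PP;   # core module, assumed here instead of JSON's to_json

my @version = (1, 2, 3, 4);
my $label = join ".", @version;   # may cache a string in each element

# 0 + $_ returns a new scalar that holds only a number.
my $as_numbers = encode_json([ map { 0 + $_ } @version ]);

# $_ . "" returns a new scalar that holds only a string.
my $as_strings = encode_json([ map { $_ . "" } @version ]);

print "$label\n";        # 1.2.3.4
print "$as_numbers\n";   # [1,2,3,4]
print "$as_strings\n";   # ["1","2","3","4"]
```

Because the serializer only ever sees the freshly built scalars, the result does not depend on how @version was used earlier.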
More detailed answer follows.
Perl is free to change the internal format of a scalar as it sees fit. This is usually done as part of modifying the scalar.
$x = 123; # $x contains a signed integer
$x += 0.1; # $x contains a float
$x = 2147483647; # $x contains a signed integer
++$x; # $x contains an unsigned integer (on a build with 32-bit ints)
$x = "123"; # $x contains a downgraded string
$x += 0; # $x contains a signed integer
$x = "abc"; # $x contains a downgraded string
$x .= "\x{2660}"; # $x contains an upgraded string
But sometimes, Perl adds a second value to a scalar as an optimization.
$x = 123; # $x contains a signed integer
$x * 0.1; # $x contains a signed integer AND a float
$x = 123; # $x contains a signed integer
"$x"; # $x contains a signed integer AND a downgraded string
$x = "123"; # $x contains a downgraded string
$x+0; # $x contains a signed integer AND a downgraded string
These aren't the only double (or triple) vars you'll encounter.
my $x = !!0; # $x contains a signed integer AND a float AND a downgraded string
"$!"; # $! contains a float (not a signed integer?!) AND a downgraded string
This isn't a problem in Perl because we use type-coercing operators (e.g. == works on numbers, eq works on strings). But many other languages rely on polymorphic operators (e.g. == can be used to compare strings and to compare numbers).[1]
But it does present a problem for JSON serializers, which are forced to assign a single type to a scalar. If $x contains both a string and a number, which one should be used?
If the scalar is the result of stringification, using the number would be ideal, but if the scalar is the result of numification, the string would be ideal. There's no way to tell which of these origins pertains to a scalar (if any), so the module's author was left with a tough choice.
Ideally, they would have provided a different interface, but that could have added complexity and a performance penalty.
You can view the internals of a scalar using Devel::Peek's Dump. The relevant line is the FLAGS line.
IOK without IsUV: contains a signed integer
IOK with IsUV: contains an unsigned integer
NOK: contains a float
POK without UTF8: contains a downgraded string
POK with UTF8: contains an upgraded string
ROK: contains a reference
$ perl -MDevel::Peek -e'$x=123; Dump($x); "$x"; Dump($x);' 2>&1 |
perl -M5.014 -ne'next if !/FLAGS/; say join ",", /\b([INPR]OK|IsUV|UTF8)/g'
IOK
IOK,POK
$ perl -MDevel::Peek -e'$x="123"; Dump($x); 0+$x; Dump($x);' 2>&1 |
perl -M5.014 -ne'next if !/FLAGS/; say join ",", /\b([INPR]OK|IsUV|UTF8)/g'
POK
IOK,POK
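The same flags can also be read from Perl code through the core B module, which is roughly the kind of information a serializer has to work with. The sketch below is only an illustration of one possible guessing rule (prefer the string whenever one is present). The helper name guess_json_type is hypothetical, and this is not claimed to be the exact logic of any particular JSON module.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: guess which JSON type a scalar should get.
sub guess_json_type {
    # Look at $_[0] itself (an alias to the caller's scalar), not a copy.
    my $flags = B::svref_2object(\$_[0])->FLAGS;
    return 'string' if $flags & B::SVp_POK;                 # a string is present
    return 'number' if $flags & (B::SVp_IOK | B::SVp_NOK);  # only a number is present
    return 'other';                                         # undef, reference, ...
}

my $n = 123;        # holds only an integer
my $s = "123";      # holds only a string
my $t = "123";
my $sum = $t + 0;   # numifying $t adds an integer next to its string

print guess_json_type($n), "\n";   # number
print guess_json_type($s), "\n";   # string
print guess_json_type($t), "\n";   # string
```

With this rule a numified string keeps its string type, which is the ideal outcome for that case. The price is that any number carrying a cached string is reported as a string too.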
[1] Well, Perl doesn't have separate operators for the different numeric types, which can cause issues (e.g. -0 exists as a float, but not as an int), but these problems are seldom encountered.
Another issue is that the stringification of floats often results in a loss of information.
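A small illustration of that loss of information, assuming a typical perl build where floats are 64-bit doubles and default stringification keeps about 15 significant digits (the variable names are illustrative):

```perl
use strict;
use warnings;

my $sum  = 0.1 + 0.2;   # a float that is close to, but not exactly, 0.3
my $text = "$sum";      # default stringification rounds the value

my $float_matches = ($sum  == 0.3) ? "equal" : "different";
my $text_matches  = ($text == 0.3) ? "equal" : "different";

print "$text\n";            # 0.3
print "$float_matches\n";   # different
print "$text_matches\n";    # equal
```

The string no longer distinguishes the computed value from 0.3, so a serializer that picks the string form of such a scalar would emit a slightly different number than the one stored.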
This is one of the very few times where you must maintain data purity in Perl. Once you create a variable of some type, you must never use it in a context of any other type. If you do need to, copy it to a new variable first to preserve the original.
use feature 'say';
use Data::Dumper;
use JSON;
my @version = (1, 2, 3, 4);
{ say "version: ", join ".", my @copy = @version; }
$Data::Dumper::Terse = 1;
$Data::Dumper::Indent = 0;
say Dumper(\@version);
say to_json(\@version);
Prints:
version: 1.2.3.4
[1,2,3,4]
[1,2,3,4]
I would also recommend using Cpanel::JSON::XS because this is one area where pedantry is called for! It tries pretty hard to get the data types right. Its documentation also has some discussion of the conversion issue.
HTH