The following code:
#!/usr/bin/env perl
use utf8;
use strict;
use warnings;
use 5.012; # implicitly turn on feature unicode_strings
my $test = "some string";
$test =~ m/.+\x{2013}/x;
Yields:
Use of uninitialized value $test in pattern match (m//) at test.pl line 9.
This seems to happen with any 2-byte character inside \x{}. The following regexes work fine:
/a+\x{2013}/
/.*\x{2013}/
/.+\x{20}/
Also, the error goes away with use bytes, but using that pragma is discouraged. What's going on here?
This was a bug, and it has now been fixed in blead by commits 7e0d5ad7c9cdb21b681e611b888acd41d34c4d05 and c72077c4fff72b66cdde1621c62fb4fd383ce093. The fix should be available in 5.17.5.
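If you need to know whether a given perl is likely to include the fix, a version check along these lines may help. This is only a sketch: has_fix is a hypothetical helper name, and the 5.17.5 threshold is taken from the statement above rather than verified against each release.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: returns true when a numeric perl version,
# in the same format as $] (e.g. 5.017005 for 5.17.5), is at or
# past the release that is expected to carry the fix.
sub has_fix {
    my ($version) = @_;
    return $version >= 5.017005;
}

if ( has_fix($]) ) {
    print "perl ", $], " should include the fix\n";
}
else {
    print "perl ", $], " predates 5.17.5 and may emit the spurious warning\n";
}
```

On an older perl, the warning itself is spurious, so the match result is the thing to check rather than the warning.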
It is a curious coincidence that you should ask this question. It looks related to a bug that I reported just yesterday (https://rt.perl.org/rt3/Ticket/Display.html?id=114808), where this code also produces "Use of uninitialized value $_ in split ..." warnings and causes split to unexpectedly return an empty list:
use warnings;
binmode *STDOUT, ":encoding(UTF-8)";
my $pattern = "\x{abc}\x{def}ghi";
for ( "\x{444}", "norm\x{a0}l", "\x{445}", "ab\x{ccc}de\x{fff}gh" ) {
print "--------------------\ntext is $_, pattern is /$pattern/\n";
# expect split to return ($_) , but when $pattern and $_ both
# have wide chars, it returns ()
print 'split output is [', split /$pattern/, $_;
print "]\n";
}