Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

In Perl, what is the sane way for converting a string into a list of its characters?

Tags:

string

split

perl

I have been wondering if there's a nicer, but concise way for splitting a string into its characters

@characters = split //, $string

is not that hard to read, but somehow the use of a regular expression looks like overkill to me.

I have come up with this:

@characters = map { substr $string, $_, 1 } 0 .. length($string) - 1

but I find it uglier and less readable. What is your preferred way of splitting that string into its characters?

like image 894
hillu Avatar asked Mar 01 '10 14:03

hillu


People also ask

How do I split a string into a character in Perl?

If you need to split a string into characters, you can do this: @array = split(//); After this statement executes, @array will be an array of characters. split recognizes the empty pattern as a request to make every character into a separate array element.

What does !~ Mean in Perl?

!~ is the negation of the binding operator =~ , like != is the negation of the operator == . The expression $foo !~ /bar/ is equivalent, but more concise, and sometimes more expressive, than the expression !($foo =~ /bar/)

How do I split a string into multiple strings in Perl?

A string is splitted based on delimiter specified by pattern. By default, it whitespace is assumed as delimiter. split syntax is: Split /pattern/, variableName.

How do I find a character in a string in Perl?

Using the Perl index() function The index() function is used to determine the position of a letter or a substring in a string. For example, in the word "frog" the letter "f" is in position 0, the "r" in position 1, the "o" in 2 and the "g" in 3. The substring "ro" is in position 1.


1 Answers

Various examples, and speed comparisons.

I thought it might be a good idea to see how fast some of the ways are to split a string on every character.

I ran the test against several versions of Perl that I happen to have on my computer.

test.pl

use 5.010;
use Benchmark qw(:all) ;
my %bench = (
   'split' => sub{
     state $string = 'x' x 1000;
     my @chars = split //, $string;
     \@chars;
   },
   'split-string' => sub{
     state $string = 'x' x 1000;
     my @chars = split '', $string;
     \@chars;
   },
   'split-capture' => sub{
     state $string = 'x' x 1000;
     my @chars = split /(.)/, $string;
     \@chars;
   },
   'unpack' => sub{
     state $string = 'x' x 1000;
     my @chars = unpack( '(a)*', $string );
     \@chars;
   },
   'match' => sub{
     state $string = 'x' x 1000;
     my @chars = $string =~ /./gs;
     \@chars;
   },
   'match-capture' => sub{
     state $string = 'x' x 1000;
     my @chars = $string =~ /(.)/gs;
     \@chars;
   },
   'map-substr' => sub{
     state $string = 'x' x 1000;
     my @chars = map { substr $string, $_, 1 } 0 .. length($string) - 1;
     \@chars;
   },
);
# set the initial state of $string
$_->() for values %bench;
cmpthese( -10, \%bench );
for perl in /usr/bin/perl /opt/perl-5.10.1/bin/perl /opt/perl-5.11.2/bin/perl;
do
  $perl -v | perl -nlE'if( /(v5\.\d+\.\d+)/ ){
    say "## Perl $1";
    say "<pre>";
    last;
  }';
  $perl test.pl;
  echo -e '</pre>\n';
done

Perl v5.10.0

               Rate split-capture match-capture map-substr match unpack split split-string
split-capture 296/s            --          -20%       -20%  -23%   -58%  -63%         -63%
match-capture 368/s           24%            --        -0%   -4%   -48%  -54%         -54%
map-substr    370/s           25%            0%         --   -3%   -48%  -53%         -54%
match         382/s           29%            4%         3%    --   -46%  -52%         -52%
unpack        709/s          140%           93%        92%   86%     --  -11%         -11%
split         793/s          168%          115%       114%  107%    12%    --          -0%
split-string  795/s          169%          116%       115%  108%    12%    0%           --

Perl v5.10.1

               Rate split-capture map-substr match-capture match unpack split split-string
split-capture 301/s            --       -31%          -41%  -47%   -60%  -65%         -66%
map-substr    435/s           45%         --          -14%  -23%   -42%  -50%         -50%
match-capture 506/s           68%        16%            --  -10%   -32%  -42%         -42%
match         565/s           88%        30%           12%    --   -24%  -35%         -35%
unpack        743/s          147%        71%           47%   32%     --  -15%         -15%
split         869/s          189%       100%           72%   54%    17%    --          -1%
split-string  875/s          191%       101%           73%   55%    18%    1%           --

Perl v5.11.2

               Rate split-capture match-capture match map-substr unpack split-string split
split-capture 300/s            --          -28%  -32%       -38%   -59%         -63%  -63%
match-capture 420/s           40%            --   -5%       -13%   -42%         -48%  -49%
match         441/s           47%            5%    --        -9%   -39%         -46%  -46%
map-substr    482/s           60%           15%    9%         --   -34%         -41%  -41%
unpack        727/s          142%           73%   65%        51%     --         -10%  -11%
split-string  811/s          170%           93%   84%        68%    12%           --   -1%
split         816/s          171%           94%   85%        69%    12%           1%    --

As you can see split is the quickest, owing to the fact that this is a special case in the code for split.

split-capture is the slowest, probably because it has to set $1, along with several other match variables.

So I would recommend going with plain old split //, ..., or the roughly equivalent split '', ....

like image 113
Brad Gilbert Avatar answered Nov 15 '22 07:11

Brad Gilbert