Parsing a string character by character in Perl

Tags:

perl

I want to parse a string character by character in Perl. Is there a way to start at the first character of the string and loop over it one character at a time? Right now I split the string into an array and loop through the array.

my $var = "junk shit here. fkuc lkasjdfie.";
my @chars = split //, $var;   # one array element per character

But instead of splitting the whole string up front, is there some kind of pointer or iterator that starts at the first character and then traverses the string one character at a time?

Ashwin asked Apr 27 '14 06:04



2 Answers

my $var = "junk sit here. fkuc lkasjdfie.";

while ($var =~ /(.)/sg) {
   my $char = $1;
   # do something with $char 
}

or

for my $i (1 .. length $var) {
  my $char = substr($var, $i-1, 1);
}
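
As a rough check, not a definitive test, the sketch below confirms that both loops visit the same characters that split // would produce. The sample string is an assumption chosen to include a newline, which shows why the /s modifier matters in the regex version: without it, . does not match "\n" and that character is silently skipped.

```perl
use strict;
use warnings;

# Sample input is an assumption; it contains a newline on purpose.
my $var = "ab\ncd";

# Reference: split everything up front, as in the question.
my @by_split = split //, $var;

# Regex walk: /g resumes from pos($var) on each iteration; /s lets . match "\n".
my @by_regex;
while ($var =~ /(.)/sg) {
    push @by_regex, $1;
}

# Index walk with substr.
my @by_substr;
for my $i (0 .. length($var) - 1) {
    push @by_substr, substr($var, $i, 1);
}

# Same regex walk without /s: the newline is never captured.
my @without_s;
while ($var =~ /(.)/g) {
    push @without_s, $1;
}

printf "split: %d, regex: %d, substr: %d, regex without /s: %d\n",
    scalar @by_split, scalar @by_regex, scalar @by_substr, scalar @without_s;
# prints "split: 5, regex: 5, substr: 5, regex without /s: 4"
```

Neither loop builds the full array unless you choose to collect the characters, as this check does for comparison only.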

When benchmarked, the substr loop runs roughly twice as fast as the regex while loop:

use Benchmark qw( cmpthese ) ;
my $var = "junk sit here. fkuc lkasjdfie." x1000;

cmpthese( -5, {
    "while" => sub{
      while ($var =~ /(.)/sg) {
         my $char = $1;
         # do something with $char 
      }
    },
    "substr" => sub{
      for my $i (1 .. length $var) {
        my $char = substr($var, $i-1, 1);
      }
    },
});

Result:

         Rate  while substr
while  56.3/s     --   -53%
substr  121/s   114%     --
mpapec answered Oct 04 '22 01:10


This can be the skeleton of the script/regex:

use strict;
use warnings;
use Data::Dumper qw(Dumper);

my $str = "The story of Dr. W. Fletcher who is a dentist. The hero of the community.";

my @sentences = split /(?<!(Dr| \w))\./, $str;
print Dumper \@sentences;

And the output is:

$VAR1 = [
      'The story of Dr. W. Fletcher who is a dentist',
      undef,
      ' The hero of the community'
    ];
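
The undef element appears because split also returns whatever its pattern's capturing groups matched, and the group inside the negative lookbehind never captures anything. A possible variation, offered as a sketch and not as the answer's own code, drops the parentheses so that only the fields come back. Both alternatives in the lookbehind are two characters long, so it stays fixed-width.

```perl
use strict;
use warnings;

my $str = "The story of Dr. W. Fletcher who is a dentist. The hero of the community.";

# No capturing group inside the lookbehind, so split returns only the fields.
my @sentences = split /(?<!Dr| \w)\./, $str;

print "[$_]\n" for @sentences;
# prints:
# [The story of Dr. W. Fletcher who is a dentist]
# [ The hero of the community]
```

The trailing empty field after the final period is dropped by split's default behaviour, and the leading space on the second sentence is kept, as in the output above.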
szabgab answered Oct 04 '22 01:10