When parsing text, I frequently need to implement mini-state-machines, in the generic form following the code below.
Is there a CPAN module that's considered "best practice" and well suited to implement state machine logic like this in an easy and elegant way?
I would prefer solutions less complicated than Parse::RecDescent
but if none exist and Parse::RecDescent
is a lot easier to apply to this problem than I thought, I'm very willing to consider it instead of rolling my own like I've been so far.
Example generic parsing code:
my $state = 1;
while (my $token = get_next_token()) { # Usually next line
if ($state == 1) {
do_state1_processing();
if (token_matches_transition_1_to_2($token)) {
do_state_1_to_2_transition_processing();
$state == 2;
next;
} elsif (token_matches_transition_1_to_4($token)) {
do_state_1_to_4_transition_processing();
$state == 4;
next;
} else {
do_state1_continuation();
next;
}
} elsif ($state == 5) {
do_state5_processing();
if (token_matches_transition_5_to_6($token)) {
do_state_5_to_6_transition_processing();
$state == 6;
next;
} elsif (token_matches_transition_5_to_4($token)) {
do_state_5_to_4_transition_processing();
$state == 4;
next;
} else {
do_state5_continuation();
next;
}
} else {
}
}
I would recommend taking a look at Marpa and Marpa::XS.
Just look at this simple calculator.
my $grammar = Marpa::XS::Grammar->new(
{ start => 'Expression',
actions => 'My_Actions',
default_action => 'first_arg',
rules => [
{ lhs => 'Expression', rhs => [qw'Term'] },
{ lhs => 'Term', rhs => [qw'Factor'] },
{ lhs => 'Factor', rhs => [qw'Number'] },
{ lhs => 'Term', rhs => [qw'Term Add Term'], action => 'do_add' },
{ lhs => 'Factor',
rhs => [qw'Factor Multiply Factor'],
action => 'do_multiply'
},
],
}
);
You will have to implement the tokenizer yourself.
You can use Class::StateMachine:
package Foo;
use parent 'Class::StateMachine';
sub new {
my $class = shift;
Class::StateMachine::bless {}, $class, 'state_1';
}
sub do_state_processing :OnState('state_1') {
my $self = shift;
if (...) { $self->event_1 }
elsif (...) { $self->event_2 }
...
}
sub do_state_processing :OnState('state_2') {
...
}
sub event_1 :OnState('state_1') {
my $self = shift;
$self->state('state_2');
}
sub event_2 :OnState('state_2') {
my $self = shift;
$self->state('state_3');
}
sub enter_state :OnState('state_1') {
print "entering state 1";
...
}
sub enter_state :OnState('state_2') {
...
}
package main;
my $sm = Foo->new;
...
while (my $token = get_next_token()) {
$sm->do_state_processing;
}
Though, a module specific to text processing will probably be more appropriate for your particular case
(With help) I wrote something a few years back called the Perl Formal Language Toolkit so this might serve as some sort of basis, however what I think you really want is a tool like the Ragel Finite State Machine Compiler. Unfortunately, it doesn't output to Perl, and it's a back- burner desire of mine to implement a Perl target for Ragel and to also provide similar (but more Perl oriented) features for my bit-rotting module.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With