Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Parsing of a logical expression and converting it to a tree in Perl

I have complex logical expression that look like this:

((((!((cond1) || (cond2) || (cond3)) && (cond4))) && (cond5)) <= (((cond6) || (cond7) || (cond8)) || (cond9)))

Each line has several dozens of expressions. The allowed logical signs are ||, &&, ! and <=. <= means leads, as in a <= bmeans that b leads to a.

I need to go over those statements and check the conditions, since some of them are no longer valid. I want to be able to parse it to a tree, then check each of it leafs (where each leaf is a condition), delete the unwanted leafs and build the full and correct expressions back.

I know that each node of the tree is defined by a pair of the first bracket and the bracket that closes it, but I don't know how to identify such pairs and how to identify the logical sign between them.

All signs except for ! come between two expressions.

like image 619
SIMEL Avatar asked Dec 23 '10 10:12

SIMEL


1 Answers

Sounds like a case for Parse::RecDescent:

use strict;
use warnings;
use Parse::RecDescent;

my $text = '((((!((cond1) || (cond2) || (cond3)) && (cond4))) && (cond5)) <= (((cond6) || (cond7) || (cond8)) || (cond9)))';

#$::RD_TRACE=1;

my $grammar = q{

startrule: expr

expr: operand operation(s?)
    { $return = @{$item[2]} ? { 'operations' => $item[2], 'lvalue' => $item[1] } : $item[1] }

operation: /\|\||&&|<=/ operand
    { $return = { 'op' => $item[1], 'rvalue' => $item[2] } }

operand: '(' expr ')'
    { $return = $item[2] }

operand: '!' operand
    { $return = { 'op' => '!', 'value' => $item[2] } }

operand: /\w+/

};

my $parser = Parse::RecDescent->new($grammar);
my $result = $parser->startrule($text) or die "Couldn't parse!\n";

use Data::Dumper;
$Data::Dumper::Indent = 1;
$Data::Dumper::Sortkeys = 1;
print Dumper $result;

The grammar, in English:

The whole thing is an expression. An expression is an operand followed by zero or more binary operators and their operands. Each operand is either a parenthesized expression, '!' followed by an operand, or a word (e.g. cond1).

Every node in the produced tree is in one of the following forms:

  • cond1 - a condition
  • { 'op' => '!', 'value' => 'node' } - ! applied to another node
  • { 'lvalue' => 'node', 'operations' => [ one or more of: { 'op' => 'binop', 'rvalue' => 'node' } ] } - series of one or more operations representing node binop node binop node ...

I didn't break series of binary operations (e.g. ((cond1) || (cond2) || (cond3))) into a binary tree because you provided no information about precedence or associativity.

Output for your example is:

$VAR1 = {
  'lvalue' => {
    'lvalue' => {
      'lvalue' => {
        'op' => '!',
        'value' => {
          'lvalue' => 'cond1',
          'operations' => [
            {
              'op' => '||',
              'rvalue' => 'cond2'
            },
            {
              'op' => '||',
              'rvalue' => 'cond3'
            }
          ]
        }
      },
      'operations' => [
        {
          'op' => '&&',
          'rvalue' => 'cond4'
        }
      ]
    },
    'operations' => [
      {
        'op' => '&&',
        'rvalue' => 'cond5'
      }
    ]
  },
  'operations' => [
    {
      'op' => '<=',
      'rvalue' => {
        'lvalue' => {
          'lvalue' => 'cond6',
          'operations' => [
            {
              'op' => '||',
              'rvalue' => 'cond7'
            },
            {
              'op' => '||',
              'rvalue' => 'cond8'
            }
          ]
        },
        'operations' => [
          {
            'op' => '||',
            'rvalue' => 'cond9'
          }
        ]
      }
    }
  ]
};
like image 146
ysth Avatar answered Sep 19 '22 15:09

ysth