
Extract data between square brackets "[]" using Perl

I have been using a regex to extract data from parentheses, for example a,b from (a,b), as shown below. Every line of my file looks like this:

this is the range of values (a1,b1) and [b1|a1]
this is the range of values (a2,b2) and [b2|a2]
this is the range of values (a3,b3) and [b3|a3]

I'm using the following regex to extract a1,b1, a2,b2, etc.:

@numbers = $_ =~ /\((.*),(.*)\)/;

However, how can I extract the data from the square brackets [] instead? For example:

this is the range of values (a1,b1) and [b1|a1]
this is the range of values (a1,b1) and [b2|a2]

I need to match only the data in the square brackets, not the data in the parentheses.

Naidu asked Nov 27 '22 09:11


2 Answers

[Update] In the meantime, I've written a blog post about the specific issue with .* that I describe below: "Why Using .* in Regular Expressions Is Almost Never What You Actually Want"


If your identifiers a1, b1 etc. never contain commas or square brackets themselves, you should use a pattern along the lines of the following to avoid backtracking hell:

/\[([^,\]]+),([^,\]]+)\]/

Here's a working example on Regex101.
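Note that the sample lines in the question separate the two values inside the square brackets with "|", while the pattern above expects a comma. As a minimal sketch, assuming the fields never contain a comma, a pipe, or a closing bracket, the same negated-character-class idea can accept either separator. The helper name extract_square_pair is hypothetical and only used for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the two fields found inside the first pair
# of square brackets, or an empty list if the line does not match.
# Assumption: fields never contain ',', '|', or ']'.
sub extract_square_pair {
    my ($line) = @_;
    # In list context a successful match returns the captured groups.
    return $line =~ /\[([^,|\]]+)[,|]([^,|\]]+)\]/;
}

my @lines = (
    'this is the range of values (a1,b1) and [b1|a1]',
    'this is the range of values (a2,b2) and [b2,a2]',
);

for my $line (@lines) {
    my @pair = extract_square_pair($line);
    print "@pair\n" if @pair;
}
```

Because the pattern starts with a literal "[", the parenthesized (a1,b1) part of each line is never captured.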

The issue with greedy quantifiers like .* is that they will very likely consume too much at first, so the regex engine has to backtrack extensively. Even with non-greedy quantifiers, the engine makes more match attempts than necessary, because it consumes only one character at a time and then tries to advance in the pattern.
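Over-consumption is not only a performance concern. Once a line contains more than one bracketed pair, the greedy version also captures the wrong text. The following sketch compares the three styles on a made-up input line; the helper first_pair and the sample string are assumptions for illustration, not part of the question:

```perl
use strict;
use warnings;

# Hypothetical input with two bracketed pairs on one line.
my $line = 'first [a1,b1] and second [a2,b2]';

# Returns the captured groups of the first match, joined for display.
sub first_pair {
    my ($re, $text) = @_;
    my @fields = $text =~ $re;
    return join ' | ', @fields;
}

my @candidates = (
    [ 'greedy'        => qr/\[(.*),(.*)\]/ ],
    [ 'non-greedy'    => qr/\[(.*?),(.*?)\]/ ],
    [ 'negated class' => qr/\[([^,\]]+),([^,\]]+)\]/ ],
);

for my $candidate (@candidates) {
    my ($name, $re) = @$candidate;
    printf "%-14s %s\n", $name, first_pair($re, $line);
}
```

On this input the greedy pattern runs to the last comma and the last closing bracket, so its first capture spans both pairs, while the non-greedy and negated-class patterns both stop at the first pair.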

(You could even use atomic groups to make the matching faster still.)
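As a rough sketch of what that could look like in Perl: an atomic group (?>...) throws away its backtracking positions once it has matched, and the possessive quantifier ++ (available since Perl 5.10) is shorthand for the same behavior. The variable names and the helper pair_with are hypothetical. Any speed benefit depends on the input and is not measured here:

```perl
use strict;
use warnings;

# Atomic groups: once [^,\]]+ has matched, the engine will not give
# characters back to try shorter alternatives.
my $atomic = qr/\[((?>[^,\]]+)),((?>[^,\]]+))\]/;

# Possessive quantifiers express the same thing more compactly.
my $possessive = qr/\[([^,\]]++),([^,\]]++)\]/;

# Hypothetical helper: returns the captures, or an empty list on failure.
sub pair_with {
    my ($re, $text) = @_;
    return $text =~ $re;
}

for my $text ('[a1,b1]', '[no comma here]') {
    for my $re ($atomic, $possessive) {
        my @pair = pair_with($re, $text);
        print(@pair ? "matched: @pair\n" : "no match: $text\n");
    }
}
```

Both variants match the same strings as the plain negated-class pattern. The difference only shows up in how quickly a non-matching line is rejected.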

Marius Schulz answered Dec 06 '22 15:12


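#!/usr/bin/perl
use strict;
use warnings;

my @numbers;
# Test the line itself in the loop condition and chomp inside the body;
# chomp's return value is 0 for a final line without a trailing newline,
# which would end the loop before that line is processed.
while (my $line = <DATA>) {
    chomp $line;
    if ($line =~ m|\[(.*),(.*)\]|) {
        push @numbers, ($1, $2);
    }
}
# Interpolating the array separates the values with spaces.
print "@numbers\n";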
__DATA__
this is the range of values [a1,b1]
this is the range of values [a2,b2]
this is the range of values [a3,b3]

Demo

Chankey Pathak answered Dec 06 '22 15:12