Regex matching specific value after a certain number of tabs

Question

In a tab delimited text file, I would like to match only lines containing the "1" value right after the 24th tab.
Right now, the regex I have seems to match what I want, but breaks when the line doesn't match.
Could you help me improve it?

My regex:

/(?:.+?\t){24}1/

Sample input:

INT E_63    0   0   u   Le  Le  DET:ART DET le  ??  ADJ SENT DET:ART NOM ADV    SENT DET NOM    1   ??  ??  ??  ??  ??  0   0   0   0   0   1   ??  ??  ??  ??  ??  ??  
INT E_63    0   0   u   Le  Le  DET:ART DET le  ??  ADJ SENT DET:ART NOM ADV    SENT DET NOM    1   ??  ??  ??  ??  ??  0   0   0   0   0   0   ??  ??  ??  ??  ??  ??

(The first line should match, the second should not.)

Wiktor Stribiżew · Accepted Answer

Your regex breaks on non-matching lines because of catastrophic backtracking: . also matches a tab character, so each .+? can stretch across several fields. Since more of the pattern follows the group with nested quantifiers, and there is no ^ anchor, the engine has to try a huge number of ways to split the line (from every starting position) before it can give up.

What you need is a negated character class, [^\t], and a pattern anchored at the start of the string:

/^(?:[^\t]*\t){24}1/
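
A small check, shortened to two tabs so it stays readable, suggests why the dot is the culprit. The sample line and the variable names are made up for illustration; only the two pattern shapes come from the question and this answer.

```perl
use strict;
use warnings;

# The question's approach, shortened to 2 tabs so the effect is easy to see.
my $loose = qr/(?:.+?\t){2}1/;
# Anchored form whose character class cannot cross a tab.
my $strict = qr/^(?:[^\t]*\t){2}1/;

# Illustrative line: the "1" sits after the 3rd tab, not the 2nd,
# so a correct pattern should NOT match it.
my $line = "a\tb\tc\t1";

my $loose_hit  = ($line =~ $loose)  ? 1 : 0;
my $strict_hit = ($line =~ $strict) ? 1 : 0;

print "loose=$loose_hit strict=$strict_hit\n";
```

The loose pattern reports a match because .+? is allowed to swallow "b", a tab, and "c" as a single repetition. The same freedom is what gives the engine so many combinations to try on a long line that does not match.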


NOTE: To match the 1 as a whole word, you might consider adding \b after it, or a lookahead (?!\S).

Details:

^ - start of the string
(?:[^\t]*\t){24} - 24 sequences of
- [^\t]* - 0+ chars other than a tab char
- \t - a tab char
1 - a 1 char.
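
As a rough sketch of how the pattern might be used from Perl, the snippet below wraps it in two helpers. The sub names and the synthetic rows (24 placeholder fields followed by a test value) are assumptions made for illustration, not part of the original data.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the field after the 24th tab starts with "1".
sub flag_is_set {
    my ($line) = @_;
    return $line =~ /^(?:[^\t]*\t){24}1/ ? 1 : 0;
}

# Stricter variant using the (?!\S) lookahead from the note above:
# the field must be exactly "1", not "10" or "1x".
sub flag_is_exactly_one {
    my ($line) = @_;
    return $line =~ /^(?:[^\t]*\t){24}1(?!\S)/ ? 1 : 0;
}

# Synthetic rows: 24 placeholder fields, then the value under test.
my $prefix = join("\t", ('x') x 24) . "\t";
for my $value ('1', '0', '10') {
    my $line = $prefix . $value . "\t??";
    printf "%-2s loose=%d strict=%d\n",
        $value, flag_is_set($line), flag_is_exactly_one($line);
}
```

With these rows, the plain pattern accepts both "1" and "10", while the lookahead variant accepts only "1".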

Chankey Pathak · Answer

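Instead of using a regex, you could just split the line on tabs, check the field right after the 24th tab (the 25th column, at index 24), and then use conditionals.

#!/usr/bin/perl
use strict;
use warnings;

open(my $fh, "<", '/path/to/tab_delem_file') or die "Could not open file: $!";

while (my $line = <$fh>) {
    chomp $line;
    my @fields = split /\t/, $line;    # split on tab
    # The value right after the 24th tab is the 25th field, at index 24.
    # String comparison avoids "isn't numeric" warnings on values like "??".
    if (defined $fields[24] && $fields[24] eq '1') {
        # do something
    }
    else {
        # do something else
    }
}

close $fh;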
