
Regex character repeats n or more times in line with grep

Tags: regex, grep, repeat

I need a regular expression for grep that finds lines in which a character repeats 4 or more times.

I know the repetition syntax is {n,}, so to find lines where, for example, the character "g" repeats 4 or more times, the command according to the grep man page should in theory be:

grep "g{4,}" textsamplefile

But it doesn't work. Any help?

Other letters may appear between the occurrences of the character. For example, these are valid matches:

gexamplegofgvalidgmatchg

gothergvalidgmatchgisghereg

ggggother

Goncatin asked Dec 21 '17 08:12



1 Answer

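You should change your grep command to use extended regular expressions. Without -E, grep uses basic regular expressions, in which an unescaped { is a literal character, so "g{4,}" looks for the literal text g{4,}:

grep -E 'g{4,}' input_file # prints only the lines containing a run of 4 or more consecutive g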
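
As a quick check of the difference between the two syntaxes, here is a minimal sketch. The sample file and its contents are made up for the demonstration (the lines are borrowed from the question), and the variable names are only illustrative.

```shell
# Hypothetical sample file, created only for this demonstration.
sample=$(mktemp)
printf '%s\n' 'ggggother' 'gexamplegofgvalidgmatchg' 'no match here' > "$sample"

# Basic regex (grep's default): the braces must be escaped to mean "repeat".
bre_result=$(grep 'g\{4,\}' "$sample")

# Extended regex (-E): the braces work unescaped.
ere_result=$(grep -E 'g{4,}' "$sample")

echo "$bre_result"
echo "$ere_result"
rm -f "$sample"
```

Both commands should select the same line, the one containing four g in a row.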

If you want all the lines that contain a run of 4 or more identical consecutive characters, the regex becomes:

grep -E '(.)\1{3,}' input_file
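
The pattern above captures one character and then requires at least 3 more copies of it immediately afterwards, which gives 4 or more in a row. A small sketch on made-up data follows; note that back-references with -E are supported by GNU grep but are not guaranteed by POSIX for extended regular expressions, so this assumes an implementation that accepts them.

```shell
# Hypothetical sample file, created only for this demonstration.
sample=$(mktemp)
printf '%s\n' 'ggggother' 'aaaab' 'gexamplegofgvalidgmatchg' 'abcabc' > "$sample"

# (.) captures any character; \1{3,} requires 3 or more further copies right after it.
runs=$(grep -E '(.)\1{3,}' "$sample")

echo "$runs"
rm -f "$sample"
```

Only the lines with four identical characters in a row (here the g run and the a run) should be printed.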

If you do not need consecutive runs, but only lines where g appears 4 or more times anywhere:

grep -E '([^g]*g){4}' input_file
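
This is the variant that fits the examples in the question: each repetition of the group skips any non-g characters and then consumes one g, so four repetitions require four g somewhere on the line. A minimal sketch, again on a made-up sample file:

```shell
# Hypothetical sample file; the first three lines come from the question.
sample=$(mktemp)
printf '%s\n' 'gexamplegofgvalidgmatchg' 'gothergvalidgmatchgisghereg' 'ggggother' \
  'going' 'no match here' > "$sample"

# -c prints the number of matching lines instead of the lines themselves.
count=$(grep -cE '([^g]*g){4}' "$sample")

echo "$count"
rm -f "$sample"
```

The three lines from the question each contain at least four g, while "going" has only two, so the count should be 3.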

You can generalize this to any character appearing 4 or more times on a line by using:

grep -E '(.)(.*\1){3}' input_file
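
Here the captured character must reappear three more times, with anything in between. The sketch below tries it on made-up words; as with any back-reference under -E, it assumes a grep (such as GNU grep) that supports them.

```shell
# Hypothetical sample file, created only for this demonstration.
sample=$(mktemp)
printf '%s\n' 'banana' 'mississippi' 'gexamplegofgvalidgmatchg' 'abcabc' > "$sample"

# (.) captures a character; (.*\1){3} requires it to occur 3 more times later on the line.
any_char_count=$(grep -cE '(.)(.*\1){3}' "$sample")

echo "$any_char_count"
rm -f "$sample"
```

"mississippi" (four s, four i) and the g example qualify; "banana" has at most three of any letter, so it does not.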
Allan answered Sep 21 '22 11:09
