Ripgrep Missing Character Class + Repetitions

Tags: regex, ripgrep

Why do these match:

echo 'CCAGCTACTCGGGAGGCTGAGGCTGGAGGATCGCTTGAGTCCAGGAGTTC' | grep -E 'CCAGCTACTCGGGAGGCTGAGGCTGGAGGATCGCTTGAGTCCAGGAG[ATCG]{2}C'
echo 'CCAGCTACTCGGGAGGCTGAGGCTGGAGGATCGCTTGAGTCCAGGAGTTC' | rg 'CCAGCTACTCGGGAGGCTGAGGCTGGAGGATCGCTTGAGTCCAGGAG[ATCG]{1,2}C'
echo 'CCAGCTACTCGGGAGGCTGAGGCTGGAGGATCGCTTGAGTCCAGGAGTTC' | rg 'CCAGCTACTCGGGAGGCTGAGGCTGGAGGATCGCTTGAGTCCAGGAG[ATCG]{2,}C'
echo 'CCAGCTACTCGGGAGGCTGAGGCTGGAGGATCGCTTGAGTCCAGGAGTTC' | awk '$0 ~ /CCAGCTACTCGGGAGGCTGAGGCTGGAGGATCGCTTGAGTCCAGGAG[ATCG]{2}C/'

But this does not:

echo 'CCAGCTACTCGGGAGGCTGAGGCTGGAGGATCGCTTGAGTCCAGGAGTTC' | rg 'CCAGCTACTCGGGAGGCTGAGGCTGGAGGATCGCTTGAGTCCAGGAG[ATCG]{2}C'

I was under the impression that ripgrep uses Rust's regex engine, which should be able to handle a character class followed by a repetition. Why does the {2} form fail under rg when {1,2} and {2,} succeed?

Stats4224 asked Jul 05 '19 16:07


1 Answer

This is due to a bug in ripgrep (issue 1319), which was fixed in version 12.0.0.
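To check whether a given machine is affected, something like the following rough sketch may help. It compares the installed ripgrep version against 12.0.0 and re-runs the pattern from the question. The helper name version_ge is made up for illustration, and the comparison assumes GNU sort with the -V option and that the first line of rg --version has the form "ripgrep X.Y.Z".

```shell
#!/bin/sh
# Hypothetical helper (not part of ripgrep): succeeds if version $1 >= version $2.
# Assumes GNU coreutils `sort -V` (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Input and pattern taken from the question.
seq='CCAGCTACTCGGGAGGCTGAGGCTGGAGGATCGCTTGAGTCCAGGAGTTC'
pat='CCAGCTACTCGGGAGGCTGAGGCTGGAGGATCGCTTGAGTCCAGGAG[ATCG]{2}C'

# Baseline: grep -E matches this pattern, as reported in the question.
if printf '%s\n' "$seq" | grep -Eq "$pat"; then
    grep_result=match
else
    grep_result=no-match
fi
echo "grep -E: $grep_result"

if command -v rg >/dev/null 2>&1; then
    # Assumption: the first line of `rg --version` looks like "ripgrep X.Y.Z ...".
    rg_version=$(rg --version | head -n 1 | awk '{print $2}')
    if version_ge "$rg_version" 12.0.0; then
        echo "ripgrep $rg_version: at or above 12.0.0, should include the fix"
    else
        echo "ripgrep $rg_version: older than 12.0.0, consider upgrading"
    fi
    # Re-run the failing command from the question.
    if printf '%s\n' "$seq" | rg -q "$pat"; then
        echo "rg: match"
    else
        echo "rg: no match"
    fi
else
    echo "rg not found on PATH"
fi
```

On an affected version, the {1,2} form from the question does match, but it is not equivalent in general because it also accepts a single base before the final C, so upgrading is probably the safer fix.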

Ian answered Oct 11 '22 23:10