
Tools for automatically simplifying regexes

I'm trying to squash warnings in an open source project, and

/[\.\,\;\:\(\)\[\]\{\}\<\>\"\'\`\~\/\|\?\!\&\@\#\s\x00-\x1f\x7f]+/

is giving me

(irb):1: warning: character class has duplicated range

Are there any tools that automatically point out which parts of the regexp cause the overlap?

Andrew Grimm asked Mar 26 '13 09:03



2 Answers

I don't know of any tool, but I've spotted the overlap: \s matches \t, \f, \n and \r (as well as the space character), and those four control characters all fall inside the \x00-\x1f range.
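For what it's worth, the overlap can also be found by brute force. The following is only a rough sketch, not an existing tool: it assumes the character class has already been split into its elements by hand, and the names `class_members`, `find_overlaps` and `ELEMENTS` are made up for illustration. It compiles each element as its own one-element class, records which ASCII code points it matches, and reports every pair of elements that share at least one code point.

```ruby
# Hypothetical helper: ASCII code points (0..0x7f) matched by one class element.
def class_members(element)
  re = Regexp.new("[#{element}]")
  (0..0x7f).select { |cp| re.match?(cp.chr) }
end

# Returns [element_a, element_b, shared_code_points] for every overlapping pair.
def find_overlaps(elements)
  members = {}
  elements.each { |e| members[e] = class_members(e) }
  elements.combination(2)
          .map { |a, b| [a, b, members[a] & members[b]] }
          .reject { |_, _, shared| shared.empty? }
end

# The class from the question, split into elements by hand (an assumption:
# nothing here parses the regex for you).
ELEMENTS = [
  '\.', '\,', '\;', '\:', '\(', '\)', '\[', '\]', '\{', '\}', '\<', '\>',
  '\"', "\\'", '\`', '\~', '\/', '\|', '\?', '\!', '\&', '\@', '\#',
  '\s', '\x00-\x1f', '\x7f'
].freeze

find_overlaps(ELEMENTS).each do |a, b, shared|
  hex = shared.map { |cp| format('0x%02x', cp) }.join(' ')
  puts "#{a} overlaps #{b} at: #{hex}"
end
```

Run against the class above, this should report only the \s and \x00-\x1f pair. It checks ASCII only, so it would miss overlaps between elements that match solely non-ASCII characters.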

So, unless there's a way to get Ruby itself to tell you where it found the "problem", you have to fix it by hand. Replacing \s with a plain space (and removing all those unnecessary backslashes along the way), you can write this regex as:

/[.,;:()\[\]{}<>"'`~\/|?!&@# \x00-\x1f\x7f]+/
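If you want some reassurance that the rewrite did not change behaviour, a quick check along these lines compares the two classes character by character. It is a sketch under assumptions: it only covers the ASCII range, the helper names `compile_quietly` and `differing_codepoints` are invented for this example, and `$VERBOSE` is temporarily set to `nil` so that compiling the original does not print the warning again.

```ruby
# Sources live in single-quoted heredocs so backslashes reach Regexp.new untouched.
ORIGINAL_SRC = <<'RE'.chomp
[\.\,\;\:\(\)\[\]\{\}\<\>\"\'\`\~\/\|\?\!\&\@\#\s\x00-\x1f\x7f]+
RE

SIMPLIFIED_SRC = <<'RE'.chomp
[.,;:()\[\]{}<>"'`~\/|?!&@# \x00-\x1f\x7f]+
RE

# Compile a regex with warnings silenced, restoring $VERBOSE afterwards.
def compile_quietly(source)
  saved = $VERBOSE
  $VERBOSE = nil
  Regexp.new(source)
ensure
  $VERBOSE = saved
end

# ASCII code points on which the two regexes disagree.
def differing_codepoints(source_a, source_b)
  a = compile_quietly(source_a)
  b = compile_quietly(source_b)
  (0..0x7f).reject { |cp| a.match?(cp.chr) == b.match?(cp.chr) }
end

diff = differing_codepoints(ORIGINAL_SRC, SIMPLIFIED_SRC)
puts diff.empty? ? 'Both regexes agree on 0x00..0x7f' : "Mismatch at: #{diff.inspect}"
```

An empty result only says the two versions agree on single ASCII characters, which is enough here because both regexes are just a repeated character class.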
Tim Pietzcker answered Sep 29 '22 11:09



If you ever reach that point of desperation, I guess you could add some debug output to the Ruby source and rebuild. :) I believe this is the place where the warning is raised:

https://github.com/ruby/ruby/blob/trunk/regparse.c#L1787

Mladen Jablanović answered Sep 29 '22 12:09
