
Space in regex character class producing weird results

Tags:

regex

So I was working on a regex and came across some weird behavior.

I had a character class in the regex that included a bunch of alphanumeric characters and ended with a space, a dash, and a plus. The weird behavior is reproducible using the following regex.

^[ -+]*$

So what happens is that a space is valid text input and so is the plus. However, for some reason the dash is not valid text input. The regex can be fixed by rearranging the characters in the class like so:

^[ +-]*$

Now all the characters are valid input. This has been reproduced in Chrome using jsFiddle and also using Expresso.

My question is basically, am I doing something wrong or is this just weird? :)

gislikonrad asked Jan 11 '23 09:01


1 Answer

The - character has special meaning inside character classes. When it appears between two characters, it creates a range, e.g. [0-9] matches any character between 0 and 9, inclusive. However, when placed at the start or the end of the character class (or when escaped) it represents a literal - character.

  • [ -+] will match any character between a space (char code 32) and a + (char code 43), inclusive.
  • [ +-] will match a space (char code 32), a + (char code 43), or a - (char code 45).
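
The difference between the two classes can be checked with a short script. This is only a sketch in Python, which is an assumption on my part (the question used JavaScript in jsFiddle and Expresso); Python's re module reads a - between two characters in a class as a range in the same way for these patterns. The variable names are made up for illustration, and fullmatch is used so that a trailing newline cannot slip past the $ anchor.

```python
import re

# Pattern from the question: " -+" is read as a range from space (32) to "+" (43).
range_class = re.compile(r"^[ -+]*$")
# Reordered pattern: "-" comes last, so it is a literal dash.
literal_class = re.compile(r"^[ +-]*$")
# Escaping the dash also makes it literal, wherever it sits in the class.
escaped_class = re.compile(r"^[ \-+]*$")

# Collect every ASCII character the range version accepts on its own.
# These turn out to be the code points 32 through 43, which do not include "-" (45).
accepted = "".join(chr(c) for c in range(128) if range_class.fullmatch(chr(c)))
print(repr(accepted))

# Compare the three patterns on a few one-character inputs.
for text in [" ", "+", "-", "#"]:
    print(
        repr(text),
        bool(range_class.fullmatch(text)),
        bool(literal_class.fullmatch(text)),
        bool(escaped_class.fullmatch(text)),
    )
```

Note that the range version also lets through characters such as # and %, which is probably not what was intended, so moving the dash to the end (or escaping it) fixes more than just the missing dash.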
p.s.w.g answered Jan 28 '23 05:01