I'm not a Ruby programmer, but as I was reading through the extensive Ruby on Rails security guide, I noticed this section:
A common pitfall in Ruby’s regular expressions is to match the string’s beginning and end by ^ and $, instead of \A and \z.
Does anyone know if this is just a matter of aesthetics or something else? I ask because I've only used languages that use ^ and $.
A regular expression is a sequence of characters that defines a search pattern, mainly for use in pattern matching with strings. Ruby regular expressions (Ruby regex for short) help us find particular patterns inside a string. Two common uses of Ruby regex are validation and parsing.
=~ is Ruby's pattern-matching operator. It matches a regular expression on the left against a string on the right. If a match is found, the index of the first match in the string is returned. If no match is found, nil is returned.
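A minimal sketch of the behaviour described above. The helper name `first_match_index` is made up for illustration and is not part of Ruby.

```ruby
# Hypothetical helper: returns the index of the first match, or nil.
def first_match_index(pattern, text)
  pattern =~ text
end

p first_match_index(/world/, "hello world")  # 6
p first_match_index(/xyz/, "hello world")    # nil
```

Because nil is falsy and any index (including 0) is truthy in Ruby, the result can be used directly in a condition.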
Matches a whitespace character: /[ \t\r\n\f]/. Matches non-whitespace: /[^ \t\r\n\f]/. Matches a single word character: /[A-Za-z0-9_]/. Matches a non-word character: /[^A-Za-z0-9_]/.
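As a rough illustration, the four explicit character classes listed above can be tried out with =~. The constant names here are assumptions chosen for readability.

```ruby
# The explicit character classes from the text, given hypothetical names.
WHITESPACE     = /[ \t\r\n\f]/
NON_WHITESPACE = /[^ \t\r\n\f]/
WORD_CHAR      = /[A-Za-z0-9_]/
NON_WORD_CHAR  = /[^A-Za-z0-9_]/

p "foo bar" =~ WHITESPACE      # 3 (the space)
p "  ab"    =~ NON_WHITESPACE  # 2 (the "a")
p "!!x_1"   =~ WORD_CHAR       # 2 (the "x")
p "abc-def" =~ NON_WORD_CHAR   # 3 (the "-")
```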
\z anchors at the end of the string, \Z anchors at the end of the string or before the last newline, if the string ends with a newline. So, if the string ends with a newline, \Z anchors before that last newline and \z anchors after. – Jörg W Mittag.
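The difference is easiest to see with a string that ends in a newline. This is a small sketch; the method names are hypothetical.

```ruby
# \Z tolerates one trailing newline; \z does not.
def matches_with_upper_z?(text)
  /hello\Z/.match?(text)
end

def matches_with_lower_z?(text)
  /hello\z/.match?(text)
end

p matches_with_upper_z?("hello\n")  # true:  \Z anchors before the final newline
p matches_with_lower_z?("hello\n")  # false: \z anchors after it, at the very end
p matches_with_lower_z?("hello")    # true
```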
This isn't specific to Ruby; \A and \Z are not the same thing as ^ and $. ^ and $ are the start and end of line anchors, whereas \A and \Z are the start and end of string anchors.
Ruby differs from other languages in that it automatically uses "multiline mode" (which enables the aforementioned behaviour of having ^ and $ match per line) for regular expressions, but in most other flavours you need to enable it yourself, which is probably why that article contains the warning.
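To make the risk concrete, here is a sketch of how a validation written with ^ and $ can be bypassed by a multi-line string. The patterns, method names, and input are hypothetical examples, not code from the Rails guide.

```ruby
# Hypothetical URL validators, for illustration only.
LINE_ANCHORED   = /^https?:\/\/\S+$/     # ^ and $ anchor at line boundaries in Ruby
STRING_ANCHORED = /\Ahttps?:\/\/\S+\z/   # \A and \z anchor at string boundaries

def valid_with_line_anchors?(input)
  LINE_ANCHORED.match?(input)
end

def valid_with_string_anchors?(input)
  STRING_ANCHORED.match?(input)
end

# A two-line input whose second line looks like a legitimate URL.
SNEAKY_INPUT = "javascript:alert(1)\nhttp://example.com"

p valid_with_line_anchors?(SNEAKY_INPUT)            # true:  the second line alone satisfies the pattern
p valid_with_string_anchors?(SNEAKY_INPUT)          # false: the whole string must match
p valid_with_string_anchors?("http://example.com")  # true
```

So the choice of anchors is not just aesthetic: with ^ and $, anything can ride along on another line of the input.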
Reference: http://www.regular-expressions.info/anchors.html