I have a rather basic question about regexes.
I use the expression `.*` without thinking about it, expecting it to match e.g. up to the end of the line. This works.
But for some reason I started thinking about this expression. Checking Wikipedia (my emphasis):
. Matches any single character
* Matches the **preceding** element zero or more times
So, according to this definition, why doesn't `.*` try to match the first character in the string zero or more times, instead of applying the match to each character in the string?
I mean, if I have `abc`, it should try to match `a`, `aa`, `aaa`, etc., right?
But it does not:
~
$ perl -e '
> my $var="abcdefg";
> $var =~ /(.*)/;
> print "$1\n";'
abcdefg
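The confusion starts with the word "element" in "Matches the **preceding** element zero or more times". The term "preceding element" here refers to the preceding pattern, not to the preceding capture (or preceding match). In `.*` the preceding pattern is `.`, so the engine repeats "any single character" zero or more times, and each repetition is free to match a different character.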
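To make the pattern-versus-match distinction concrete, here is a minimal sketch in Python rather than Perl (the variable names are illustrative, not from the question). It contrasts `.*` with a pattern that really does repeat the character that was matched, which requires a capture group and a backreference.

```python
import re

text = "abcdefg"

# `.*` repeats the pattern `.` (any character except a newline),
# so every repetition may match a different character.
any_chars = re.match(r".*", text).group(0)

# Repeating the character that was actually matched needs a capture
# group plus a backreference to it.
same_char = re.match(r"(.)\1*", text).group(0)
same_char_run = re.match(r"(.)\1*", "aaabc").group(0)

print(any_chars)      # abcdefg
print(same_char)      # a
print(same_char_run)  # aaa
```

The behaviour the question expected from `.*` (match `a`, `aa`, `aaa`, ...) is what `(.)\1*` does.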
This:
.{2,4}
is really shorthand for this (in terms of which strings it can match; a greedy quantifier tries the longest alternative first):
(..)|(...)|(....)
In the same way, this:
.*
is really shorthand for this:
()|(.)|(..)|(...)| // etc.
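The shorthand idea can be checked with a small sketch, again in Python and with hypothetical helper names. It assumes a backtracking engine where alternation is tried left to right, so the spelled-out form lists the longest alternative first to mirror the greedy quantifier. The capture groups are dropped because only the overall match is compared.

```python
import re

samples = ["a", "ab", "abc", "abcd", "abcde"]

bounded = re.compile(r".{2,4}")
# Longest alternative first, mirroring the greedy quantifier.
spelled_out = re.compile(r"....|...|..")


def matched(pattern, s):
    """Return the text matched at the start of s, or None if no match."""
    m = pattern.match(s)
    return m.group(0) if m else None


results = [(s, matched(bounded, s), matched(spelled_out, s)) for s in samples]
for s, a, b in results:
    print(s, a, b)

# With the shortest alternative first, the same set of strings can match,
# but the engine stops at the first alternative that succeeds.
shortest_first = matched(re.compile(r"..|...|...."), "abcde")
print(shortest_first)  # ab
```

Both patterns agree on every sample, while the shortest-first ordering behaves like the lazy quantifier `.{2,4}?` instead.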