I want to be able to parse file paths like this one:
/var/www/index.(htm|html|php|shtml)
into an ordered array:
array("htm", "html", "php", "shtml")
and then produce a list of alternatives:
/var/www/index.htm
/var/www/index.html
/var/www/index.php
/var/www/index.shtml
Right now, I have a preg_match_all statement that can split two alternatives:
preg_match_all ("/\(([^)]*)\|([^)]*)\)/", $path_resource, $matches);
Could somebody give me a pointer how to extend this to accept an unlimited number of alternatives (at least two)? Just regarding the regular expression, the rest I can deal with.
The rule is:
The list needs to start with a ( and close with a )
There must be at least one | in the list (i.e. at least two alternatives)
Any other occurrence(s) of ( or ) are to remain untouched.
Update: I need to be able to also deal with multiple bracket pairs such as:
/var/(www|www2)/index.(htm|html|php|shtml)
Sorry I didn't say that straight away.
Update 2: If you're looking to do what I'm trying to do in the filesystem, then note that glob() already brings this functionality out of the box. There is no need to implement a custom solution. See @Gordon's answer below for details.
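For anyone landing here for the filesystem case, here is a minimal sketch of the glob() route. It assumes the GLOB_BRACE flag, which uses the {a,b,c} syntax rather than the (a|b|c) form from the question, and which the PHP manual lists as unavailable on some non-GNU systems. The temporary directory and file names are made up for the demo. Note that glob() only returns paths that actually exist on disk, so it is not a substitute for generating the full list of alternatives.

```php
<?php
// Set up a throwaway directory so the sketch is self-contained.
$dir = sys_get_temp_dir() . '/glob_brace_demo_' . uniqid();
mkdir($dir);
touch("$dir/index.htm");
touch("$dir/index.php");

// GLOB_BRACE expands {a,b,c} into the alternatives a, b and c.
// This is brace syntax, not the (a|b|c) syntax used in the question.
$found = glob("$dir/index.{htm,html,php,shtml}", GLOB_BRACE);

// Only index.htm and index.php were created, so only those can match.
print_r(array_map('basename', $found));

// Clean up the temporary files and directory.
array_map('unlink', $found);
rmdir($dir);
```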
I think you're looking for:
/\(([^|]+)(\|([^|]+))+\)/
Basically, put the splitter '|' into a repeating pattern.
Also, your words should be made up of 'not pipes' instead of 'not parens', per your third requirement.
Also, prefer + to * for this problem: + means 'at least one', while * means 'zero or more'.
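A rough sketch of that idea in PHP, with the literal parentheses and the pipe escaped. Two things here are my own assumptions rather than part of the answer: the character class also excludes ( and ) so that several bracket pairs in one path are matched separately, and the variable names are illustrative. One caveat: a repeated capture group in PCRE keeps only its last iteration, so the pattern captures the whole inside of each list and the alternatives are split out afterwards.

```php
<?php
// Hypothetical input; any path with (a|b|...) groups works.
$path_resource = '/var/(www|www2)/index.(htm|html|php|shtml)';

// \(               literal opening parenthesis
// [^|()]+          first alternative: one or more chars that are not | ( )
// (?:\|[^|()]+)+   one or more further alternatives, each preceded by |
// \)               literal closing parenthesis
$pattern = '/\(([^|()]+(?:\|[^|()]+)+)\)/';

preg_match_all($pattern, $path_resource, $matches);

// $matches[1] holds the inside of each list, e.g. "www|www2".
$lists = array_map(function ($inner) {
    return explode('|', $inner);
}, $matches[1]);

print_r($lists);
```

Because the repeated part uses +, a group without any | such as (foo) does not match and is left untouched.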
Not exactly what you are asking, but what's wrong with just taking what you have to get the list (ignoring the |s), putting it into a variable and then explode()ing on the |s? That would give you an array of however many items there were (including 1 if there wasn't a | present).
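A minimal sketch of the capture-then-explode approach, extended to produce the final list of paths and to cope with several bracket pairs. expand_alternatives() is a hypothetical helper name, not a PHP built-in, and the pattern assumes alternatives never contain (, ) or |. Unlike the looser variant described above, it requires at least one | per group, following the rule in the question.

```php
<?php
/**
 * Expand every "(a|b|...)" group in a path into the full list of paths.
 * Hypothetical helper for illustration; not part of PHP itself.
 */
function expand_alternatives(string $path): array
{
    // Find the first group that contains at least one "|".
    $pattern = '/\(([^|()]+(?:\|[^|()]+)+)\)/';
    if (!preg_match($pattern, $path, $m, PREG_OFFSET_CAPTURE)) {
        return [$path]; // nothing left to expand
    }

    $start  = $m[0][1];          // byte offset of the "("
    $length = strlen($m[0][0]);  // length of the whole "(...)" group
    $result = [];

    // Split the inside of the group on "|" and substitute each alternative.
    foreach (explode('|', $m[1][0]) as $alternative) {
        $expanded = substr_replace($path, $alternative, $start, $length);
        // Recurse to expand any remaining groups further to the right.
        $result = array_merge($result, expand_alternatives($expanded));
    }
    return $result;
}

print_r(expand_alternatives('/var/www/index.(htm|html|php|shtml)'));
print_r(expand_alternatives('/var/(www|www2)/index.(htm|html)'));
```

The order of the output follows the order of the alternatives in the input, with the leftmost group varying slowest.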