
Mod Rewrite Regex - Multiple Negative Lookaheads

I currently have the working Mod Rewrite Regex:

RewriteEngine On
RewriteCond %{QUERY_STRING} ^(.*)$
RewriteRule ^(.*/)?((?:cmd)[^/]*)/((?!(?:cmd)[.+]*)(.+)) $1?$2=$3&%1 [L]

That regex takes the first URL below and transforms it into the second:

www.site.com/cmd1/param/cmd2/param2/stillparam2
www.site.com/index.php?cmd1=param&cmd2=param2/stillparam2

That works fine, but I would also like to add another negative lookahead assertion to ensure that a URL block (i.e. a /texthere/ segment) doesn't include an underscore. A string containing such a block might look like www.test.com/cmd/thing/getparam_valuehere; the regex should parse cmd/thing as a key/value pair and ignore the rest of the string. I would then write another RewriteRule to add the block containing the underscore as a further URL parameter. The following URL translation would occur:

www.test.com/cmd/param1/cmd2/directory/param2/sortorder_5
www.test.com?cmd=param1&cmd2=directory/param2&sortorder=5

Please let me know if I have not been clear enough. Any help would be great.

NB: I have tried nesting a negative lookahead inside the one already present - (?!(?!)) - and have tried using an | between two negative lookaheads, but neither solution worked. I wondered whether something more fundamental was wrong?

Thanks all.

Edit: I have also tried the following, which I really thought would work (but obviously didn't!):

RewriteRule ^(.*/)?((?:cmd)[^/]*)/((?!(?:cmd)[.+]*)(?![.+]*(?:_)[.+]*)(.+)) $1?$2=$3&%1 [L]

That does the following:

www.test.com/cmd/param1/sortorder_1/ translates to www.test.com?cmd=param1/sortorder_1/

when it should instead become www.test.com?cmd=param1&sortorder=1/. (The rule to translate /sortorder_1/ into &sortorder=1 has not been created yet, but you can hopefully see what I mean.)
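
One likely reason the extra lookahead has no effect: inside square brackets, `.` and `+` are literal characters, so `[.+]*` matches only a run of dots and plus signs rather than "any text". The sketch below uses PHP's preg_match (PCRE, which Apache's mod_rewrite also uses) to compare the lookahead as written with a hypothetical alternative, `(?![^/]*_)`, that rejects a segment containing an underscore. The patterns are cut down to the lookahead plus `(.+)`, and the test strings and variable names are illustrative assumptions.

```php
<?php
// Lookahead as written in the question: [.+] is a character class that
// matches a literal '.' or '+', so this only rejects text that starts with
// optional dots/pluses immediately followed by an underscore.
$asWritten = '~^(?![.+]*(?:_)[.+]*)(.+)$~';

// Hypothetical alternative: reject when an underscore occurs anywhere
// before the next slash.
$alternative = '~^(?![^/]*_)(.+)$~';

// Illustrative inputs, not taken from a real request.
$segments = ['param1', 'sortorder_1', '_leading', '..+_x'];

$results = [];
foreach ($segments as $segment) {
    $results[$segment] = [
        'as_written'  => preg_match($asWritten, $segment) === 1,
        'alternative' => preg_match($alternative, $segment) === 1,
    ];
}

var_export($results);
```

Note that in the full rule the lookahead sits at the start of the captured value, so even the alternative would only inspect the first segment (param1 in param1/sortorder_1/); the greedy `(.+)` would still swallow the later segments.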

pb149 asked Jul 17 '11 22:07


1 Answer

After about four days of experimenting, I ended up with a somewhat different solution from the one I had originally expected to find. I simply moved all the actual URL manipulation into my index.php file and routed all requests through it. Here is my (much cleaner) .htaccess file:

Options +FollowSymlinks
RewriteEngine On
RewriteCond %{QUERY_STRING} (.*)
RewriteRule (.*) index.php?path=$1 [QSA,L]

and here is the block of code I used to parse the entered URL:

        // Collect every '/key_value' segment of the path. The underscore separator
        // (and the underscore allowed in the value) is an assumed correction: without
        // it the pattern cannot yield the getparam output shown below.
        preg_match_all('|/([A-Za-z0-9]+)_((?!/)[A-Za-z0-9._-]*)|', $_GET['path'], $matches);

        // Remove those segments from the path held in the $_GET superglobal:
        foreach ($matches[0] as $segment) {
            $_GET['path'] = str_replace($segment, '', $_GET['path']);
        }

        // Store each key => value pair as a GET-style argument:
        for ($i = 0; $i < count($matches[1]); $i++) {
            self::$get_arguments[$matches[1][$i]] = $matches[2][$i];
        }

        // Retrieve all 'cmd' properties from the remaining path and build an array from them:
        preg_match_all('~(cmd[0-9]*)/(.+?)(?=(?:cmd)|(?:\z))~', $_GET['path'], $matches);

        if (isset($matches[1][0])) {
            return self::$url_arguments = array_combine($matches[1], $matches[2]);
        }

On a URL like this:

http://localhost/frame_with_cms/frame/www/cmd/one/cmd2/two/cmd3/three/cmd4/four/getparam_valuepart1_valuepart2/cmd5/five/

It successfully produces these separate arrays which I then use to handle requests:

Array
(
    [getparam] => valuepart1_valuepart2
)
Array
(
    [cmd] => one/
    [cmd2] => two/
    [cmd3] => three/
    [cmd4] => four/
    [cmd5] => five/
)
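
For anyone who wants to try the idea outside a class, here is a minimal standalone sketch of the same two-step parse. The function name parse_path and the 'get' / 'cmd' result keys are hypothetical, the underscore separator in the first pattern is an assumption needed to reproduce the output above, and the input assumes the .htaccess file lives in the www directory so that the path arrives without the leading directories.

```php
<?php
// Hypothetical helper: split a routed path into underscore parameters
// and cmd key/value pairs. Not part of the original code.
function parse_path(string $path): array
{
    $getArguments = [];
    $urlArguments = [];

    // Step 1: segments of the form /key_value become key => value
    // and are removed from the path.
    preg_match_all('~/([A-Za-z0-9]+)_([A-Za-z0-9_.-]*)~', $path, $matches);
    foreach ($matches[0] as $i => $segment) {
        $getArguments[$matches[1][$i]] = $matches[2][$i];
        $path = str_replace($segment, '', $path);
    }

    // Step 2: each cmd, cmd2, ... key takes everything up to the next
    // "cmd" or the end of the string (lazy match plus lookahead).
    preg_match_all('~(cmd[0-9]*)/(.+?)(?=cmd|\z)~', $path, $matches);
    if ($matches[1] !== []) {
        $urlArguments = array_combine($matches[1], $matches[2]);
    }

    return ['get' => $getArguments, 'cmd' => $urlArguments];
}

$parsed = parse_path('cmd/one/cmd2/two/cmd3/three/cmd4/four/getparam_valuepart1_valuepart2/cmd5/five/');
print_r($parsed);
```

One limitation of this approach: a value that itself contains the text "cmd" would be cut short by the lookahead, so it only suits URLs where "cmd" is reserved for keys.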

Thanks to all who took the time to read and reply.

pb149 answered Sep 28 '22 08:09