PHP URL Param redirects - with wildcards/regex

I recently found this solution for redirecting from a PHP URL parameter to a header Location.

It's much more manageable than htaccess for mass redirects. The next thing I want to work out is how to use regex to do what htaccess can, where request/(.*) goes to destination/$1.

My understanding is that you'd use preg_match or preg_replace or something similar. How can I achieve something like the below, preferably keeping it as short as this? (I know this is wrong, it's just for the sake of example.)

preg_match($redirects['request(.*)'] = "$domain/destination/\1");

Basically to break it down, say I want to redirect doma.in/pics to domain.tld/pictures, I have htaccess redirect that to doma.in/index?req=pics, where index is the script file and req is the parameter used.

And then using the script I have a line like $redirects['pics'] = "$domain/pictures";, where $domain variable is tied to http://domain.tld.

That works great, however I want to take this a step further with regex and send anything pics/*stuff*, aka doma.in/index?req=pics/*stuff* to $domain/pictures/*stuff*.

Here's an example of how things look using this script for doing many redirects.

$redirects['request'] = "$domain/dest";
$redirects['request2'] = "$domain/dest2";
$redirects['request3'] = "$domain/dest3";

I linked the post I got the script from at the top, but here's the code anyway:

if(isset($_GET['req']) && isset($redirects[$_GET['req']])) {
    $loc = htmlspecialchars($redirects[$_GET['req']]);
    header("Location: " . $loc);
    exit();
}
header("Location: $domain");

The $redirects lines go above this; I keep them in included files.

asked Mar 12 '26 17:03 by Jase Wolf - theythem


1 Answer

This is a long read, and it's a messy way of doing this. See my accepted answer for a much better way to do this.

I thought that ltrim() was what I wanted. From other answers I understood that if I specify 0 as what to remove, only a single leading 0 would go, while 10 and 100 would be left alone. That turned out not to be the case: ltrim() treats its second argument as a set of characters, not as a prefix, and it keeps stripping from the left for as long as the next character is in that set. That's why passing it pics/ can eat into whatever follows the prefix.

This however does it correctly:

if (strpos($get, $req) === 0) {
    $get = substr($get, strlen($req));
}
return $get;

Thanks to this answer for this one liner.
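To make the difference concrete, here is a minimal sketch comparing the two approaches. The helper name strip_prefix() and the sample value pics/picnic are made up for illustration; the body of the helper is the same strpos()/substr() check shown above.

```php
<?php
// ltrim() treats its second argument as a set of characters, not a prefix.
// It strips p, i, c, s and / from the left until it reaches the "n".
$mask_result = ltrim("pics/picnic", "pics/");

// Hypothetical helper: remove $prefix once, and only if $value starts with it.
function strip_prefix(string $value, string $prefix): string {
    return strpos($value, $prefix) === 0
        ? substr($value, strlen($prefix))
        : $value;
}

$prefix_result = strip_prefix("pics/picnic", "pics/");

echo $mask_result, "\n";    // nic
echo $prefix_result, "\n";  // picnic
```

The character-set behaviour only shows up when the text after the prefix happens to start with one of the prefix's own characters, which is probably why it is easy to miss in quick tests.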

All this script does is assign a value to $redirects['request'], like any other variable assignment. $_GET['req'] already gives us whatever the parameter is, so no complicated preg or regex calls are needed.

So with that substr(), we can take the $_GET['req'] and do the following:

$req = "pics/";
$get = $_GET['req'];
$wild = strpos($get, $req) === 0
    ? substr($get, strlen($req))
    : $get;

$redirects["pics/$wild"] = "$domain/pictures/$wild";

This takes pics/*stuff* and removes the pics/ prefix, so $wild equals just *stuff*. I then use $wild on both sides of the redirect to make a wildcard, and ta-da.

This is completely functional, but it's a fair bit of code to remember each time, so let's improve it.

Create a function like this above the redirects:

function wildcard($req) {
    $get = $_GET['req'];
    return strpos($get, $req) === 0
        ? substr($get, strlen($req))
        : $get;
}

Calling wildcard('pics/') sets $req to pics/ inside the function.

We can use this in redirects like:

$req = "pics/";
$wild = wildcard($req);
$redirects[$req.$wild] = "$domain/pictures/$wild";
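As a rough illustration of what ends up in $redirects, here is a sketch that can be run outside a web request. It assumes a hypothetical variant named wildcard_for() that takes the request as an argument instead of reading $_GET['req'], and a made-up sample request of pics/cats/1.jpg.

```php
<?php
// Hypothetical variant of wildcard(): the request is passed in rather than
// read from $_GET['req'], so the snippet runs from the command line.
function wildcard_for(string $request, string $req): string {
    return strpos($request, $req) === 0
        ? substr($request, strlen($req))
        : $request;
}

$domain  = "http://domain.tld";
$request = "pics/cats/1.jpg";   // stands in for $_GET['req']

$redirects = [];
$req  = "pics/";
$wild = wildcard_for($request, $req);   // the part after "pics/"
$redirects[$req . $wild] = "$domain/pictures/$wild";

print_r($redirects);
```

The key is rebuilt from the prefix plus the stripped remainder, so it equals the incoming request exactly when the request starts with pics/. For any other request the key becomes pics/ followed by the whole request, which can never equal the request itself, so the lookup simply misses.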

It's still a bit more than I'd hoped for, so my next idea was to pull $req into the function as a global, like this:

function wild() {
    global $req;
    $get = $_GET['req'];
    return strpos($get, $req) === 0
        ? substr($get, strlen($req))
        : $get;
}

And then do the redirect like:

$req = "pics/";
$redirects[$req.wild()] = "$domain/pictures/".wild();

That makes for a much shorter single line. However, given the objections to using globals, I've gone back to passing $req into wild(). Instead of assigning $wild each time, I call wild($req) inline, like this:

$req = "pics/";    $redirects[$req.wild($req)] = "$domain/pictures/".wild($req);

It's still shorter, and passing $req adds very little over empty brackets.

P.S. With this method, include the trailing slash on the parameter so results don't get messy. To be able to send pics itself to $domain/pictures, the parameter needs a trailing slash at the end, so add one in the htaccess rule that sends requests to the script as a parameter. If you're using Apache or LiteSpeed, the following in htaccess sends all requests to your script as a parameter with the trailing slash:

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php?req=$1/ [R=301,L]

Make sure this is at the bottom so it doesn't take priority over other rules.

I also added a precautionary rtrim() to the script to strip the trailing slash from the header location, so that a link to a file on a server that doesn't strip trailing slashes from file URLs doesn't end up dead. Since rtrim() is only given the slash to remove here, the character-set behaviour described at the top isn't a problem.

Here is how things look now.

function wild($req) {
    $get = $_GET['req'];
    return strpos($get, $req) === 0
        ? substr($get, strlen($req))
        : $get;
}

$domain = "http://domain.tld";

// Redirects
$req = "request1/";    $redirects[$req.wild($req)] = "$domain/dest1/".wild($req);
$req = "request2/";    $redirects[$req.wild($req)] = "$domain/dest2/".wild($req);
$req = "request3/";    $redirects[$req.wild($req)] = "$domain/dest3/".wild($req);

// Run Script
if (isset($_GET['req'], $redirects[$_GET['req']])) {
    $loc = htmlspecialchars($redirects[$_GET['req']]);
    header("Location: " . rtrim($loc,"/"));
    exit();
}

// If no match in the redirects, redirect to this location.
header("Location: $domain");

Now, this has one flaw if the destination site sends nonexistent requests back to the script. With wildcards it's guaranteed that some destinations won't exist for a given request, and when that happens the request goes back to the script and you have a redirect loop.

My way of solving this is to add ?referer=doma.in to the end of the header location, and in the htaccess on domain.tld, exclude non existent requests with that query string from redirecting back to the script.

So that looks like:

$loc = htmlspecialchars($redirects[$_GET['req']]).'?referer=doma.in';

And in the htaccess of domain.tld, place a RewriteCond above the existing rule like so to exclude the query string:

# Ignore these referer queries
RewriteCond %{QUERY_STRING} !referer=doma.in [NC]

# Send dead requests to doma.in with uri as query
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ http://doma.in/?referer=domain.tld&req=$1/ [R=301,L]

For good measure I also added a referer to the redirect coming from domain.tld.

Now, as a bonus, to hide the referer query on requests and tidy things up, let's add the following below:

# Send dead requests with referer query to home (or 404 or wherever)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{QUERY_STRING} "referer=" [NC]
RewriteRule (.*) /?req=$1 [R=301,L]

# Remove referer query from requests
RewriteCond %{QUERY_STRING} "referer=" [NC]
RewriteRule (.*) /$1/? [R=301,L]

We need to send dead referer query requests somewhere before we remove the query, otherwise we'd be back at step one. I have the dead requests sent to my homepage with the request URI as a parameter, so I can still see what the requested URL was.

And job done. As an extra bonus, let's make external/non-wildcard redirects not carry a query. Back in the script, change it to look like this:

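$get = $_GET['req'] ?? '';            // '' when no req parameter was sent
$loc = $redirects[$get] ?? null;      // exact redirect, if any
$wildloc = $wildcards[$get] ?? null;  // wildcard redirect, if any

// Run Script
if (isset($loc) || isset($wildloc)) {
    if (isset($wildloc)) {
        // Trim before appending the query so the slash before "?" is removed
        $loc = rtrim($wildloc, '/').'?referer=doma.in';
    }
    $loc = rtrim(htmlspecialchars($loc), '/');
    header("Location: ".$loc);
    exit();
}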

Here I've moved things about: $_GET['req'] is assigned to $get, $redirects[$get] to $loc, and $wildcards[$get] to $wildloc. The isset() checks then test those variables, with an extra isset() for $wildloc joined by || (OR).

An inner if statement then sets $loc from $wildloc (plus the referer query) for $wildcards redirects, while $redirects ones keep the $loc assigned above.

This way, we can have tidy redirects.

So now things look like:

// Wildcard function
function wild($req) {
    $get = $_GET['req'];
    return strpos($get, $req) === 0
        ? substr($get, strlen($req))
        : $get;
}

$domain = "http://domain.tld";

// Redirects
$req = "request1/";    $wildcards[$req.wild($req)] = "$domain/dest1/".wild($req);  // A wildcard redirect
$req = "request2/";    $wildcards[$req.wild($req)] = "$domain/dest2/".wild($req);  // A wildcard redirect
$redirects['request3/'] = "$domain/dest3/"; // Not a wildcard redirect

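$get = $_GET['req'] ?? '';            // '' when no req parameter was sent
$loc = $redirects[$get] ?? null;      // exact redirect, if any
$wildloc = $wildcards[$get] ?? null;  // wildcard redirect, if any

// Run Script
if (isset($loc) || isset($wildloc)) {
    if (isset($wildloc)) {
        // Trim before appending the query so the slash before "?" is removed
        $loc = rtrim($wildloc, '/').'?referer=doma.in';
    }
    $loc = rtrim(htmlspecialchars($loc), '/');
    header("Location: ".$loc);
    exit();
}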

// If no match in the redirects, redirect to this location.
header("Location: $domain/?req=$get");

This improves things so much and solves the redirect loop.
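For comparison, the same prefix idea can also be written as a lookup over a table of prefixes, which avoids reassigning $req for every rule. This is only a sketch of an alternative, not the script above: the function name resolve() and the sample requests are hypothetical, and it returns the destination instead of sending the header so it can be tried from the command line.

```php
<?php
// Hypothetical helper: $redirects maps an exact request to a destination,
// $wildcards maps a request prefix to a destination prefix.
function resolve(string $request, array $redirects, array $wildcards): ?string {
    if (isset($redirects[$request])) {
        return $redirects[$request];
    }
    foreach ($wildcards as $prefix => $dest) {
        if (strpos($request, $prefix) === 0) {
            // Carry everything after the prefix over to the destination.
            return $dest . substr($request, strlen($prefix));
        }
    }
    return null; // no rule matched
}

$domain    = "http://domain.tld";
$redirects = ['request3/' => "$domain/dest3/"];
$wildcards = ['request1/' => "$domain/dest1/", 'request2/' => "$domain/dest2/"];

echo resolve('request1/a/b/', $redirects, $wildcards), "\n"; // http://domain.tld/dest1/a/b/
```

Exact matches are checked first, so a plain redirect still wins over a wildcard that happens to share its prefix.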

I edited this again slightly because of how the query string was being appended. With the query on the end, rtrim() was looking for a trailing slash after it, which never exists, instead of before it where we want the slash removed. So now an rtrim() also runs before the query is appended. That doubles it up, which is slightly annoying, but at least it does the job right now.
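The ordering issue with rtrim() can be seen in isolation with a short sketch. The destination value here is just an example with a trailing slash.

```php
<?php
$dest = "http://domain.tld/pictures/";   // example destination with a trailing slash

// Trimming after the query is appended: the string no longer ends in "/",
// so nothing is removed.
$after = rtrim($dest . '?referer=doma.in', '/');

// Trimming first, then appending the query: the slash before "?" is gone.
$before = rtrim($dest, '/') . '?referer=doma.in';

echo $after, "\n";   // http://domain.tld/pictures/?referer=doma.in
echo $before, "\n";  // http://domain.tld/pictures?referer=doma.in
```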

answered Mar 15 '26 09:03 by Jase Wolf - theythem