
Encoding a URL correctly while using a rewrite engine

I am using mod_rewrite and a routing system extracted from the Akelos Framework.

I have a serious problem when certain symbols appear in the search key parameter.

The routing map is as follows:

$map->connect(":lang/search/:string", array('controller' => 'search','action' => 'index'));

In the controller I then get $this->registry->map['params']['get']['string'] as the search keyword.

I can't find a way to encode the URL properly. For example, take the string t\ /#%&=

urlencode() gives t%5C+%2F%23%25%26%3D and the page displays: The requested URL /site/en/search/t\+/#%&= was not found on this server.

rawurlencode() gives t%5C%20%2F%23%25%26%3D and the page displays the same error.
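
For reference, the two encodings quoted above can be reproduced with a few lines of PHP. This is only a sketch to confirm that both functions escape every reserved character in the sample string; the variable names are illustrative and not part of the original code.

```php
<?php
// Sample search string from the question: t, backslash, space, then /#%&=
$keyword = 't\\ /#%&=';

// urlencode() follows form encoding, so the space becomes "+".
$formEncoded = urlencode($keyword);

// rawurlencode() follows RFC 3986, so the space becomes "%20".
$pathEncoded = rawurlencode($keyword);

echo $formEncoded, PHP_EOL; // t%5C+%2F%23%25%26%3D
echo $pathEncoded, PHP_EOL; // t%5C%20%2F%23%25%26%3D
```

Both results are valid encodings. The 404 message shows an already-decoded path, which suggests the characters are decoded somewhere between the browser and the router rather than encoded incorrectly.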

You can download or view the router class source here on this page

I really do not want to use base64 or similar encodings for the URL, because they make it unreadable.

In case you need it, here are the contents of the .htaccess file as well:

<IfModule mod_rewrite.c>
   RewriteEngine On
   RewriteCond %{REQUEST_FILENAME} !-d
   RewriteCond %{REQUEST_FILENAME} !-f
   RewriteRule ^(.*)$ index.php?url=$1 [QSA,L]
</IfModule>
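
One possible source of trouble with this rule: mod_rewrite matches against the decoded path, and the captured $1 is placed into the query string without being re-escaped. The sketch below uses a hypothetical query string to show what PHP would see if a decoded "&" and "=" ended up in it. It is an illustration under that assumption, not a trace of the actual request.

```php
<?php
// Hypothetical query string, as it might look after the RewriteRule
// substitutes an already-decoded path "en/search/a&b=c" for $1.
$rewrittenQuery = 'url=en/search/a&b=c';

// parse_str() splits on "&" and "=", much like PHP does when filling $_GET.
parse_str($rewrittenQuery, $params);

// The search keyword is cut off at "&", and a stray "b" parameter appears.
print_r($params);
```

If that is what happens here, the [B] flag of RewriteRule, which escapes backreferences before substitution, may be worth testing. Whether it is sufficient for this router has not been verified.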

Update

Here are working files for testing.

Please download these files and test them on your server if you have time.

Guide:

controllerclass.php - A simple controller framework; it defines the "Controller" class that searchcontroller.php needs in order to work

routerclass.php - The router class extracted from the Akelos Framework; the bug is probably here

routes.php - Where you define your routes; in our case there is only /search/:string

searchcontroller.php - A basic application for testing strings; /search/stringhere points to this file

index.php - Where initialization and routing start

.htaccess - I do not think the error is here

I don't think you will need to change index.php, controllerclass.php, routes.php, or searchcontroller.php.

The bug is probably in routerclass.php, or perhaps .htaccess needs a fix, although I doubt it.

Davit asked Jan 10 '13 19:01


1 Answer

The issue looks related to RFC 3986, Section 7.3 (Back-End Transcoding), as it applies to urlencode() and urldecode(). I've slightly modified the function at http://php.net/manual/en/function.urlencode.php#97969:

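// Percent-encode $string, then turn the listed reserved characters back into literals.
// htmlspecialchars() makes the result safe to print inside HTML.
function myUrlEncode($string) {
    // Encoded forms, in the same order as $replacements below.
    $entities = array('%21', '%2A', '%27', '%28', '%29', '%3B', '%3A', '%40', '%26', '%3D', '%2B', '%24', '%2C', '%2F', '%5C', '%3F', '%25', '%23', '%5B', '%5D');
    $replacements = array('!', '*', "'", "(", ")", ";", ":", "@", "&", "=", "+", "$", ",", "/", "\\", "?", "%", "#", "[", "]");
    return htmlspecialchars(str_replace($entities, $replacements, urlencode($string)));
}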

Note the addition of %5C => \ and of htmlspecialchars(). The latter is there for security rather than for handling special characters: the input may contain <script>..., <h1>..., and so on.
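
As a small illustration of that security point, here is what htmlspecialchars() does to markup in the input. The sample input is made up for the example.

```php
<?php
// Illustrative input containing markup, as a user might type it.
$input = '<script>alert(1)</script>';

// htmlspecialchars() turns <, >, & and quotes into HTML entities,
// so the browser shows the text instead of executing it.
$safe = htmlspecialchars($input);

echo $safe, PHP_EOL; // &lt;script&gt;alert(1)&lt;/script&gt;
```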

You would then use myUrlEncode() like this:

print("<b><i>URL Encode Tests</i></b><br /><br />
    <b>Works:</b> ".myUrlEncode($string[0])." <a href=\"".HTTP_ROOT."/search/".myUrlEncode($string[0])."\">/search/".myUrlEncode($string[0])."</a><br />
    <b>Does not work:</b> ".myUrlEncode($string[1])." <a href=\"".HTTP_ROOT."/search/".myUrlEncode($string[1])."\">/search/".myUrlEncode($string[1])."</a><br />
    <b>Does not work:</b> ".myUrlEncode($string[2])." <a href=\"".HTTP_ROOT."/search/".myUrlEncode($string[2])."\">/search/".myUrlEncode($string[2])."</a><br />
");

After doing that, search string #3 (\ /#%&=) produces a PHP error like "Method SearchController::t is invalid in ...\index.php on line 30". I guess this comes from the regexes in the router, so you may need to make a few adjustments there.

Halil Özgür answered Oct 06 '22 00:10