
How to pass a Persian string as an argument in the URL?

I have URLs that follow this pattern:

www.example.com/ClassName/MethodName/Arg1/Arg2

Also here is my .htaccess file:

RewriteEngine on

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d

RewriteRule ^(.*)$ index.php?rt=$1 [L,QSA]

ErrorDocument 404 /error404.html

And this is my routing system:

if (empty($_GET['rt'])) {
    require_once('application/home.php');
} else {
    require_once('application/search.php');
    $url  =  rtrim ($_GET['rt'], '/');
    $url  =  explode('/', $url);

    $ClassName  =   array_shift($url);
    $MethodName =   array_shift($url);
    $Arg1       =   array_shift($url);
    $Arg2       =   array_shift($url);
}

Now, what is the problem? Routing works fine for every URL except when I use م in the URL. (م is a Persian character.)

For example:

www.example.com/ClassName/Methodname/124/روز خوب      // it is fine
www.example.com/ClassName/Methodname/254/سلام بر       // it isn't fine
//                       because there is م ^ in the URL

So when I use م in the URL, I am faced with a 404 Not Found page.


I don't know where the problem comes from. Do you? And how can I fix it? Is it an encoding issue, or something else?

Note: I use Xampp v3.2.1 (apache).


EDIT: As requested in the comments, I am adding these two examples. One (this works):

<?php

$str = "www.example.com/ClassName/Methodname/124/روز خوب";
$url = explode('/', $str);
echo "<pre>";
print_r($url);

/*
Array
(
    [0] => www.example.com
    [1] => ClassName
    [2] => Methodname
    [3] => 124
    [4] => روز خوب
)
*/

Two (this leads to 404 Not Found):

<?php

$str = "www.example.com/ClassName/Methodname/254/سلام بر";
$url = explode('/', $str);
echo "<pre>";
print_r($url);

/*    
Array
(
    [0] => www.example.com
    [1] => ClassName
    [2] => Methodname
    [3] => 254
    [4] => سلام بر
)
*/

EDIT2: After a few tests, I figured out that the script that should be called by my rewrite rules (index.php) doesn't even get called.


EDIT3: I enabled rewrite logging in Apache, and when I checked the result, there was an interesting thing:

(These two samples aren't related to the above examples)

Working routing sample:

[Sat Jan 02 22:17:00.276918 2016] [rewrite:trace3] [pid 3188:tid 1728] mod_rewrite.c(475): [client ::1:49413] ::1 - - [localhost/sid#c397b0][rid#28661a0/initial] [perdir C:/xampp/htdocs/myweb/] add path info postfix: C:/xampp/htdocs/myweb/islamic_sources -> C:/xampp/htdocs/myweb/islamic_sources/sahifeh_sajadiyeh/1580/\xd9\x86\xd8\xa8\xd9\x88\xd8\xaa, referer: http://localhost/myweb/search?s=islamic_sources&q=%D8%B3%D9%84%D8%A7%D9%85
[Sat Jan 02 22:17:00.276918 2016] [rewrite:trace3] [pid 3188:tid 1728] mod_rewrite.c(475): [client ::1:49413] ::1 - - [localhost/sid#c397b0][rid#28661a0/initial] [perdir C:/xampp/htdocs/myweb/] strip per-dir prefix: C:/xampp/htdocs/myweb/islamic_sources/sahifeh_sajadiyeh/1580/\xd9\x86\xd8\xa8\xd9\x88\xd8\xaa -> islamic_sources/sahifeh_sajadiyeh/1580/\xd9\x86\xd8\xa8\xd9\x88\xd8\xaa, referer: http://localhost/myweb/search?s=islamic_sources&q=%D8%B3%D9%84%D8%A7%D9%85
[Sat Jan 02 22:17:00.276918 2016] [rewrite:trace3] [pid 3188:tid 1728] mod_rewrite.c(475): [client ::1:49413] ::1 - - [localhost/sid#c397b0][rid#28661a0/initial] [perdir C:/xampp/htdocs/myweb/] applying pattern '^(.*)$' to uri 'islamic_sources/sahifeh_sajadiyeh/1580/\xd9\x86\xd8\xa8\xd9\x88\xd8\xaa', referer: http://localhost/myweb/search?s=islamic_sources&q=%D8%B3%D9%84%D8%A7%D9%85
[Sat Jan 02 22:17:00.276918 2016] [rewrite:trace4] [pid 3188:tid 1728] mod_rewrite.c(475): [client ::1:49413] ::1 - - [localhost/sid#c397b0][rid#28661a0/initial] [perdir C:/xampp/htdocs/myweb/] RewriteCond: input='C:/xampp/htdocs/myweb/islamic_sources' pattern='!-f' => matched, referer: http://localhost/myweb/search?s=islamic_sources&q=%D8%B3%D9%84%D8%A7%D9%85
[Sat Jan 02 22:17:00.276918 2016] [rewrite:trace4] [pid 3188:tid 1728] mod_rewrite.c(475): [client ::1:49413] ::1 - - [localhost/sid#c397b0][rid#28661a0/initial] [perdir C:/xampp/htdocs/myweb/] RewriteCond: input='C:/xampp/htdocs/myweb/islamic_sources' pattern='!-d' => matched, referer: http://localhost/myweb/search?s=islamic_sources&q=%D8%B3%D9%84%D8%A7%D9%85
[Sat Jan 02 22:17:00.276918 2016] [rewrite:trace2] [pid 3188:tid 1728] mod_rewrite.c(475): [client ::1:49413] ::1 - - [localhost/sid#c397b0][rid#28661a0/initial] [perdir C:/xampp/htdocs/myweb/] rewrite 'islamic_sources/sahifeh_sajadiyeh/1580/\xd9\x86\xd8\xa8\xd9\x88\xd8\xaa' -> 'index.php?rt=islamic_sources/sahifeh_sajadiyeh/1580/\xd9\x86\xd8\xa8\xd9\x88\xd8\xaa', referer: http://localhost/myweb/search?s=islamic_sources&q=%D8%B3%D9%84%D8%A7%D9%85
[Sat Jan 02 22:17:00.276918 2016] [rewrite:trace3] [pid 3188:tid 1728] mod_rewrite.c(475): [client ::1:49413] ::1 - - [localhost/sid#c397b0][rid#28661a0/initial] split uri=index.php?rt=islamic_sources/sahifeh_sajadiyeh/1580/\xd9\x86\xd8\xa8\xd9\x88\xd8\xaa -> uri=index.php, args=rt=islamic_sources/sahifeh_sajadiyeh/1580/\xd9\x86\xd8\xa8\xd9\x88\xd8\xaa, referer: http://localhost/myweb/search?s=islamic_sources&q=%D8%B3%D9%84%D8%A7%D9%85
[Sat Jan 02 22:17:00.277919 2016] [rewrite:trace3] [pid 3188:tid 1728] mod_rewrite.c(475): [client ::1:49413] ::1 - - [localhost/sid#c397b0][rid#28661a0/initial] [perdir C:/xampp/htdocs/myweb/] add per-dir prefix: index.php -> C:/xampp/htdocs/myweb/index.php, referer: http://localhost/myweb/search?s=islamic_sources&q=%D8%B3%D9%84%D8%A7%D9%85
[Sat Jan 02 22:17:00.277919 2016] [rewrite:trace2] [pid 3188:tid 1728] mod_rewrite.c(475): [client ::1:49413] ::1 - - [localhost/sid#c397b0][rid#28661a0/initial] [perdir C:/xampp/htdocs/myweb/] strip document_root prefix: C:/xampp/htdocs/myweb/index.php -> /myweb/index.php, referer: http://localhost/myweb/search?s=islamic_sources&q=%D8%B3%D9%84%D8%A7%D9%85
[Sat Jan 02 22:17:00.277919 2016] [rewrite:trace1] [pid 3188:tid 1728] mod_rewrite.c(475): [client ::1:49413] ::1 - - [localhost/sid#c397b0][rid#28661a0/initial] [perdir C:/xampp/htdocs/myweb/] internal redirect with /myweb/index.php [INTERNAL REDIRECT], referer: http://localhost/myweb/search?s=islamic_sources&q=%D8%B3%D9%84%D8%A7%D9%85
[Sat Jan 02 22:17:00.277919 2016] [rewrite:trace3] [pid 3188:tid 1728] mod_rewrite.c(475): [client ::1:49413] ::1 - - [localhost/sid#c397b0][rid#286c8b0/initial/redir#1] [perdir C:/xampp/htdocs/myweb/] strip per-dir prefix: C:/xampp/htdocs/myweb/index.php -> index.php, referer: http://localhost/myweb/search?s=islamic_sources&q=%D8%B3%D9%84%D8%A7%D9%85
[Sat Jan 02 22:17:00.277919 2016] [rewrite:trace3] [pid 3188:tid 1728] mod_rewrite.c(475): [client ::1:49413] ::1 - - [localhost/sid#c397b0][rid#286c8b0/initial/redir#1] [perdir C:/xampp/htdocs/myweb/] applying pattern '^(.*)$' to uri 'index.php', referer: http://localhost/myweb/search?s=islamic_sources&q=%D8%B3%D9%84%D8%A7%D9%85
[Sat Jan 02 22:17:00.277919 2016] [rewrite:trace4] [pid 3188:tid 1728] mod_rewrite.c(475): [client ::1:49413] ::1 - - [localhost/sid#c397b0][rid#286c8b0/initial/redir#1] [perdir C:/xampp/htdocs/myweb/] RewriteCond: input='C:/xampp/htdocs/myweb/index.php' pattern='!-f' => not-matched, referer: http://localhost/myweb/search?s=islamic_sources&q=%D8%B3%D9%84%D8%A7%D9%85
[Sat Jan 02 22:17:00.277919 2016] [rewrite:trace1] [pid 3188:tid 1728] mod_rewrite.c(475): [client ::1:49413] ::1 - - [localhost/sid#c397b0][rid#286c8b0/initial/redir#1] [perdir C:/xampp/htdocs/myweb/] pass through C:/xampp/htdocs/myweb/index.php, referer: http://localhost/myweb/search?s=islamic_sources&q=%D8%B3%D9%84%D8%A7%D9%85
[Sat Jan 02 22:17:00.470250 2016] [rewrite:trace3] [pid 3188:tid 1728] mod_rewrite.c(475): [client ::1:49413] ::1 - - [localhost/sid#c397b0][rid#286c1b8/initial] [perdir C:/xampp/htdocs/myweb/] strip per-dir prefix: C:/xampp/htdocs/myweb/fonts/taha/QuranTaha.woff -> fonts/taha/QuranTaha.woff, referer: http://localhost/myweb/search?s=islamic_sources&q=%D8%B3%D9%84%D8%A7%D9%85
[Sat Jan 02 22:17:00.470250 2016] [rewrite:trace3] [pid 3188:tid 1728] mod_rewrite.c(475): [client ::1:49413] ::1 - - [localhost/sid#c397b0][rid#286c1b8/initial] [perdir C:/xampp/htdocs/myweb/] applying pattern '^(.*)$' to uri 'fonts/taha/QuranTaha.woff', referer: http://localhost/myweb/search?s=islamic_sources&q=%D8%B3%D9%84%D8%A7%D9%85
[Sat Jan 02 22:17:00.470250 2016] [rewrite:trace4] [pid 3188:tid 1728] mod_rewrite.c(475): [client ::1:49413] ::1 - - [localhost/sid#c397b0][rid#286c1b8/initial] [perdir C:/xampp/htdocs/myweb/] RewriteCond: input='C:/xampp/htdocs/myweb/fonts/taha/QuranTaha.woff' pattern='!-f' => not-matched, referer: http://localhost/myweb/search?s=islamic_sources&q=%D8%B3%D9%84%D8%A7%D9%85
[Sat Jan 02 22:17:00.470250 2016] [rewrite:trace1] [pid 3188:tid 1728] mod_rewrite.c(475): [client ::1:49413] ::1 - - [localhost/sid#c397b0][rid#286c1b8/initial] [perdir C:/xampp/htdocs/myweb/] pass through C:/xampp/htdocs/myweb/fonts/taha/QuranTaha.woff, referer: http://localhost/myweb/search?s=islamic_sources&q=%D8%B3%D9%84%D8%A7%D9%85

Not working (redirects to 404 not found) sample:

[Sat Jan 02 22:07:09.734092 2016] [rewrite:trace3] [pid 3188:tid 1712] mod_rewrite.c(475): [client ::1:64955] ::1 - - [localhost/sid#c397b0][rid#83ec138/initial] [perdir C:/xampp/htdocs/myweb/] add path info postfix: C:/xampp/htdocs/myweb/islamic_sources -> C:/xampp/htdocs/myweb/islamic_sources/sahifeh_sajadiyeh/306/\xd9\x85\xd8\xa8
[Sat Jan 02 22:07:09.734092 2016] [rewrite:trace3] [pid 3188:tid 1712] mod_rewrite.c(475): [client ::1:64955] ::1 - - [localhost/sid#c397b0][rid#83ec138/initial] [perdir C:/xampp/htdocs/myweb/] strip per-dir prefix: C:/xampp/htdocs/myweb/islamic_sources/sahifeh_sajadiyeh/306/\xd9\x85\xd8\xa8 -> islamic_sources/sahifeh_sajadiyeh/306/\xd9\x85\xd8\xa8
[Sat Jan 02 22:07:09.734092 2016] [rewrite:trace3] [pid 3188:tid 1712] mod_rewrite.c(475): [client ::1:64955] ::1 - - [localhost/sid#c397b0][rid#83ec138/initial] [perdir C:/xampp/htdocs/myweb/] applying pattern '^(.*)$' to uri 'islamic_sources/sahifeh_sajadiyeh/306/\xd9\x85\xd8\xa8'
[Sat Jan 02 22:07:09.734092 2016] [rewrite:trace1] [pid 3188:tid 1712] mod_rewrite.c(475): [client ::1:64955] ::1 - - [localhost/sid#c397b0][rid#83ec138/initial] [perdir C:/xampp/htdocs/myweb/] pass through C:/xampp/htdocs/myweb/islamic_sources

An interesting point: when routing works, the Persian string looks like this (it just needs decoding):

%D8%B3%D9%84%D8%A7%D9%85

But when routing ends in 404 Not Found, the Persian string looks like this:

\xd9\x86\xd8\xa8\xd9\x88\xd8\xaa

It seems there are two different kinds of encoding.

asked Jan 02 '16 09:01 by Shafizadeh


1 Answer

You could try this variant that may work better:

RewriteRule ^([\s\S]*)$ index.php?rt=$1 [L,B,QSA]

The changes that this makes are:

1: using [\s\S] to match absolutely any character, instead of . which matches anything but a newline.

Though you wouldn't normally expect newline (%0A) to be in your URLs, my suspicion is that Apache's regexp matcher is treating your input path as being in the ISO-8859-1 encoding.

The IRI character U+0645 Arabic Letter Meem م UTF-8-URL-encodes to URI sequence %D9%85, and whilst byte 0xD9 is okay in ISO-8859-1, 0x85 decodes to U+0085 Next Line (NEL), an undesirable legacy control character that often counts as a newline. So if that happened, the expression .* wouldn't match it.
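To make the byte-level claim concrete, here is a small PHP sketch. It is only an illustration, not something Apache runs: the helper names percentEncodeUtf8 and latin1ByteToUtf8 are made up for this example, and it assumes the mbstring extension is enabled.

```php
<?php
// Illustration only. The helper names are hypothetical; mbstring is assumed.

// Percent-encode the UTF-8 bytes of a string, as a browser would for a URL path.
function percentEncodeUtf8(string $text): string
{
    return rawurlencode($text);
}

// Reinterpret a single byte as ISO-8859-1 and return that character as UTF-8.
function latin1ByteToUtf8(string $byte): string
{
    return mb_convert_encoding($byte, 'UTF-8', 'ISO-8859-1');
}

$meem = "\u{0645}"; // Arabic Letter Meem

echo percentEncodeUtf8($meem), "\n";          // %D9%85
echo bin2hex($meem), "\n";                    // d985
echo bin2hex(latin1ByteToUtf8("\x85")), "\n"; // c285, the UTF-8 form of U+0085 (NEL)
```

The second byte of the letter is therefore the one that turns into the NEL control character once the path is read as ISO-8859-1.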

Having said all that, this is quite theoretical as your example works as-is for me, on an old XAMPP 1.8.2 I had lying about on WinXP.
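The suspected effect can be imitated in PHP's own PCRE, under a loud assumption: the (*ANY) prefix below forces PCRE to treat every Unicode newline sequence, including the single byte 0x85, as a line ending. Whether a given Apache build behaves this way is exactly the open question, so treat this as a sketch of the mechanism rather than a reproduction of Apache. The function name matchesWholePath is hypothetical.

```php
<?php
// Sketch of the mechanism in PHP's PCRE, not in Apache.
// Assumption: (*ANY) makes NEL (byte 0x85 in non-UTF mode) count as a newline.

// Returns true when the pattern matches the given path.
function matchesWholePath(string $pattern, string $path): bool
{
    return preg_match($pattern, $path) === 1;
}

// Path bytes taken from the failing log entry in the question.
$path = "islamic_sources/sahifeh_sajadiyeh/306/\xd9\x85\xd8\xa8";

// The dot refuses to cross the 0x85 byte, so ^(.*)$ cannot span the whole path.
var_dump(matchesWholePath('/(*ANY)^(.*)$/', $path));

// The class [\s\S] accepts every byte, newline or not.
var_dump(matchesWholePath('/(*ANY)^([\s\S]*)$/', $path));
```

Under that assumption the first call reports no match and the second reports a match, which mirrors the log in the question: the pattern is applied, but no rewrite follows.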

2: using the [B] rewrite flag, to ensure all bytes are passed in correctly-URL-encoded form in the parameter.

Otherwise, non-ASCII characters would break for situations where Apache sends the query string to PHP through Windows environment variables. The Windows environment is Unicode, so Apache has to decode the bytes on writing and PHP has to encode them again on reading, and unfortunately those encodings don't match.

Apache uses ISO-8859-1 and PHP (via C stdlib) uses the ANSI code page, which depends on the locale of the Windows installation. On a Western install, you get code page 1252, which is close to ISO-8859-1 so only some of the bytes will be wrong (again, this includes the 0x85 in م); on other locales with other ANSI code pages all the non-ASCII characters will be wildly wrong.

This doesn't necessarily apply to you as XAMPP is using mod_php, which doesn't need to use the environment to pass strings. But it would make a difference in other hosting environments. In any case, without [B] you'll find URL-special characters in the string (ampersand, plus, percent) break the query parser.
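Both points can be illustrated from the PHP side. In this sketch rawurlencode only stands in for the escaping that [B] performs (it is not claimed to be byte-for-byte what Apache emits), the helper names are hypothetical, and mbstring is assumed to be available.

```php
<?php
// Illustration only. rawurlencode() stands in for the escaping done by [B].

// Parse a query string the way PHP populates $_GET.
function parseQuery(string $query): array
{
    parse_str($query, $params);
    return $params;
}

// Reinterpret one byte as Windows-1252 and return that character as UTF-8.
function cp1252ByteToUtf8(string $byte): string
{
    return mb_convert_encoding($byte, 'UTF-8', 'Windows-1252');
}

$arg = 'Class/Method/1/rock&roll'; // made-up path containing an ampersand

$raw     = parseQuery('rt=' . $arg);               // unescaped back-reference
$escaped = parseQuery('rt=' . rawurlencode($arg)); // escaped back-reference

var_dump($raw['rt']);     // "Class/Method/1/rock", the rest became a parameter named "roll"
var_dump($escaped['rt']); // "Class/Method/1/rock&roll"

// The same byte 0x85 is a different character in code page 1252:
echo bin2hex(cp1252ByteToUtf8("\x85")), "\n"; // e280a6, i.e. U+2026 (horizontal ellipsis)
```

The first half shows why an unescaped ampersand truncates rt; the second shows why a byte written as ISO-8859-1 and read back as code page 1252 no longer round-trips.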

answered Oct 18 '22 04:10 by bobince