
Get PHP to stop replacing '.' characters in $_GET or $_POST arrays?

Here's PHP.net's explanation of why it does it:

Dots in incoming variable names

Typically, PHP does not alter the names of variables when they are passed into a script. However, it should be noted that the dot (period, full stop) is not a valid character in a PHP variable name. For the reason, look at it:

<?php
$varname.ext;  /* invalid variable name */
?>

Now, what the parser sees is a variable named $varname, followed by the string concatenation operator, followed by the barestring (i.e. unquoted string which doesn't match any known key or reserved words) 'ext'. Obviously, this doesn't have the intended result.

For this reason, it is important to note that PHP will automatically replace any dots in incoming variable names with underscores.

That's from http://ca.php.net/variables.external.

Also, according to this comment these other characters are converted to underscores:

The full list of field-name characters that PHP converts to _ (underscore) is the following (not just dot):

  • chr(32) ( ) (space)
  • chr(46) (.) (dot)
  • chr(91) ([) (open square bracket)
  • chr(128) - chr(159) (various)

So it looks like you're stuck with it, and you'll have to convert the underscores back to dots in your script using dawnerd's suggestion (I'd just use str_replace, though).
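To see the conversion without submitting a form, here is a minimal sketch. It relies on parse_str(), which mangles names the same way PHP does when it fills the superglobals, and then applies a naive str_replace() restore. The helper name restoreDots is hypothetical, and the restore is lossy: it assumes no original field name contained an underscore or a space.

```php
<?php
// parse_str() converts dots and spaces in names to underscores,
// just as PHP does when populating $_GET and $_POST.
parse_str('user.name=alice&first%20name=bob', $mangled);
// $mangled now has the keys 'user_name' and 'first_name'.

// Hypothetical helper: turn every underscore in a key back into a dot.
// Assumption: no original name contained an underscore or a space.
function restoreDots($input)
{
    $restored = array();
    foreach ($input as $key => $value) {
        $restored[str_replace('_', '.', (string) $key)] = $value;
    }
    return $restored;
}

$restored = restoreDots($mangled);
// 'first name' comes back as 'first.name': the original character is lost.
```

The second key shows why this only works when you control the field names and know which character was replaced.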


Long-since answered question, but there is actually a better answer (or work-around). PHP lets you get at the raw input stream, so you can do something like this:

$query_string = file_get_contents('php://input');

which will give you the raw POST body in query-string format (for form-encoded requests), with the periods left as they were sent.

You can then parse it if you need to (as per POSTer's comment):

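<?php
// Function to work around PHP mangling input names that contain dots, etc.
// `$source` can be either 'POST' or 'GET'.
// Note: repeated names and bracketed array names are not expanded here.
function getRealInput($source) {
    $raw = $source == 'POST' ? file_get_contents("php://input") : $_SERVER['QUERY_STRING'];
    $vars = array();
    foreach (explode("&", $raw) as $pair) {
        if ($pair === '') {
            continue; // empty input or a stray "&"
        }
        // Limit to 2 parts so a literal "=" inside the value is preserved
        $nv = explode("=", $pair, 2);
        $name = urldecode($nv[0]);
        // A name sent without "=" gets an empty value instead of a notice
        $value = isset($nv[1]) ? urldecode($nv[1]) : '';
        $vars[$name] = $value;
    }
    return $vars;
}

// Wrapper functions specifically for GET and POST:
function getRealGET() { return getRealInput('GET'); }
function getRealPOST() { return getRealInput('POST'); }
?>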

Hugely useful for OpenID parameters, which contain both '.' and '_', each with a certain meaning!
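As a small illustration of the OpenID case, the sketch below applies the same split-and-decode approach to a raw query string passed in as an argument, so it can be tried without a live request. The function name parseRawPairs and the sample parameter values are made up for this example.

```php
<?php
// Hypothetical variant of the parser above that takes the raw string directly.
function parseRawPairs($raw)
{
    $vars = array();
    foreach (explode('&', $raw) as $pair) {
        if ($pair === '') {
            continue; // skip empty segments such as a trailing "&"
        }
        $nv = explode('=', $pair, 2);
        $vars[urldecode($nv[0])] = isset($nv[1]) ? urldecode($nv[1]) : '';
    }
    return $vars;
}

// Sample values are invented; only the parameter naming style matters here.
$raw = 'openid.mode=id_res&openid.claimed_id=https%3A%2F%2Fexample.com%2Fuser';
$params = parseRawPairs($raw);
// Both the dot and the underscore in "openid.claimed_id" are kept intact.

// For comparison, PHP's own parser collapses the dot into an underscore.
parse_str($raw, $mangled);
```

With PHP's own parsing, the dot in "openid.claimed_id" becomes an underscore, so the two separators can no longer be told apart.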


Highlighting an actual answer by Johan in a comment above - I just wrapped my entire post in a top-level array which completely bypasses the problem with no heavy processing required.

In the form you do

<input name="data[database.username]">  
<input name="data[database.password]">  
<input name="data[something.else.really.deep]">  

instead of

<input name="database.username"> 
<input name="database.password"> 
<input name="something.else.really.deep">  

and in the post handler, just unwrap it:

$postdata = $_POST['data'];

For me this was a two-line change, as my views were entirely templated.

FYI. I am using dots in my field names to edit trees of grouped data.
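The effect of the wrapper array can be checked with parse_str(), which handles names the same way as $_POST. This is only a sketch: the request body below is a hand-written stand-in for what a browser would submit for the form above, and the values are invented.

```php
<?php
// Simulated form body; "[" and "]" are percent-encoded as a browser sends them.
$body = 'data%5Bdatabase.username%5D=root&data%5Bsomething.else.really.deep%5D=1';
parse_str($body, $post);

// Only the top-level name ("data") is subject to mangling; keys inside the
// brackets are left alone, so the dots survive.
$postdata = $post['data'];
```

Because "data" itself contains no dots or spaces, nothing is rewritten and the inner keys arrive exactly as named in the form.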


Do you want a solution that is standards compliant and works with deep arrays (for example: ?param[2][5]=10)?

To fix all possible sources of this problem, you can apply at the very top of your PHP code:

$_GET    = fix( $_SERVER['QUERY_STRING'] );
$_POST   = fix( file_get_contents('php://input') );
$_COOKIE = fix( $_SERVER['HTTP_COOKIE'] );

The idea behind this function is one I came up with during my summer vacation of 2013. Don't be put off by the simple regex: it just grabs all query names, encodes them (so dots are preserved), and then uses the normal parse_str() function.

function fix($source) {
    $source = preg_replace_callback(
        '/(^|(?<=&))[^=[&]+/',
        function($key) { return bin2hex(urldecode($key[0])); },
        $source
    );

    parse_str($source, $post);
    
    $result = array();
    foreach ($post as $key => $val) {
        $result[hex2bin($key)] = $val;
    }
    return $result;
}

p.s.: If you use this solution in your project, please attribute the function with @author Rok Kralj.
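To make the three steps concrete, here is a walk-through of the same technique on a single deep-array parameter. It is a sketch for illustration, not a replacement for the function above; the sample input a.b[2][5]=10 is made up.

```php
<?php
$source = 'a.b[2][5]=10';

// Step 1: hex-encode only the top-level name, leaving the brackets untouched.
$encoded = preg_replace_callback(
    '/(^|(?<=&))[^=[&]+/',
    function ($key) { return bin2hex(urldecode($key[0])); },
    $source
);
// $encoded is now '612e62[2][5]=10' (61 = "a", 2e = ".", 62 = "b").

// Step 2: parse_str() finds no dots to mangle and builds the nested array.
parse_str($encoded, $parsed);

// Step 3: decode the top-level key back to its original form.
$result = array();
foreach ($parsed as $key => $val) {
    $result[hex2bin($key)] = $val;
}
```

The nested part of the name never passes through the hex step, which is why array syntax keeps working.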


This happens because a period is an invalid character in a variable's name, the reason for which lies very deep in the implementation of PHP, so there are no easy fixes (yet).

In the meantime you can work around this issue by:

  1. Accessing the raw query data via either php://input for POST data or $_SERVER['QUERY_STRING'] for GET data
  2. Using a conversion function.

The below conversion function (PHP >= 5.4) encodes the names of each key-value pair into a hexadecimal representation and then performs a regular parse_str(); once done, it reverts the hexadecimal names back into their original form:

function parse_qs($data)
{
    // "&" is excluded so a name sent without "=" cannot swallow the next pair
    $data = preg_replace_callback('/(?:^|(?<=&))[^=&[]+/', function($match) {
        return bin2hex(urldecode($match[0]));
    }, $data);

    parse_str($data, $values);

    return array_combine(array_map('hex2bin', array_keys($values)), $values);
}

// work with the raw query string
$data = parse_qs($_SERVER['QUERY_STRING']);

Or:

// handle posted data (this only works with application/x-www-form-urlencoded)
$data = parse_qs(file_get_contents('php://input'));
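Since the raw-body approach only applies to form-urlencoded requests, it can help to check the request's content type before parsing. The sketch below is one way to do that; the helper name isFormUrlencoded is hypothetical, and it assumes the server exposes the header as $_SERVER['CONTENT_TYPE'].

```php
<?php
// Hypothetical helper: true when the Content-Type value denotes a
// form-urlencoded body, ignoring case and parameters like "; charset=UTF-8".
function isFormUrlencoded($contentType)
{
    $parts = explode(';', (string) $contentType);
    $mediaType = strtolower(trim($parts[0]));
    return $mediaType === 'application/x-www-form-urlencoded';
}

// Assumption: the header is available under this key; fall back to ''.
$contentType = isset($_SERVER['CONTENT_TYPE']) ? $_SERVER['CONTENT_TYPE'] : '';
$canParseBody = isFormUrlencoded($contentType);
```

When $canParseBody is false (for example with multipart/form-data uploads), falling back to the regular $_POST array is probably the safer choice.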


This approach is an altered version of Rok Kralj's, tweaked to work correctly, to improve efficiency (it avoids unnecessary callbacks, encoding, and decoding on unaffected keys), and to handle array keys correctly.

A gist with tests is available and any feedback or suggestions are welcome here or there.

// Originally a class method; declared here as a plain function so it runs standalone.
function fix(&$target, $source, $keep = false) {
    if (!$source) {                                                            
        return;                                                                
    }                                                                          
    $keys = array();                                                           

    $source = preg_replace_callback(                                           
        '/                                                                     
        # Match at start of string or &                                        
        (?:^|(?<=&))                                                           
        # Exclude cases where the period is in brackets, e.g. foo[bar.blarg]
        [^=&\[]*                                                               
        # Affected cases: periods and spaces                                   
        (?:\.|%20)                                                             
        # Keep matching until assignment, next variable, end of string or   
        # start of an array                                                    
        [^=&\[]*                                                               
        /x',                                                                   
        function ($key) use (&$keys) {                                         
            $keys[] = $key = base64_encode(urldecode($key[0]));                
            return urlencode($key);                                            
        },                                                                     
        $source
    );                                                                         

    if (!$keep) {                                                              
        $target = array();                                                     
    }                                                                          

    parse_str($source, $data);                                                 
    foreach ($data as $key => $val) {                                          
        // Only unprocess encoded keys                                      
        if (!in_array($key, $keys)) {                                          
            $target[$key] = $val;                                              
            continue;                                                          
        }                                                                      

        $key = base64_decode($key);                                            
        $target[$key] = $val;                                                  

        if ($keep) {                                                           
            // Keep a copy in the underscore key version                       
            $key = preg_replace('/(\.| )/', '_', $key);                        
            $target[$key] = $val;                                              
        }                                                                      
    }                                                                          
}                                                                              

This happens because of PHP's old register_globals functionality. The . character is not a valid character in a variable name, so PHP converts it to an underscore to keep incoming names compatible with variable names.

In short, it's not good practice to use periods in URL parameter names.