A few years ago, I posted an answer to a question about a way, in PHP, to let the user pass in the URI the relative path to the file to download, while preventing directory traversal.
I got a few comments telling that the code is insecure, and a few downvotes (the most recent being today). Here's the code:
$path = $_GET['path'];
if (strpos($path, '../') !== false ||
strpos($path, "..\\") !== false ||
strpos($path, '/..') !== false ||
strpos($path, '\..') !== false)
{
// Strange things happening.
}
else
{
// The request is probably safe.
if (file_exists(dirname(__FILE__) . DIRECTORY_SEPARATOR . $path))
{
// Send the file.
}
else
{
// Handle the case where the file doesn't exist.
}
}
I reviewed the code again and again, tested it, and still can't understand what's the security issue it introduces.
The only hint I got in the comments is that ../
can be replaced by %2e%2e%2f
. This is not an issue, since PHP will automatically transform it into ../
.
What is the problem with this piece of code? What could be the value of the input which would allow directory traversal or break something in some way?
There are lots of other possibilities that could slip through, such as:
.htaccess
some-secret-file-with-a-password-in-it.php
In other words, anything in the directory or a subdirectory would be accessible, including .htaccess
files and source code. If anything in that directory or its subdirectories should not be downloadable, then that's a security hole.
I've just ran your code through Burp intruder and cannot find any way round it in this case.
It was probably down voted due to exploits against other/old technology stacks which employed a similar approach by blacklisting certain character combinations.
As you mention, the current version of PHP automatically URL decodes input, but there have been flaws where techniques such as double URL encoding (dot = %252e
), 16 bit Unicode encoding (dot = %u002e
), overlong UTF-8 Unicode encoding (dot = %c0%2e
) or inserting a null byte (%00
) could trick the filter and allow the server side code to interpret the path as the unencoded version once it had been given a thumbs up by the filter.
This is why it has set alarm bells ringing. Even though your approach appears to work here, generally it may not be the case. Technology is always changing and it is always best to err on the side of caution and use techniques that are immune to character set interpretations wherever possible such as using whitelists of known good characters that will likely to be always good, or using a file system function (realpath was mentioned in the linked answer) to verify that the actual path is the one you're expecting.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With