How do you sanitize data in $_GET -variables by PHP?
I sanitize only one variable in GET by strip_tags.
I am not sure whether I should sanitize everything or not, because the last time I put data into Postgres, the problem was most easily solved by using pg_prepare.
Sanitizing data means removing any illegal characters from the data. Sanitizing user input is one of the most common tasks in a web application. To make this task easier, PHP provides a native filter extension that you can use to sanitize data such as e-mail addresses, URLs, and IP addresses.
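As a rough illustration of the filter extension, here is a minimal sketch using filter_var. The sample values and variable names are made up for the example. Note that this checks or cleans individual fields; it does not make a value safe to embed in SQL or HTML.

```php
<?php
// Hypothetical raw values, as they might arrive in $_GET.
$rawEmail = ' alice@example.com ';
$rawAge   = '42abc';

// FILTER_VALIDATE_EMAIL returns the address if it looks valid, false otherwise.
$email = filter_var(trim($rawEmail), FILTER_VALIDATE_EMAIL);

// FILTER_VALIDATE_INT returns an int, or false if the input is not a whole
// number inside the given range.
$age = filter_var($rawAge, FILTER_VALIDATE_INT, [
    'options' => ['min_range' => 0, 'max_range' => 150],
]);

// FILTER_SANITIZE_NUMBER_INT removes every character except digits, + and -.
$digits = filter_var($rawAge, FILTER_SANITIZE_NUMBER_INT);

var_dump($email, $age, $digits);
```

In a real request you would more likely read the value with filter_input(INPUT_GET, 'email', FILTER_VALIDATE_EMAIL), which applies the same filter directly to the query string parameter.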
The important point when using PDO is: PDO will only make your data safe for SQL, not for the rest of your application. So yes, for writes, such as INSERT or UPDATE, it's especially critical to still filter your data first and sanitize it for other things (removal of HTML tags, JavaScript, etc).
How do you sanitize data in $_GET -variables by PHP?
You do not sanitize data in $_GET. This is a common approach in PHP scripts, but it's completely wrong*.
All your variables should stay in plain text form until the point when you embed them in another type of string. There is no one form of escaping or ‘sanitization’ that can cover all possible types of string you might be embedding your values into.
So if you're embedding a string into an SQL query, you need to escape it on the way out:
$sql= "SELECT * FROM accounts WHERE username='".pg_escape_string($_GET['username'])."'";
And if you're spitting the string out into HTML, you need to escape it then:
Cannot log in as <?php echo(htmlspecialchars($_GET['username'], ENT_QUOTES)) ?>.
If you did both of these escaping steps on the $_GET array at the start, as recommended by people who don't know what they're doing:
$_GET['username']= htmlspecialchars(pg_escape_string($_GET['username']));
Then when you had a ‘&’ in your username, it would mysteriously turn into ‘&amp;’ in your database, and if you had an apostrophe in your username, it would turn into two apostrophes on the page. Then, when you have a form with these characters in it, it is easy to end up double-escaping things when they're edited, which is why so many bad PHP CMSs end up with broken article titles like “New books from O\\\\\\\\\\\\\\\\\\\'Reilly”.
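To make the double-escaping problem concrete, here is a small sketch. The username is a made-up value; the point is only to compare escaping once at output with escaping up front and then again later.

```php
<?php
// Hypothetical value a user might submit.
$username = "O'Reilly & Sons";

// Wrong: HTML-escape at the start of the script...
$early = htmlspecialchars($username, ENT_QUOTES, 'UTF-8');
// ...and then again when the value is finally printed or re-edited.
$double = htmlspecialchars($early, ENT_QUOTES, 'UTF-8');

// Right: keep the raw value and escape exactly once, at output time.
$once = htmlspecialchars($username, ENT_QUOTES, 'UTF-8');

echo $once, "\n";   // O&#039;Reilly &amp; Sons
echo $double, "\n"; // O&amp;#039;Reilly &amp;amp; Sons
```

A browser renders the first line as the original name, while the second shows literal entity text to the user.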
Naturally, remembering to pg_escape_string or mysql_real_escape_string, and htmlspecialchars every time you send a variable out is a bit tedious, which is why everyone wants to do it (incorrectly) in one place at the start of the script. For HTML output, you can at least save some typing by defining a function with a short name that does echo(htmlspecialchars(...)).
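A minimal sketch of such a short-named helper follows. The name h is just a common convention, not something built into PHP, and this version returns the escaped string rather than echoing it so it can be used inside larger expressions.

```php
<?php
// Hypothetical helper: escape a value for embedding in HTML text or attributes.
function h(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// Made-up input standing in for $_GET['username'].
$username = '<b>"guest"</b>';

$message = 'Cannot log in as ' . h($username) . '.';
echo $message, "\n";
```

In a template this becomes <?= h($username) ?>, which is short enough that there is little excuse to skip it.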
For SQL, you're better off using parameterised queries. For Postgres there's pg_query_params. Or indeed, prepared statements as you mentioned (though I personally find them less manageable). Either way, you can then forget about ‘sanitizing’ or escaping for SQL, but you must still escape if you embed in other types of string including HTML.
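For comparison with the pg_escape_string query earlier, here is a sketch of the same lookup using pg_query_params. The function name find_account is hypothetical, the accounts table is the one from the example above, and the connection is assumed to come from pg_connect with whatever connection string your setup uses.

```php
<?php
/**
 * Hypothetical helper: look up one account row by username.
 *
 * @param mixed $conn A connection returned by pg_connect().
 */
function find_account($conn, string $username): ?array
{
    // $1 is a placeholder; the value travels separately from the SQL text,
    // so no escaping of $username is needed here.
    $result = pg_query_params(
        $conn,
        'SELECT * FROM accounts WHERE username = $1',
        [$username]
    );
    if ($result === false) {
        throw new RuntimeException(pg_last_error($conn));
    }

    $row = pg_fetch_assoc($result);
    return $row === false ? null : $row;
}

// Usage, assuming a reachable database:
// $conn = pg_connect('dbname=example');
// $account = find_account($conn, $_GET['username'] ?? '');
```

The raw value from $_GET goes straight into the parameter array; it only needs HTML escaping later, if and when it is printed.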
strip_tags() is not a good way of treating input for HTML display. In the past it has had security problems, as browser parsers are actually much more complicated in their interpretation of what a tag can be than you might think. htmlspecialchars() is almost always the right thing to use instead, so that if someone types a less-than sign they'll actually get a literal less-than sign and not find half their text mysteriously vanishing.
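A quick way to see the difference is to run both functions on text containing a less-than sign. The sample comment is made up; the exact output of strip_tags may vary, so only the htmlspecialchars result is spelled out.

```php
<?php
// Hypothetical user comment containing a literal less-than sign.
$comment = 'I <3 PHP';

// strip_tags treats the "<3 PHP" part as the start of a tag and discards text.
$stripped = strip_tags($comment);

// htmlspecialchars keeps every character and encodes the special ones.
$escaped = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');

var_dump($stripped);
echo $escaped, "\n"; // I &lt;3 PHP
```

The escaped version displays in the browser exactly as the user typed it, which is usually what they expect.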
(*: as a general approach to solving injection problems, anyway. Naturally there are domain-specific checks it is worth doing on particular fields, and there are useful cleanup tasks you can do like removing all control characters from submitted values. But this is not what most PHP coders mean by sanitization.)
If you're talking about sanitizing output, I would recommend storing content in your database in its full, unescaped form, and then escaping it (htmlspecialchars or something) when you are echoing out the data. That way you have more options for outputting. See this question for a discussion of sanitising/escaping database content.
In terms of storing in Postgres, use pg_escape_string on each variable in the query to escape quotes and generally protect against SQL injection.
Edit:
My usual steps for storing data in a database, and then retrieving it, are:
Call the database data escaping function (pg_escape_string, mysql_real_escape_string, etc.) to escape each incoming $_GET variable used in your query. Note that using these functions instead of addslashes means there are no extra slashes in the text when it is stored in the database.
When you get the data back out of the database, you can just use htmlspecialchars on any outputted data, no need to use stripslashes, since there should be no extra slashes.