I have an HTML form that accepts user-entered text of about 1000 characters, which is submitted to a PHP page and stored in a MySQL database. I use PDO with prepared statements to prevent SQL injection. But what else should I do to sanitize the text entered by the user?
I want to prevent script injection, XSS attacks, etc.
The Basics. The first lesson anyone learns when setting up a web-to-database—or anything-to-database gateway where untrusted user input is concerned—is to always, always sanitize every input.
Sanitization is the process of removing sensitive information from a document or other message (or sometimes encrypting it), so that the document may be distributed to a broader audience.
Input sanitization is a cybersecurity measure that checks, cleans, and filters data inputs from users, APIs, and web services, removing unwanted characters and strings to prevent the injection of harmful code into the system.
Security is an interesting concept and attracts a lot of people to it. Unfortunately it's a complex subject and even the professionals get it wrong. I've found security holes in Google (CSRF), Facebook (more CSRF), several major online retailers (mainly SQL injection / XSS), as well as thousands of smaller sites both corporate and personal.
These are my recommendations:
1) Use parameterised queries
Parameterised queries force the values passed to the query to be treated as separate data, so that the input values cannot be parsed as SQL code by the DBMS. A lot of people will recommend that you escape your strings using mysql_real_escape_string(), but contrary to popular belief it is not a catch-all solution to SQL injection. Take this query for example:
SELECT * FROM users WHERE userID = $_GET['userid']
If $_GET['userid'] is set to 1 OR 1=1, there are no special characters and it will not be filtered. This results in all rows being returned. Or, even worse, what if it's set to 1 OR is_admin = 1?
Parameterised queries prevent this kind of injection from occurring.
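As a rough illustration of the point above, here is a minimal PDO sketch. It assumes an in-memory SQLite database in place of MySQL so that it runs on its own, and the users table layout and the findUsersById() helper are made up for the example.

```php
<?php
// Assumption: in-memory SQLite stands in for MySQL so the sketch is self-contained.
// The table layout below is hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (userID INTEGER PRIMARY KEY, name TEXT, is_admin INTEGER)');
$pdo->exec("INSERT INTO users (userID, name, is_admin) VALUES (1, 'alice', 0), (2, 'bob', 1)");

/**
 * Hypothetical helper: look up users by ID with the value bound as a
 * parameter instead of being concatenated into the SQL string.
 */
function findUsersById(PDO $pdo, string $userId): array
{
    $stmt = $pdo->prepare('SELECT userID, name FROM users WHERE userID = :id');
    $stmt->execute([':id' => $userId]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

// A normal request.
$legit = findUsersById($pdo, '1');

// The injection attempt: the whole string is compared as one value,
// so the "OR 1=1" part is never parsed as SQL.
$attack = findUsersById($pdo, '1 OR 1=1');
```

The SQL text and the user-supplied value travel separately, so the shape of the query is fixed before the input is ever looked at.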
2) Validate your inputs
Parameterised queries are great, but sometimes unexpected values might cause problems with your code. Make sure that you're validating that they're within range and that they won't allow the current user to alter something they shouldn't be able to.
For example, you might have a password change form that sends a POST request to a script that changes the user's password. If you place their user ID as a hidden variable in the form, they could change it. Sending id=123 instead of id=321 might mean they change someone else's password. Make sure that EVERYTHING is validated correctly, in terms of type, range and access.
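The type, range and access checks could look roughly like this. The function name validateOwnUserId() is hypothetical, and the sketch assumes the ID of the logged-in user is already known on the server (for example from the session) rather than taken from the form.

```php
<?php
/**
 * Hypothetical helper: accept a submitted user ID only if it is a positive
 * integer (type and range) and belongs to the logged-in user (access).
 * Returns the validated ID, or null if any check fails.
 */
function validateOwnUserId($rawId, int $loggedInUserId): ?int
{
    // Type and range: must be an integer >= 1. filter_var() returns false otherwise.
    $id = filter_var($rawId, FILTER_VALIDATE_INT, ['options' => ['min_range' => 1]]);
    if ($id === false) {
        return null;
    }

    // Access: a user may only change their own password.
    if ($id !== $loggedInUserId) {
        return null;
    }

    return $id;
}

$own      = validateOwnUserId('321', 321);      // the user's own ID
$other    = validateOwnUserId('123', 321);      // someone else's ID
$injected = validateOwnUserId('1 OR 1=1', 321); // not an integer at all
```

In practice it is simpler still to ignore the submitted ID entirely and use the server-side one, but the same three checks apply to any value that does have to come from the client.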
3) Use htmlspecialchars to escape displayed user-input
Let's say your user enters their "about me" as something like this:
</div><script>alert('hello!');</script><div>
The problem with this is that your output will contain markup that the user entered. Trying to filter this yourself with blacklists is just a bad idea. Use htmlspecialchars to escape the string when you output it, so that HTML special characters are converted to HTML entities and the browser displays them as text instead of interpreting them as markup.
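A small sketch of escaping on output, using a payload like the one above. The wrapper name e() is just a convention chosen for this example; the flags and the UTF-8 charset are assumptions about the page being served.

```php
<?php
/**
 * Hypothetical wrapper: escape a string for use in HTML text or attribute values.
 * ENT_QUOTES also escapes single quotes; ENT_SUBSTITUTE replaces invalid
 * byte sequences instead of returning an empty string.
 */
function e(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

$aboutMe = "</div><script>alert('hello!');</script><div>";
$safe = e($aboutMe);

// $safe can now be echoed into the page; the browser shows the tags as text.
```

Store the text as the user entered it and escape at the point of output, since the right escaping depends on where the value ends up (HTML body, attribute, JavaScript, URL).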
4) Don't use $_REQUEST
Cross-site request forgery (CSRF) attacks work by getting the user to click a link or visit a URL that represents a script that performs an action on a site for which they are logged in. The $_REQUEST variable is a combination of $_GET, $_POST and $_COOKIE, which means that you can't tell the difference between a variable that was sent in a POST request (i.e. through an input tag in your form) and a variable that was set in your URL as part of a GET (e.g. page.php?id=1).
Let's say the user wants to send a private message to someone. They might send a POST request to sendmessage.php, with to, subject and message as parameters. Now let's imagine someone sends a GET request instead:
sendmessage.php?to=someone&subject=SPAM&message=VIAGRA!
If you're using $_POST, you won't see any of those parameters, as they are set in $_GET instead. Your code won't see $_POST['to'] or any of the other variables, so it won't send the message. However, if you're using $_REQUEST, the $_GET and $_POST values are merged, so an attacker can set those parameters as part of the URL. When the user visits that URL, they inadvertently send the message. The really worrying part is that the user doesn't even have to click anything. If the attacker creates a malicious page, it could contain an iframe that points to the URL. Example:
<iframe src="http://yoursite.com/sendmessage.php?to=someone&subject=SPAM&message=VIAGRA!"></iframe>
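This results in the user sending messages to people without ever realising they did anything. For this reason, you should avoid $_REQUEST and use $_POST and $_GET instead. Note that this only blocks the URL-based variant of the attack: a malicious page can also auto-submit a hidden POST form, so state-changing requests should additionally be protected with a per-session CSRF token.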
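One way to sketch the sendmessage.php handling described above is to read the parameters only from the POST body and only when the request method really is POST. The function readMessageParams() is hypothetical, and it takes the method and the POST array as arguments so it can be tried without a web server.

```php
<?php
/**
 * Hypothetical helper: return the message fields only if the request was a
 * POST and all three fields are present as strings; otherwise return null.
 */
function readMessageParams(string $method, array $post): ?array
{
    if ($method !== 'POST') {
        return null; // parameters smuggled in via the URL are ignored
    }

    $params = [];
    foreach (['to', 'subject', 'message'] as $key) {
        if (!isset($post[$key]) || !is_string($post[$key])) {
            return null;
        }
        $params[$key] = $post[$key];
    }

    return $params;
}

// Simulated requests: a forged GET (empty POST body) and a genuine form submission.
$forged  = readMessageParams('GET', []);
$genuine = readMessageParams('POST', ['to' => 'someone', 'subject' => 'Hi', 'message' => 'Hello']);
```

In a real script the call would be along the lines of readMessageParams($_SERVER['REQUEST_METHOD'], $_POST).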
5) Treat everything you're given as suspicious (or even malicious)
You have no idea what the user is sending you. It could be legitimate. It could be an attack. Never trust anything a user has sent you. Convert to correct types, validate the inputs, use whitelists to filter where necessary (avoid blacklists). This includes anything sent via $_GET, $_POST, $_COOKIE and $_FILES.
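To make "convert to correct types" and "use whitelists" a little more concrete, here is a short sketch. The column names, the page limits and both helper functions are invented for the example.

```php
<?php
// Hypothetical whitelist of values the application is prepared to accept.
const ALLOWED_SORT_COLUMNS = ['name', 'created_at'];

/**
 * Whitelist: return the requested sort column only if it is on the list,
 * otherwise fall back to a safe default.
 */
function pickSortColumn($raw): string
{
    return (is_string($raw) && in_array($raw, ALLOWED_SORT_COLUMNS, true)) ? $raw : 'name';
}

/**
 * Type conversion with a range check: return an integer page number
 * between 1 and 1000, or 1 if the input is anything else.
 */
function pickPageNumber($raw): int
{
    $page = filter_var($raw, FILTER_VALIDATE_INT, ['options' => ['min_range' => 1, 'max_range' => 1000]]);
    return $page === false ? 1 : $page;
}

$column = pickSortColumn('name; DROP TABLE users'); // not on the list, default is used
$page   = pickPageNumber('3');
```

A whitelist states what is allowed and rejects everything else, so it does not need updating every time a new attack string is discovered, which is the weakness of a blacklist.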
If you follow these guidelines, you'll be in reasonably good standing in terms of security.