 

PHP / MySQL - Safe characters for display names / usernames / passwords, with PDO

a bit of a PHP / MySQL newbie here...

I've been building a PHP-based site that uses a MySQL database for storing user information, like their display names, usernames, and passwords.

I've been learning about escaping, prepared statements and the like, and how to prevent SQL injections like "bobby'); drop table users--".

I'm using PDO prepared statements to get user input from forms, in order to register them into the DB. However, I need to know a few things:

  1. Since I am using prepared statements, for display names, usernames, passwords, etc, is it okay for me to allow special characters like @, #, $, or even 'single' or "double" quotes? And what about spaces, international characters, characters with accents, or things like ♥ ? And when I ask if it's "okay" to allow these characters, I'm wondering if there are any further security risks that may arise from allowing quotes or parentheses in people's usernames, or things like html tags for bold or italics?

  2. If it is okay to allow most special characters, but not some: are there any specific "dangerous" characters (in the scope of MySQL) which I absolutely need to make illegal? (I feel like quotes may fit this agenda, but I'm getting mixed signals on that.)

  3. If I were to allow characters outside of the typical "alphanumeric and underscore" range, are there any pitfalls I may experience later (in MySQL, SQL, or PHP) from allowing strange characters? Will I need to somehow make html tags appear as strings, rather than actual tags, when displaying people's usernames? Or would I need to escape quotes in people's usernames whenever I wanted to query with them? Or does none of this matter since I'll be using prepared statements with PDO?

  4. Do charsets like utf8 or utf16 come in anywhere, in making it so I can accept the widest range of display names and usernames, while still making sure those alphabets can be rendered on my website?

  5. I know that there are some Cyrillic letters that look identical to English ones. I used to copy these straight out of MS Word and use them in my usernames. I realize that these can be used to perceptually-impersonate other members, simply by swapping out an English "a" for a Cyrillic "a". Usernames with ♥ in them may be hard to search for if someone isn't well-versed in alt-code. Should this be a concern? What is your opinion on this?

Thanks in advance to whoever can give me some insight on this.

Jackson asked Jun 20 '12 05:06





1 Answer

This SQL Injection Cheat Sheet has several examples of MySQL queries you can test while still in development.

It's a great resource for working out what counts as "acceptable" input. Beyond that, you have to consider the entire lifecycle of a piece of data.

Typically a piece of data starts in an HTML form and is then POSTed to your PHP script (a user who wants to can also POST data directly, without going through the form). Your PHP script then (hopefully) sanitizes the data, and it is stored.
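As a minimal sketch of the "store" step, the following uses a PDO prepared statement so the value travels separately from the SQL text. It assumes an in-memory SQLite database purely so that it runs without a server; the table, column, and sample input are hypothetical. With MySQL you would swap in a DSN along the lines of `mysql:host=...;dbname=...;charset=utf8mb4`.

```php
<?php
// Assumption: in-memory SQLite stands in for MySQL so the sketch is runnable as-is.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, display_name TEXT NOT NULL)');

// Hypothetical input, as it might arrive in $_POST['display_name'].
$displayName = "bobby'); DROP TABLE users;-- ♥";

// The placeholder keeps the value out of the SQL string entirely.
$insert = $pdo->prepare('INSERT INTO users (display_name) VALUES (:name)');
$insert->execute([':name' => $displayName]);

// Querying by the same value needs no manual escaping either.
$select = $pdo->prepare('SELECT display_name FROM users WHERE display_name = :name');
$select->execute([':name' => $displayName]);
$stored = $select->fetchColumn();
```

Because the value is bound as a parameter, quotes, parentheses, and other special characters are stored as ordinary data. Nothing has to be banned for the sake of the query itself.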

While the data sits in the database, you might be running backups, exporting it to an SQL dump, or doing some other sort of maintenance.

The data will then be read back at some point. If it's written in a markup language such as Markdown, it might get compiled, and eventually it is sent to someone's browser, where it's probably injected into HTML and displayed.
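For the "injected into HTML" step, a common approach is to escape at output time rather than at storage time. The sketch below assumes a UTF-8 page; the helper name `e()` and the sample name are made up for illustration, while `htmlspecialchars()` is the standard PHP function.

```php
<?php
// Hypothetical helper: escape a value for use inside HTML text or attributes.
function e(string $value): string
{
    // ENT_QUOTES escapes both single and double quotes;
    // ENT_SUBSTITUTE replaces invalid UTF-8 sequences instead of returning an empty string.
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// A display name containing tags and quotes, stored exactly as the user typed it.
$displayName = '<b>Bobby</b> "O\'Brien" ♥';

// The tags are shown as literal text instead of being interpreted by the browser.
$html = '<span class="user">' . e($displayName) . '</span>';
echo $html, PHP_EOL;
```

Escaping at this point means the database keeps the original text, and each output context (HTML, JSON, a CSV export) can apply the escaping it needs.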

As you can see, there are a whole lot of places in a piece of data's lifecycle where things can go awry. If you fail to consider all of them, you may run into common problems such as backslashes that pile up each time you save and load the data, SQL errors, or vulnerability to attacks.
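The piling-up backslashes usually come from escaping by hand on every save without undoing it on load. Here is a small sketch of that anti-pattern, using PHP's `addslashes()` on a made-up name:

```php
<?php
// Hypothetical stored value.
$name = "O'Brien";

// Anti-pattern: escape manually on each save and never reverse it on load.
$savedOnce = addslashes($name);        // one backslash before the quote
$savedTwice = addslashes($savedOnce);  // the backslash itself is escaped as well

// Count the backslashes after each round trip.
$afterFirstSave = substr_count($savedOnce, '\\');
$afterSecondSave = substr_count($savedTwice, '\\');

echo $savedOnce, PHP_EOL;
echo $savedTwice, PHP_EOL;
```

Binding the raw value through a prepared statement, and escaping only when the value is output, avoids this kind of accumulation.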

What kind of data do you want to support? That's up to you. Just make sure you're handling it correctly.
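Whatever you decide to support, it can help to write the policy down as code. The following is only a sketch of one possible policy, not a recommendation: the function name, the 3 to 30 character limit, and the rule against mixing Latin and Cyrillic letters (one way to blunt the look-alike problem raised in the question) are all assumptions made for illustration.

```php
<?php
// Hypothetical policy: 3-30 characters, letters/digits/underscore/space from any
// alphabet, but never Latin and Cyrillic letters mixed within one name.
function isAcceptableUsername(string $name): bool
{
    // The /u modifier makes PCRE treat pattern and subject as UTF-8;
    // preg_match() returns false rather than 1 when the input is not valid UTF-8.
    if (preg_match('/^[\p{L}\p{N}_ ]{3,30}$/u', $name) !== 1) {
        return false;
    }

    $hasLatin = preg_match('/\p{Latin}/u', $name) === 1;
    $hasCyrillic = preg_match('/\p{Cyrillic}/u', $name) === 1;

    return !($hasLatin && $hasCyrillic);
}

var_dump(isAcceptableUsername('Jackson'));
var_dump(isAcceptableUsername("J\u{0430}ckson")); // Cyrillic "а" in place of Latin "a"
```

A check like this is about presentation and impersonation, not SQL safety: with prepared statements the query is safe whichever characters you allow.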

Dean Rather answered Oct 21 '22 09:10
