Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Superscript character in PHP causing a MySQLi select query to find 0 rows

I am using PHP 5.3.3 and MySQL 5.1.61. The column in question is using UTF-8 encoding and the PHP file is encoded in UTF-8 without BOM.

When doing a MySQLi query with a ² character in SQLyog on Windows, the query executes properly and the correct search result displays.

If I do this same exact query in PHP, it will execute but will show 0 affected_rows.

Here's what I tried:

  • Using both LIKE instead of =
  • Changing the encoding of the PHP file to ANSI, UTF-8 without BOM, and UTF-8
  • Doing 'SET NAMES utf-8' and 'latin1' before running the query
  • Did header('Content-Type: text/html; charset=UTF-8'); in PHP
  • Escaping using MySQLi::real_escape_string
  • Doing a filter_var($String, FILTER_SANITIZE_STRING)
  • Tried a MySQLi stmt bind

The only way I could get it to work properly is if I swapped the ² for a % and changed = to LIKE in PHP.

How can I get it query properly in PHP when using the ²?

like image 676
Kevin Ghadyani Avatar asked Sep 05 '12 08:09

Kevin Ghadyani


1 Answers

You should be able to get the query to work by ensuring the following:

Prepping PHP for UTF-8

You first need to make sure the PHP pages that will be issuing these queries are served as UTF-8 encoded pages. This will ensure that any UTF-8 output coming from the database is displayed properly. In Firefox, you can check to see if this is the case by visiting the page you're interested in and using the View Page Info menu item. When you do so, you should see UTF-8 as the value for the page's Encoding. If the page isn't being served as UTF-8, you can do so one of two ways. Either you can set the encoding in a call to header(), like this:

header('Content-Type: text/html; charset=UTF-8');

Or, you can use a meta tag in your page's head block:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

Prepping MySQL for UTF-8

Next up, you need to make sure the database is set up to use the UTF-8 encoding. This can be set at the server, database, table, or column levels. If you're on a shared host, you probably can only control the table and column levels of your hierarchy. If you have control of the server or database, you can check to see what character encoding they are using by issuing these two commands:

SHOW VARIABLES LIKE 'character_set_system';
SHOW VARIABLES LIKE 'character_set_database';

Changing the database level encoding can be done using a command like this:

(CREATE | ALTER) DATABASE ... DEFAULT CHARACTER SET utf8;

To see what character encoding a table uses, simply do:

SHOW CREATE TABLE myTable;

Similarly, here's how to change a table-level encoding:

(CREATE | ALTER) TABLE ... DEFAULT CHARACTER SET utf8;

I recommend setting the encoding as high as you possibly can in the hierarchy. This way, you don't have to remember to manually set it for new tables. Now, if your character encoding for a table is not already set to UTF-8, you can attempt to convert it using an alter statement like this:

ALTER TABLE ... CONVERT TO CHARACTER SET utf8;

Be very careful about using this statement! If you already have UTF-8 values in your tables, they may become corrupted when you attempt to convert. There are some ways to get around this, however.

Forcing MySQLi to Use UTF-8

Finally, before you connect to your database, make sure you issue the appropriate call to say that you are using the UTF-8 encoding. Here's how:

$db = new mysqli(DB_HOST, DB_USERNAME, DB_PASSWORD, DB_NAME);

// Change the character set to UTF-8 (have to do it early)
if(! $db->set_charset("utf8"))
{
    printf("Error loading character set utf8: %sn", $db->error);
}

Once you do that, everything should hopefully work as expected. The only characters you need to worry about encoding are the big 5 for HTML: <, >, ', ", and &. You can handle that using the htmlspecialchars() function.

If you want to read more (and get links to additional resources), feel free to check out the articles I wrote about this process. There are two parts: Unicode and the Web: Part 1, and Unicode and the Web: Part 2. Good luck!

like image 195
Jonah Bishop Avatar answered Oct 05 '22 23:10

Jonah Bishop