Can someone explain why, when I set everything to UTF-8, I keep getting those damn ���?
MySQL
Server version: 5.1.44
MySQL charset: UTF-8 Unicode (utf8)
I create a new database
name: utf8test
collation: utf8_general_ci
MySQL connection collation: utf8_general_ci
My SQL looks like this:
SET SQL_MODE="NO_AUTO_VALUE_ON_ZERO";
CREATE TABLE IF NOT EXISTS `test_table` (
  `test_id` int(11) NOT NULL,
  `test_text` text NOT NULL,
  PRIMARY KEY (`test_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
INSERT INTO `test_table` (`test_id`, `test_text`) VALUES
(1, 'hééélo'),
(2, 'wööörld');
My PHP / HTML:
<?php
$db_conn = mysql_connect("localhost", "root", "") or die("Can't connect to db");
mysql_select_db("utf8test", $db_conn) or die("Can't select db");
// $result = mysql_query("set names 'utf8'"); // this works... why??
$query = "SELECT * FROM test_table";
$result = mysql_query($query);
$output = "";
while ($row = mysql_fetch_assoc($result)) {
    $output .= "id: " . $row['test_id'] . " - text: " . $row['test_text'] . "<br />";
}
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html lang="it" xmlns="http://www.w3.org/1999/xhtml" xml:lang="it">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>UTF-8 test</title>
</head>
<body>
<?php echo $output; ?>
</body>
</html>
The difference between utf8 and utf8mb4 is that the former can only store characters that take up to 3 bytes each, while the latter can also store 4-byte characters. In Unicode terms, utf8 can only store characters in the Basic Multilingual Plane, while utf8mb4 can store any Unicode character.
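To make the 3-byte versus 4-byte distinction concrete, here is a minimal PHP sketch. The helper name fits_in_utf8mb3 is hypothetical, and the sample strings are written as byte escapes so the result does not depend on how the file itself is saved.

```php
<?php
// Sample characters as raw UTF-8 bytes.
$samples = [
    'e-acute' => "\xC3\xA9",          // U+00E9, 2 bytes
    'euro'    => "\xE2\x82\xAC",      // U+20AC, 3 bytes
    'emoji'   => "\xF0\x9F\x98\x80",  // U+1F600, 4 bytes
];

// Hypothetical helper: true when every character lies inside the
// Basic Multilingual Plane, i.e. needs at most 3 bytes in UTF-8.
// Returns false for invalid UTF-8 input as well, because preg_match
// then returns false rather than 0.
function fits_in_utf8mb3(string $text): bool
{
    return preg_match('/[\x{10000}-\x{10FFFF}]/u', $text) === 0;
}

foreach ($samples as $name => $text) {
    printf(
        "%s: %d byte(s), fits in utf8: %s\n",
        $name,
        strlen($text), // strlen counts bytes, not characters
        fits_in_utf8mb3($text) ? 'yes' : 'no'
    );
}
```

Only the emoji fails the check, which is the kind of character that needs a utf8mb4 column.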
To change the character set encoding to UTF-8 for the database itself, type the following command at the mysql> prompt. Replace dbname with the database name:
ALTER DATABASE dbname CHARACTER SET utf8 COLLATE utf8_general_ci;
To exit the mysql program, type \q at the mysql> prompt.
utf8 has been used by MySQL as an alias for the utf8mb3 character set, but this usage is being phased out; as of MySQL 8.0.28, SHOW statements and columns of Information Schema tables display utf8mb3 instead. For more information, see Section 10.9.2, “The utf8mb3 Character Set (3-Byte UTF-8 Unicode Encoding)”.
If you're using MySQL 8.0, the default charset is utf8mb4. If you elect to use UTF-8 as your character set, always use utf8mb4 (with a collation such as utf8mb4_unicode_ci). You should not use MySQL's utf8, because it is limited to 3 bytes per character and is therefore different from proper UTF-8 encoding.
Try setting the character encoding right after the mysql_connect call, like this:
mysql_query ("set character_set_client='utf8'");
mysql_query ("set character_set_results='utf8'");
mysql_query ("set collation_connection='utf8_general_ci'");
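The mysql_* functions used in this thread were removed in PHP 7, so on a current PHP the same idea would have to go through another extension. A minimal sketch with mysqli follows; the helper name connect_utf8 is hypothetical, and the credentials in the usage note are the ones from the question.

```php
<?php
// Hypothetical helper: open a connection and switch it to UTF-8.
function connect_utf8(string $host, string $user, string $pass, string $db): mysqli
{
    $conn = mysqli_connect($host, $user, $pass, $db);
    if ($conn === false) {
        throw new RuntimeException("Can't connect to db: " . mysqli_connect_error());
    }

    // Sets the character set of the connection, which covers what the
    // separate SET statements above do for client and results.
    if (!mysqli_set_charset($conn, 'utf8')) {
        throw new RuntimeException("Can't set charset: " . mysqli_error($conn));
    }

    return $conn;
}
```

Usage would be something like $db_conn = connect_utf8("localhost", "root", "", "utf8test"); with 'utf8mb4' swapped in if the tables use that character set.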
"I set everything to UTF-8"
Not quite.
You have to tell MySQL your client's encoding.
As a matter of fact, you don't have to set up "everything" in UTF-8. You can have your tables in latin1 and output in UTF-8, or the other way around.
Very flexible.
But you have to set the client's encoding explicitly.
That's why it works with set names utf8: this query sets the client's encoding and lets MySQL know that data must be sent in UTF-8. Pretty sensible, huh?
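To see where the ��� come from, here is a small sketch of what probably happens at the byte level. It assumes the connection fell back to latin1 (the usual default on MySQL 5.1 unless configured otherwise) and that the mbstring extension is available.

```php
<?php
// "hééélo" as UTF-8 bytes, written with escapes so this file's own
// encoding does not matter.
$utf8 = "h\xC3\xA9\xC3\xA9\xC3\xA9lo";

// Assumption: the server converts results to latin1 because nobody
// told it the client wants UTF-8.
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

echo bin2hex($utf8), "\n";   // 68c3a9c3a9c3a96c6f
echo bin2hex($latin1), "\n"; // 68e9e9e96c6f

// The page is declared as UTF-8, but a lone 0xE9 byte is not valid
// UTF-8, so the browser shows a replacement character for each one.
var_dump(mb_check_encoding($utf8, 'UTF-8'));   // bool(true)
var_dump(mb_check_encoding($latin1, 'UTF-8')); // bool(false)
```

After set names utf8 the server skips that conversion and sends the first byte sequence, which matches what the page declares.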
I also have to mention your SQL dump. It needs the same setting: just put SET NAMES somewhere at the top, because you are sending these queries from some client too, and that client's encoding needs to be set up as well.
And one more thing to mention: make sure your server sends the proper encoding in the Content-Type header. You didn't set that to UTF-8 either.
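For the Content-Type header, one option is to send it from PHP itself instead of relying on the server configuration. A minimal sketch, where the helper name content_type_header is hypothetical:

```php
<?php
// Hypothetical helper that builds the header line for a given charset.
function content_type_header(string $charset = 'utf-8'): string
{
    return 'Content-Type: text/html; charset=' . $charset;
}

// header() only works before any output has been sent, so this
// belongs at the very top of the script, above the DOCTYPE.
if (!headers_sent()) {
    header(content_type_header());
}
```

The meta tag in the page stays useful as a fallback, but browsers give the real HTTP header priority when both are present.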