
Converting latin1_swedish_ci to utf8 with PHP

I have a database filled with values like ♥•â—♥ Dhaka ♥•â—♥ (which should be ♥•●♥ Dhaka ♥•●♥) because I didn't specify the collation when creating the database.
Now I want to fix it. I cannot fetch the data again from its original source, so I was wondering whether it is possible to fetch the data in a PHP script and convert it to the correct characters.
I've already changed the collation of the database and the fields to utf8_general_ci.

Bibhas Debnath asked Jul 11 '11 07:07



2 Answers

The collation is NOT the same as the character set. The collation is only used for sorting and comparison of text (that's why there's a language term in there). The actual character set may be different.

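The most common failure is not in the database itself but in the connection between PHP and MySQL. The default charset for the connection is usually ISO-8859-1 (latin1 in MySQL's naming). Change it as the first thing you do after connecting, using either the SQL query SET NAMES 'utf8'; (MySQL spells the charset name without a hyphen) or the mysql_set_charset function (mysqli_set_charset in the mysqli extension, since the old mysql_* functions were removed in PHP 7).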
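A minimal sketch of setting the connection charset right after connecting, assuming the mysqli extension is used. The host, credentials, and database name are placeholders, not values from the question.

```php
<?php
// Make mysqli throw exceptions on errors instead of failing silently.
mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);

// Placeholder connection details (hypothetical; replace with your own).
$mysqli = new mysqli('localhost', 'db_user', 'db_password', 'my_database');

// Set the connection charset before running any other query.
// MySQL's name for the charset is 'utf8' (or 'utf8mb4'), without a hyphen.
$mysqli->set_charset('utf8');

// Print the charset the connection is actually using.
echo $mysqli->character_set_name(), "\n";
```

Calling set_charset() is generally preferable to sending SET NAMES as a query, because the client library then also knows the charset when escaping strings.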

Also check the character set of your tables and columns. It may be wrong as well if you did not specify UTF-8 to begin with (again: this is not the same as the collation). Make sure to take a backup before changing anything here. When you change a column's character set, MySQL converts the existing data from the old character set, so if you have actually saved UTF-8 data in ISO-8859-1 tables the text gets encoded a second time, and you may need to reload the data from the backup.
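The check described above could look roughly like the following sketch. The table name places and column name name are hypothetical, as are the connection details. The second query only previews a repair and assumes the column has already been converted to utf8 and now holds double-encoded text; it changes nothing, and rows that were never double-encoded may come back damaged or NULL, so inspect the output before turning it into an UPDATE.

```php
<?php
mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);

// Placeholder connection details (hypothetical; replace with your own).
$mysqli = new mysqli('localhost', 'db_user', 'db_password', 'my_database');
$mysqli->set_charset('utf8');

// 1. Show the character set and collation of each text column
//    in the hypothetical table `places`.
$columns = $mysqli->query(
    "SELECT COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
       FROM information_schema.COLUMNS
      WHERE TABLE_SCHEMA = DATABASE()
        AND TABLE_NAME = 'places'
        AND CHARACTER_SET_NAME IS NOT NULL"
);
while ($col = $columns->fetch_assoc()) {
    printf(
        "%s: %s / %s\n",
        $col['COLUMN_NAME'],
        $col['CHARACTER_SET_NAME'],
        $col['COLLATION_NAME']
    );
}

// 2. Preview a repair of double-encoded text: turn the characters back
//    into latin1 bytes, drop the charset label, then read the bytes as utf8.
$preview = $mysqli->query(
    "SELECT name,
            CONVERT(CAST(CONVERT(name USING latin1) AS BINARY) USING utf8) AS repaired
       FROM places
      LIMIT 20"
);
while ($row = $preview->fetch_assoc()) {
    printf("%s => %s\n", $row['name'], $row['repaired'] ?? 'NULL');
}
```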

Emil Vikström answered Sep 17 '22 20:09


I would look into mb_detect_encoding() and mb_convert_encoding() and see if they can help you.
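As a rough illustration of how mb_convert_encoding() can undo this kind of damage, here is a sketch that assumes the text was UTF-8 misread as Windows-1252 and then stored as UTF-8 again. The helper name fix_mojibake is made up for this example. Note that the damaged text is itself valid UTF-8, so encoding detection alone cannot flag it; the sketch instead reverses the conversion and keeps the result only if it round-trips cleanly.

```php
<?php
// Hypothetical helper: tries to undo one round of
// "UTF-8 bytes read as Windows-1252, then saved as UTF-8".
function fix_mojibake(string $text): string
{
    // Map each character back to the single byte it came from.
    $bytes = mb_convert_encoding($text, 'Windows-1252', 'UTF-8');

    // If a character had no Windows-1252 byte, the conversion was lossy
    // and the text was probably not damaged this way: leave it alone.
    if (mb_convert_encoding($bytes, 'UTF-8', 'Windows-1252') !== $text) {
        return $text;
    }

    // Accept the recovered bytes only if they form valid UTF-8.
    return mb_check_encoding($bytes, 'UTF-8') ? $bytes : $text;
}

// U+00E2 U+2122 U+00A5 is how a heart (U+2665) looks after the damage.
$broken = "\u{00E2}\u{2122}\u{00A5} Dhaka \u{00E2}\u{2122}\u{00A5}";
echo fix_mojibake($broken), "\n"; // prints "♥ Dhaka ♥"
```

Characters whose UTF-8 bytes include values that Windows-1252 leaves undefined (the ● in the question contains byte 0x8F, for example) may not survive this route, so check the results on a copy of the data first.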

AlienWebguy answered Sep 21 '22 20:09