
Text encoding in HTML text fields

Tags: html, php, encoding

I have a site with a form on it. The form POSTs to a PHP script, which then inserts the data into my database. The page declares charset=UTF-8 in its <meta> tag, and the database is set up to use UTF-8. However, when I copy and paste characters from MS Word into the field, the stored text comes out garbled.

For example, the quotes in

I am using "Microsoft Word" ''''

become

I am using “Microsoft Word†????

in the database.

Anyone have any idea why this might occur?

joe asked May 11 '26 13:05



1 Answer

Here's what I propose you do to find where the problem lies.

  1. Before version 8.0, MySQL uses the latin1 charset by default to store and transfer data. To change that, create your database with charset utf8 and collation utf8_unicode_ci (see http://dev.mysql.com/doc/refman/5.0/en/create-database.html).

    CREATE DATABASE example DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_unicode_ci;

  2. Tell MySQL to handle incoming and outgoing data as UTF8. Before any other SQL queries are sent to MySQL, the command SET NAMES UTF8; must be issued. This tells the server to treat everything sent and received on that connection as UTF8. It needs to be set only once per connection. You can set it with mysql_query("SET NAMES 'UTF8'"); for example (the mysql_* functions were removed in PHP 7; with mysqli, $mysqli->set_charset('utf8') is the equivalent).

  3. Make sure you're actually using UTF8. Although you might have specified UTF8 in the <meta> tag, you might actually be sending the content in another charset. To make sure you're sending UTF8-encoded content, add header('Content-Type: text/html; charset=utf-8'); to your PHP file.
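
To see why the quotes end up as "â€œ", it can help to reproduce the symptom in isolation. The sketch below is only an illustration, not a drop-in fix: it assumes the mbstring extension and PHP 7 or later, the function names misreadAsLatin1 and openUtf8Connection are made up for this example, and any host, credentials, or database name you pass in are your own. The first function reads UTF-8 bytes as if each byte were a Windows-1252 character (which is what MySQL calls latin1); the second applies steps 2 and 3 using mysqli.

```php
<?php
// Sketch only: function names are hypothetical, and the connection
// parameters are whatever your own setup uses.

/**
 * Reproduce the symptom: treat each byte of a UTF-8 string as a
 * Windows-1252 character (MySQL's "latin1"), then re-encode as UTF-8.
 */
function misreadAsLatin1(string $utf8): string
{
    return mb_convert_encoding($utf8, 'UTF-8', 'Windows-1252');
}

/**
 * Steps 2 and 3: declare UTF-8 in the HTTP header and on the connection.
 * Must be called before any output is sent, because it sets a header.
 */
function openUtf8Connection(string $host, string $user, string $pass, string $db): mysqli
{
    header('Content-Type: text/html; charset=utf-8');
    $mysqli = new mysqli($host, $user, $pass, $db);
    // Sets the connection charset, like SET NAMES utf8, and also
    // informs the client library so escaping uses the same charset.
    $mysqli->set_charset('utf8');
    return $mysqli;
}

// Word's left double quote (U+201C) is the three bytes E2 80 9C in UTF-8.
echo misreadAsLatin1("\u{201C}"), "\n"; // prints â€œ
```

If the text in your database looks like that output, the bytes were most likely sent as UTF-8 but handled as latin1 somewhere between the form and the table, which is what steps 1 to 3 are meant to rule out.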

Erik Töyrä Silfverswärd answered May 14 '26 02:05



