Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQL European Characters

I can't figure this out for the life of me.

I have a query that pulls translations of elements on a page. So any number of 15 languages can appear on that page. When I start to add languages like Swedish anything that has a symbol such as ö results in the whole field returning a null string.

I've verified the encoding on the table and it claims it's using UTF-8 but seeing as how this doesn't work I'm confused.

Here is the query I'm working with:

SELECT
   form.form_id,
   elem.elem_type,
   elem.elem_name,
   elem.elem_format,
   elem.elem_required,
   trans.trans_label,`
   trans.trans_description,
   trans.trans_defaultValue,
   trans.trans_other,
   elem.elem_advancedcommand
FROM
   events_form form
LEFT JOIN
   events_form_elements elem
ON
   form.event_id = elem.event_id
INNER JOIN
   events_form_translations trans
ON
   elem.elem_id = trans.elem_id
INNER JOIN
   events_form_languages lang
ON
   trans.lang_id = lang.lang_id
WHERE 
   form.form_id = '{$formid}' AND lang.language = '{$language}'
ORDER BY
   elem.elem_sortorder

Now I tried to do something like:

CONVERT(CAST(trans.trans_description as BINARY) USING latin1) as trans_description,

To force it to covert the encoding but that doesn't yield a result at all.

After I get the result it's immediately json_encoded and returned to the user (Ajax Request). I DON'T think it's the json_encode as doing a print_r of the output array yields the same issues.

Also.. lastly, the system I'm building on is using xPDO so I'm not too sure if that's the issue either.

EDIT: It seems that PHP IS returning a correct value or at least a value for example here is a print_r dump:

[trans_label] => Ditt f�rnamn?
[trans_description] => 
[trans_defaultValue] => First Name

So it seems that when my json_encode touches that string is when it turns the string to null.

like image 636
Peter Avatar asked Jul 13 '12 22:07

Peter


People also ask

What is the difference between UTF-8 and Latin1?

what is the difference between utf8 and latin1? They are different encodings (with some characters mapped to common byte sequences, e.g. the ASCII characters and many accented letters). UTF-8 is one encoding of Unicode with all its codepoints; Latin1 encodes less than 256 characters.

What is the difference between utf8mb4 and UTF-8 charset in MySQL?

utf-8 can store only 1, 2 or 3 bytes characters, while utf8mb4 can store 4 bytes characters as well. utf-8 is a subset of characters given by utf8mb4 .

What is UTF-8 in MySQL?

MySQL supports multiple Unicode character sets: utf8mb4 : A UTF-8 encoding of the Unicode character set using one to four bytes per character. utf8mb3 : A UTF-8 encoding of the Unicode character set using one to three bytes per character. This character set is deprecated in MySQL 8.0, and you should use utfmb4 instead.

What are MySQL characters?

The CHAR and VARCHAR types are declared with a length that indicates the maximum number of characters you want to store. For example, CHAR(30) can hold up to 30 characters. The length of a CHAR column is fixed to the length that you declare when you create the table. The length can be any value from 0 to 255.


2 Answers

Your PDO connection string should specify the encoding. For example:

mysql:host=localhost;port=3306;dbname=test;charset=utf8

This controls the encoding that the database driver will use when it returns a result, and the encoding the driver assumes your queries are in. If you don't specify it, the default encoding will be used. Often the default is latin1.

You can confirm this by printing the hexadecimal representation of the data with bin2hex in PHP: the ö in förnamn is being returned as f6. If the text was encoded in UTF-8 you would obtain c3b6.

like image 177
Joni Avatar answered Oct 05 '22 20:10

Joni


You said nothing about the encoding of your web pages.
Do you have that line in the <head> section of your page to force the encoding to UTF-8?

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
like image 27
Jocelyn Avatar answered Oct 05 '22 21:10

Jocelyn