I've just created my first ajax function with jQuery which actually works, but unfortunately the character encoding (for characters like ä, ö, ü, ß, č, ć, å, ø) is a nightmare.
My files and my database are all UTF-8. I've tried a multitude of options in the ajax function and the PHP function, none of which were satisfactory.
This is my ajax
    var dataString = {
        'name': name,
        'mail': mail
        // other stuff
    };
    $.ajax({
        type: "POST",
        url: "/post.php",
        data: dataString,
        contentType: "application/x-www-form-urlencoded;charset=UTF-8",
        cache: false,
        success: function(html){
            // do stuff
        }
    });
I've tried it without contentType: "application/x-www-form-urlencoded;charset=UTF-8" and I've tried to wrap the affected data in encodeURIComponent(), neither of which worked.
When I use that AJAX with htmlentities() in my php, my umlauts look like this in plain text: UE �, AE �, OE �, ue ü, ae ä, oe o
And like this in the database: UE Ü , AE Ä, OE Ö, ue ü, ae ä, oe o
If I don't use htmlentities() but mysql_real_escape_string() instead (or neither), they look good in plain text, but they look like this in the database: AE Ä, OE Ö, UE Ü, ae ä oe ö ue ü
I've been trying tons of options for hours now, but I can't find a solution that works. So far the only option I seem to have is having them look like a total mess in the database, but that would be very counterproductive if those data sets need to be edited.
I've tried to wrap the affected data in encodeURIComponent()
Nah, if you're passing in a {} object, jQuery will take care of UTF-8 and URL-encoding it for you.
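To see why the extra encodeURIComponent() call hurts rather than helps, here is a small sketch. It uses the standard URLSearchParams class as a stand-in for jQuery's own serializer (an assumption for illustration only; jQuery does this step internally), and the sample name is made up.

```javascript
// Stand-in for what happens to a plain object passed as `data`:
// each value is converted to UTF-8 and percent-encoded exactly once.
function serialize(fields) {
    return new URLSearchParams(fields).toString();
}

var sampleName = "Jörg"; // hypothetical sample value

// Encoded once: "ö" becomes the two UTF-8 bytes C3 B6.
var once = serialize({ name: sampleName });

// Wrapping the value yourself first means the percent signs get encoded again.
var twice = serialize({ name: encodeURIComponent(sampleName) });

console.log(once);  // name=J%C3%B6rg
console.log(twice); // name=J%25C3%25B6rg
```

The second form arrives at the server as the literal text J%C3%B6rg rather than Jörg, so leave the encoding to jQuery.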
When I use that AJAX with htmlentities() in my php, my umlauts look like this in plain text: UE �, AE �, OE �, ue ü, ae ä, oe o
If you must use htmlentities(), you have to tell it your encoding is UTF-8 in the optional $charset argument, else (on PHP versions before 5.4, where the default was changed to UTF-8) it will (stupidly) default to treating all your bytes as ISO-8859-1, and encode them to inappropriate entity references, one for each byte.
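The byte-by-byte misreading can be reproduced outside PHP. The sketch below is only a JavaScript illustration of the mechanism, not PHP's actual implementation. Both helper names are made up, and it emits numeric character references where htmlentities() would emit named ones.

```javascript
// Encode a string as UTF-8, then misread every byte as one ISO-8859-1
// character. This is the mistake described above.
function misreadUtf8AsLatin1(text) {
    var bytes = new TextEncoder().encode(text);
    return Array.from(bytes, function (b) {
        return String.fromCharCode(b);
    }).join("");
}

// Turn every non-ASCII character into a numeric character reference.
function toNumericEntities(text) {
    return text.replace(/[^\x00-\x7F]/g, function (ch) {
        return "&#" + ch.charCodeAt(0) + ";";
    });
}

var garbled = misreadUtf8AsLatin1("ü");
console.log(garbled.length);             // 2
console.log(toNumericEntities(garbled)); // &#195;&#188;
```

One character goes in and two entity references come out (Ã and ¼), which matches the "ü" seen in the plain-text output.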
Better is to use htmlspecialchars() instead, as it does not attempt to apply unnecessary encoding to characters other than the few ASCII characters that really need it.
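As a rough JavaScript analogue of that approach (a sketch, not a port of PHP's function; the name escapeHtml is made up, and which quote characters get escaped is a choice made here), only the markup-significant ASCII characters are touched and everything else passes through unchanged:

```javascript
// Escape only the ASCII characters that have special meaning in HTML.
// Non-ASCII characters such as ö or Å are left exactly as they are.
var HTML_ESCAPES = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#039;"
};

function escapeHtml(text) {
    return String(text).replace(/[&<>"']/g, function (ch) {
        return HTML_ESCAPES[ch];
    });
}

console.log(escapeHtml('<b>Jörg & "Åse"</b>'));
// &lt;b&gt;Jörg &amp; &quot;Åse&quot;&lt;/b&gt;
```

Because the umlauts are never converted, there is no encoding guess to get wrong, as long as the page itself is served as UTF-8.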
And like this in the database: UE Ü , AE Ä, OE Ö, ue ü, ae ä, oe o
How are you determining that? Does the tool you are using to grab data out of the database know about Unicode? (If it's a dodgy PHP web admin interface, maybe not. PHP isn't great at Unicode.)
It is possible that you're storing proper UTF-8 bytes in the database, but in tables marked as having a Latin-1 collation. This will work, in as much as you'll get the same bytes out as you put in, but if MySQL doesn't know they're UTF-8 bytes then case-insensitive string comparisons outside the ASCII range won't work right, so looking for Ä won't match ä. That may or may not matter to you.
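Here is a loose JavaScript approximation of why that comparison fails. It only mimics the idea of case-folding UTF-8 bytes as if they were Latin-1 characters. It is not how MySQL collations are actually implemented, and the helper name is made up.

```javascript
// View the UTF-8 bytes of a string as if each byte were one Latin-1 character.
function misreadUtf8AsLatin1(text) {
    var bytes = new TextEncoder().encode(text);
    return Array.from(bytes, function (b) {
        return String.fromCharCode(b);
    }).join("");
}

// Unicode-aware case folding: Ä and ä are the same letter.
var unicodeMatch = "Ä".toLowerCase() === "ä";

// Case folding applied to the misread bytes: the second byte of each
// sequence is not a letter in Latin-1, so the two never become equal.
var latin1Match =
    misreadUtf8AsLatin1("Ä").toLowerCase() ===
    misreadUtf8AsLatin1("ä").toLowerCase();

console.log(unicodeMatch); // true
console.log(latin1Match);  // false
```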
If I don't use htmlentities() but mysql_real_escape_string() instead
Whoah, careful. HTML-escaping is for the output stage to the page. SQL-string-literal-escaping occurs when creating an SQL query. You need them both, but don't mix them up or attempt to do them at the same stage, or you'll have all sorts of weird escapes-gone-wrong and potential vulnerabilities.
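The separation of the two stages can be sketched like this, in JavaScript purely for illustration. Everything here is hypothetical: the table name, the helper names, and the placeholder-style query object, which stands in for whatever your database layer provides.

```javascript
// Storage stage: keep the raw text and let the database layer handle
// SQL quoting. No HTML escaping happens here.
function buildInsert(name) {
    return {
        sql: "INSERT INTO users (name) VALUES (?)", // hypothetical table
        params: [name]
    };
}

// Output stage: HTML-escape only at the moment the value goes into a page.
function escapeHtml(text) {
    return String(text)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;");
}

function renderGreeting(name) {
    return "<p>Hello, " + escapeHtml(name) + "</p>";
}

var query = buildInsert("Jörg <x>");
console.log(query.params[0]);            // Jörg <x>
console.log(renderGreeting("Jörg <x>")); // <p>Hello, Jörg &lt;x&gt;</p>
```

The database ends up holding the original characters, so records stay editable, and the escaping lives only where the HTML is produced.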