I have an MS Access database that I query from PHP through PDO and the ODBC driver. The database contains French, Danish and Polish words. French and Danish display correctly, but the Polish characters do not: I only get "?" in their place.
Here is the code:
try {
    $db = new PDO("odbc:DRIVER={Microsoft Access Driver (*.mdb, *.accdb)}; DBQ=$dbName; Uid=Admin;Pwd=;");
}
catch (PDOException $e) {
    echo $e->getMessage();
}
$answer = $db->query("SELECT * FROM dict_main WHERE ID < 20");
while ($data = $answer->fetch()) {
    echo iconv("iso-8859-1", "utf-8", htmlspecialchars($data['DK'])) . ' ';
    echo iconv("iso-8859-2", "utf-8", htmlspecialchars($data['PL'])) . ' ';
    echo iconv("iso-8859-1", "utf-8", htmlspecialchars($data['FR'])) . ' ';
}
Please let me know if somebody has an idea, as I am running out of them and nothing seems to work, or if I should give more information about my problem that I didn't think of.
byte[] utf8 = ...
byte[] latin1 = new String(utf8, "UTF-8").getBytes("ISO-8859-1");
You can exercise more control by using the lower-level Charset APIs. For example, you can raise an exception when an un-encodable character is found, or use a different character for replacement text.
UTF-8 is a multibyte encoding that can represent any Unicode character. ISO 8859-1 is a single-byte encoding that can represent the first 256 Unicode characters. Both encode ASCII exactly the same way.
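As a quick illustration of that difference, here is a minimal sketch. It assumes the iconv extension is available; the variable names are made up for the example.

```php
<?php
// "é" (U+00E9) stored as a single ISO 8859-1 byte.
$latin1 = "\xE9";
// The same character converted to UTF-8 becomes the two bytes C3 A9.
$utf8 = iconv("ISO-8859-1", "UTF-8", $latin1);

echo strlen($latin1), "\n";  // byte length in ISO 8859-1
echo strlen($utf8), "\n";    // byte length in UTF-8
echo bin2hex($utf8), "\n";   // hex dump of the UTF-8 bytes

// Plain ASCII is byte-for-byte identical in both encodings.
$ascii = "hello";
var_dump(iconv("ISO-8859-1", "UTF-8", $ascii) === $ascii);
```

Note that strlen() counts bytes, not characters, which is why it is useful here.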
If you find a byte with its high-order bit set, where the bytes both immediately before and immediately after it don't have their high-order bit set, you know it's ISO encoded (because bytes >127 always occur in sequences in UTF-8).
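The heuristic can be sketched in PHP roughly as follows. The function name hasLoneHighByte is hypothetical, and treating the start or end of the string as a "neighbour without the high bit" is an assumption added for this example.

```php
<?php
// Hypothetical helper: returns true when $bytes contains a byte >= 0x80
// whose neighbours on both sides are ASCII (or the string boundary, which
// is an assumption of this sketch). Such a lone high byte cannot be part
// of a valid UTF-8 sequence.
function hasLoneHighByte($bytes)
{
    // No /u modifier: the pattern must work on raw bytes.
    $pattern = '/(?:^|[\x00-\x7F])[\x80-\xFF](?:[\x00-\x7F]|$)/';
    return preg_match($pattern, $bytes) === 1;
}

var_dump(hasLoneHighByte("caf\xE9 noir"));     // ISO 8859-1 bytes for "café noir"
var_dump(hasLoneHighByte("caf\xC3\xA9 noir")); // UTF-8 bytes for "café noir"
```

This only tells you that a string is not UTF-8. It cannot tell ISO 8859-1 apart from ISO 8859-2 or CP1252, since all of them are single-byte encodings.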
ISO 8859-1 is the ISO standard Latin-1 character set and encoding format. CP1252 is Microsoft's superset of it: it assigns about 27 additional printable characters to the byte range 0x80-0x9F, which ISO 8859-1 leaves to control codes.
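A small sketch of where the two differ, assuming the iconv extension and an iconv implementation that knows the name "CP1252" (glibc does):

```php
<?php
// Byte 0x80 is the euro sign in CP1252 but a C1 control code in ISO 8859-1.
$byte = "\x80";

$fromCp1252 = iconv("CP1252", "UTF-8", $byte);     // U+20AC as UTF-8
$fromLatin1 = iconv("ISO-8859-1", "UTF-8", $byte); // U+0080 as UTF-8

echo bin2hex($fromCp1252), "\n";
echo bin2hex($fromLatin1), "\n";
```

So data that really is CP1252 should be converted with "CP1252" as the source name, otherwise characters in that range end up as invisible control codes.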
It looks like htmlspecialchars() does not support ISO-8859-2. So it probably breaks the contents of $data['PL'] before it gets to iconv().
Try first converting the input string into UTF-8, then apply htmlspecialchars() to the UTF-8 string:
echo htmlspecialchars( iconv("iso-8859-2", "utf-8", $data['PL']) );
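Applied to all three columns, the convert-first order could look like the sketch below. The helper name toHtml is hypothetical, and the $data array is a hand-written stand-in for a fetched row: it assumes the driver really delivers ISO 8859-1 and ISO 8859-2 bytes, which is not confirmed for this setup.

```php
<?php
// Hypothetical helper: convert a value to UTF-8 first, then escape it
// for HTML while telling htmlspecialchars() the string is UTF-8.
function toHtml($value, $sourceEncoding)
{
    $utf8 = iconv($sourceEncoding, "UTF-8", $value);
    return htmlspecialchars($utf8, ENT_QUOTES, "UTF-8");
}

// Stand-in for one fetched row (assumed bytes, not real query output).
$data = array(
    'DK' => "r\xF8d",          // "rød" in ISO 8859-1
    'PL' => "\xBF\xF3\xB3ty",  // "żółty" in ISO 8859-2
    'FR' => "<caf\xE9>",       // "<café>" in ISO 8859-1
);

echo toHtml($data['DK'], "ISO-8859-1") . ' ';
echo toHtml($data['PL'], "ISO-8859-2") . ' ';
echo toHtml($data['FR'], "ISO-8859-1") . ' ';
```

Passing "UTF-8" explicitly to htmlspecialchars() matters on PHP 5.3, where the default charset for that function is ISO-8859-1.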
You are using PHP 5.3.13, so I would expect the charset option in new PDO to do its job. (Prior to 5.3.6 you would have to use $db->exec("set names utf8"); instead.) So add charset=utf8; to your connect line. I also expect your Access database to be UTF-8.
You can also try charset=ucs2; with and without htmlspecialchars( iconv("iso-8859-2", "utf-8", $data['PL']) );
$db = new PDO("odbc:DRIVER={Microsoft Access Driver (*.mdb, *.accdb)}; DBQ=$dbName; Uid=Admin;Pwd=;charset=utf8;");
or
$db = new PDO("odbc:DRIVER={Microsoft Access Driver (*.mdb, *.accdb)}; DBQ=$dbName; Uid=Admin;Pwd=;charset=ucs2;");
By the way, don't forget to declare your output as UTF-8 at the top of your document:
<?php header('Content-Type:text/html; charset=UTF-8'); ?>
and/or
<meta http-equiv='Content-Type' content='text/html; charset=utf-8'>
If that still doesn't work, I suspect that the encoding in your Access database is messed up.
The only other thing I can think of at this point is using odbc_connect directly and bypassing PDO, but I think the problem is in ODBC itself (Access->ODBC). If that's the case, this won't help:
$conn=odbc_connect("DRIVER={Microsoft Access Driver (*.mdb, *.accdb)}; DBQ=$dbName; Uid=Admin;Pwd=;charset=utf8", "", "");
$rs=odbc_exec($conn, "SELECT * FROM dict_main WHERE ID < 20");
odbc_result_all($rs,"border=1");
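To check whether the characters are already lost before PHP sees them, it can help to look at the raw bytes of a fetched value. The sketch below uses a hypothetical helper, dumpBytes, and two hand-written sample strings standing in for $data['PL']. If the Polish letters show up as 3f (a literal "?"), the replacement happened on the driver side and no iconv() call in PHP can restore them.

```php
<?php
// Hypothetical helper: show a label next to the raw bytes of a value in hex.
function dumpBytes($label, $value)
{
    return $label . ': ' . bin2hex($value);
}

// Assumed sample values, not real query output.
$replaced = "?\xF3?ty";         // "ż" and "ł" already replaced by "?" (0x3f)
$intact   = "\xBF\xF3\xB3ty";   // "żółty" as intact ISO 8859-2 bytes

echo dumpBytes('replaced', $replaced), "\n";
echo dumpBytes('intact', $intact), "\n";
```

In the real script you would call dumpBytes('PL', $data['PL']) inside the fetch loop.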