
Why do I get invalid characters when converting MS SQL data to MySQL?

I'm writing a PHP script to import data into a MySQL database from a Microsoft SQL Server 2008 database.

The MSSQL server uses the collation "SQL_Latin1_General_CP1_CI_AS", and the data in question is stored in a column of type "nchar".

My PHP web pages use

<meta http-equiv="content-type" content="text/html; charset=utf-8">

to indicate that they should be displayed with UTF-8 Character encoding.

I'm pulling the data from the MSSQL database using the sqlsrv PHP extension.

$sql = 'SELECT * FROM [tArticle] WHERE [ID] = 6429';
$stmt = sqlsrv_query($dbHandler, $sql);

while ($row = sqlsrv_fetch_object($stmt)) {
  // examples of what I've tried simply to display the data
  echo $row->Text1;
  echo utf8_encode($row->Text1);
  echo iconv("ISO-8859-1", "UTF-8", $row->Text1);
  echo iconv("ISO-8859-1", "UTF-8//TRANSLIT", $row->Text1);
}

Set aside inserting the data into the MySQL database for now; I can't even get the string to display properly on my PHP page. From the examples in my listing:

echo $row->Text1;

is rendered by my browser with an obviously invalid character: "Lucy�s"

All of the examples after that one are rendered with the character missing entirely: "Lucys"

It looks like a character set mismatch to me, but how can I get this data from the MS SQL database to display properly (without changing my web page's encoding)? If I can figure that out, I can probably work out how to store it in the MySQL database.

rushinge asked Jan 11 '11 22:01



2 Answers

If the strings in the source database are encoded in UTF-8, you should use utf8_decode, not utf8_encode.

But they're probably encoded in some Latin or "Western" Windows code page, so I would try, for example, iconv("CP1252", "UTF-8", $row->Text1).
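
A possible explanation for the blank results, sketched under the assumption that the apostrophe in the source row is the Windows-1252 right single quotation mark (byte 0x92). The sample string below is made up to mirror "Lucy’s" and is not read from the database. ISO-8859-1 maps 0x92 to an invisible control character, whereas CP1252 maps it to U+2019:

```php
<?php
// Assumed sample: "Lucy’s" as Windows-1252 bytes, where 0x92 is the curly apostrophe.
$raw = "Lucy\x92s";

// Interpreting the bytes as ISO-8859-1 turns 0x92 into U+0092,
// a C1 control character that browsers do not draw.
$asLatin1 = iconv("ISO-8859-1", "UTF-8", $raw);

// Interpreting the bytes as CP1252 turns 0x92 into U+2019 (’).
$asCp1252 = iconv("CP1252", "UTF-8", $raw);

echo bin2hex($asLatin1), "\n"; // 4c756379c29273
echo bin2hex($asCp1252), "\n"; // 4c756379e2809973
```

If the second hex string matches what the real column produces, the CP1252 conversion is likely the right one for this data.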

Another alternative is to run a SQL query that explicitly sets a known encoding. For example, according to the Windows Collation Name (Transact-SQL) documentation, this query would use code page 1252 to encode field Text1: SELECT Text1 COLLATE SQL_Latin1_General_CP1_CI_AS FROM ....

scoffey answered Sep 30 '22 13:09


Try this connection option; it works for me:

$connectionInfo = array( "Database"=>"DBName", "CharacterSet" =>"UTF-8"); 
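
For context, a minimal sketch of how that option could be passed to sqlsrv_connect. The server name "localhost" and the helper utf8ConnectionInfo are hypothetical, and with no UID/PWD given the driver attempts Windows authentication:

```php
<?php
// Hypothetical helper: builds sqlsrv connection options that ask the driver
// to return character data as UTF-8.
function utf8ConnectionInfo(string $database): array
{
    return [
        "Database"     => $database,
        "CharacterSet" => "UTF-8",
    ];
}

$serverName = "localhost"; // assumption: replace with the real server name
$connectionInfo = utf8ConnectionInfo("DBName");

// Only attempt the connection when the sqlsrv extension is loaded.
if (function_exists("sqlsrv_connect")) {
    $dbHandler = sqlsrv_connect($serverName, $connectionInfo);
    if ($dbHandler === false) {
        // sqlsrv_errors() returns the driver's error details.
        error_log(print_r(sqlsrv_errors(), true));
    }
}
```

With this option set, strings fetched through the connection should already be UTF-8, so no utf8_encode or iconv call would be needed before echoing them.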
Shahram Foroud answered Sep 30 '22 12:09