I've just got a db in postgreSQL for my project and just realized it's in SQL_ASCII encoding, which means "no encoding" I think.
So what is the simplest way to convert this to utf8? And I know the db should be in latin1, does the conversion will damage the content?
Thanks!
The character set support in PostgreSQL allows you to store text in a variety of character sets (also called encodings), including single-byte character sets such as the ISO 8859 series and multiple-byte character sets such as EUC (Extended Unix Code), UTF-8, and Mule internal code.
ScaleGrid PostgreSQL deployments use UTF-8 as the default encoding on both client and server side. The template1 database is UTF-8 encoded, and uses the en_US. UTF-8 locale. By default any databases you create will also inherit this encoding.
One of the interesting features of PostgreSQL database is the ability to handle Unicode characters. In SQL Server, to store non-English characters, we need to use NVARCHAR or NCAHR data type. In PostgreSQL, the varchar data type itself will store both English and non-English characters.
Converting to UTF8 should not damage your data as (I believe) all characters in SQL_ASCII also exist in utf8; they just have different byte codes.
Your best bet is to re-build your database. That is dump it, create a utf8 database then restore the dump to that new database.
postgres pg_dump --encoding utf8 main -f main.sql
createdb -E utf8 newMain
psql -f main.sql -d newMain
You can then of course rename the databases once you are happy that the new UTF8 one matches your data.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With