
Notepad++ shows UCS-2LE while Ubuntu's file [file] shows UTF-16LE. Why the difference?

I am trying to convert a file generated from MSSQL to UTF-8. When I open the MSSQL output in Notepad++ on Windows Server 2003, it recognises the file as UCS-2LE. I copied the file to an Ubuntu machine, where file [file] reports the encoding as UTF-16LE. I am really confused: since the names differ, there must be some difference between the encodings, so why do I see both for the same file? It's a .csv file generated from an MSSQL query.

tough asked Jul 31 '12 08:07




1 Answer

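For a file like yours, UTF-16 and UCS-2 are effectively the same thing. UCS-2 is the older, fixed-width encoding that covers only code points up to U+FFFF (the Basic Multilingual Plane); UTF-16 extends it with surrogate pairs for characters beyond that range. A file that contains only characters up to U+FFFF has exactly the same bytes in both encodings, which is why two tools can give the same file different names.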

Both names mean that each character is stored as a two-byte (16-bit) unit; in UTF-16, characters above U+FFFF take two such units. "LE" stands for little endian, i.e. each two-byte unit is stored with the low byte first.
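To make the byte layout concrete, here is a small Python sketch (Python 3.8+ is assumed for the `hex(" ")` separator; the sample strings are just illustrations, not taken from your file). It shows the low-byte-first order of UTF-16LE and the one case where UTF-16 goes beyond what UCS-2 can represent:

```python
# Two characters at or below U+FFFF: "A" is U+0041, "ñ" is U+00F1.
text = "Añ"

le = text.encode("utf-16-le")  # little endian: low byte first
be = text.encode("utf-16-be")  # big endian: high byte first

print(le.hex(" "))  # 41 00 f1 00
print(be.hex(" "))  # 00 41 00 f1

# A character above U+FFFF cannot be stored in UCS-2 at all.
# UTF-16 stores it as a surrogate pair, i.e. two 16-bit units.
emoji = "\U0001F600"
emoji_le = emoji.encode("utf-16-le")
print(len(emoji_le))  # 4 bytes instead of 2
```

As long as a file contains nothing like the last example, reading it as UCS-2LE or as UTF-16LE yields the same text.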

If you want to convert to UTF-8, in Notepad++ click Convert to UTF-8 in the Encoding menu, then save.

If your other programs choke on the file after doing this, or you see a few garbage characters at the start of the file (typically ï»¿, the three-byte UTF-8 BOM shown as Latin-1), then click Convert to UTF-8 without BOM instead.
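If you would rather do the conversion on the Ubuntu side, the same steps can be sketched in Python. This is only a minimal sketch under a few assumptions: the function name `utf16le_to_utf8` and its `with_bom` parameter are made up for illustration, the input is assumed to be UTF-16LE (with or without a leading BOM), and the whole file is read into memory, which is fine for a modest CSV export.

```python
import codecs


def utf16le_to_utf8(src_path, dst_path, with_bom=False):
    """Re-encode a UTF-16LE file as UTF-8 (hypothetical helper)."""
    with open(src_path, "rb") as src:
        raw = src.read()

    # Drop the UTF-16LE byte order mark (FF FE) if the file starts with one.
    if raw.startswith(codecs.BOM_UTF16_LE):
        raw = raw[len(codecs.BOM_UTF16_LE):]

    text = raw.decode("utf-16-le")

    # "utf-8-sig" writes a UTF-8 BOM first; plain "utf-8" does not.
    out_encoding = "utf-8-sig" if with_bom else "utf-8"

    # newline="" keeps the original line endings (e.g. CRLF) untouched.
    with open(dst_path, "w", encoding=out_encoding, newline="") as dst:
        dst.write(text)
```

Calling `utf16le_to_utf8("export.csv", "export-utf8.csv")` (file names are placeholders) corresponds to the "without BOM" menu option; passing `with_bom=True` corresponds to the variant that writes a BOM.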

BenW answered Dec 27 '22 11:12