
Can Notepad++ recognize encoding?

I created a file with UTF-8 encoded content (using PHP fputcsv).

When I open this file in Notepad++, the characters are wrong (Notepad++ starts with ANSI encoding).

When I select Format -> "Encode in UTF-8" from the menu, everything is fine.

I'm worried: if Notepad++ can recognize the encoding somehow, maybe something is wrong with the file I created with fputcsv? The first byte or something?

asked Jan 09 '13 21:01 by Kamil



2 Answers

Automatically detecting an encoding is not something that can be done accurately. It's pretty much essential that the encoding be specified explicitly. It can be guessed in some cases, but even then not with 100% certainty.
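To illustrate why a guess can never be certain, here is a minimal sketch in Python (chosen only for illustration; the question itself uses PHP). It assumes that "ANSI" means Windows-1252, which is the usual case on Western European Windows installs, and the helper name `decode_candidates` is made up for this example. The same bytes decode without error under both encodings, so an editor has nothing but heuristics to choose between them.

```python
def decode_candidates(raw: bytes, encodings=("utf-8", "cp1252")) -> dict:
    """Try each encoding and record the decoded text, or None if decoding fails."""
    results = {}
    for name in encodings:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


# "café" saved as UTF-8 is the byte sequence 63 61 66 C3 A9.
raw = "café".encode("utf-8")
candidates = decode_candidates(raw)

# Both decodes succeed; only one of them is what the author meant.
for name, text in candidates.items():
    print(name, repr(text))
```

The Windows-1252 reading turns the two bytes of "é" into two separate characters, which is the kind of wrong output described in the question.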

The Notepad++ documentation (Encoding) explains the situation in relation to Notepad++. It also points out that detection is especially difficult if the file has not been saved with a Byte Order Mark (BOM).

Given that your file displays correctly once you manually set the encoding, I would say there's nothing wrong with how you are generating and saving the file. The only thing you can check for is whether a BOM is being saved, which might improve the chances of Notepad++ being able to automatically detect the encoding.
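As a rough sketch of that check, the Python snippet below (again only an illustration, not the asker's PHP code) looks at the first three bytes of a file for the UTF-8 BOM, EF BB BF. The helper names `has_utf8_bom` and `write_csv` and the file names are hypothetical. It writes the same rows twice, once with plain `utf-8` and once with Python's `utf-8-sig` codec, which prepends the BOM.

```python
import codecs
import csv
import os
import tempfile


def has_utf8_bom(path: str) -> bool:
    """Return True if the file starts with the UTF-8 BOM bytes EF BB BF."""
    with open(path, "rb") as f:
        return f.read(3) == codecs.BOM_UTF8


def write_csv(path: str, rows, with_bom: bool) -> None:
    """Write rows as UTF-8 CSV; the 'utf-8-sig' codec adds a BOM at the start."""
    encoding = "utf-8-sig" if with_bom else "utf-8"
    with open(path, "w", newline="", encoding=encoding) as f:
        csv.writer(f).writerows(rows)


tmp_dir = tempfile.mkdtemp()
plain_path = os.path.join(tmp_dir, "plain.csv")  # hypothetical file names
bom_path = os.path.join(tmp_dir, "bom.csv")
rows = [["name", "city"], ["Kamil", "Łódź"]]

write_csv(plain_path, rows, with_bom=False)
write_csv(bom_path, rows, with_bom=True)

print("plain.csv has BOM:", has_utf8_bom(plain_path))
print("bom.csv has BOM:", has_utf8_bom(bom_path))
```

In PHP the same idea would presumably be to write the three bytes "\xEF\xBB\xBF" to the file handle before the first fputcsv call. That is an assumption about how the asker's script is structured, not something shown in the question.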

It's worth noting that although a BOM may help editors like Notepad++ identify the encoding more accurately, The Unicode Standard neither requires nor recommends a BOM for UTF-8.

answered Oct 22 '22 03:10 by Chamila Chulatunga


Check the lower right corner of the Notepad++ window to see the encoding that is actually being used. The problem is not specific to Notepad++: guessing the right encoding is a hard problem without any real solution, so it's better to let the user decide which encoding is most appropriate in each case.

answered Oct 22 '22 03:10 by user1824407