Is there a way to check with Java whether a text file (.txt) is encoded as Unicode or UTF-8?
Open the file with Notepad++ and you will see the name of the encoding in the bottom-right corner. From the Encoding menu you can change the encoding and save the file.
To verify whether a file is valid in a given encoding such as ASCII, ISO-8859-1, or UTF-8, a good solution is to use the 'iconv' command.
Encoding. The ASCII character set is the most common compatible subset of character sets for English-language text files, and is generally assumed to be the default file format in many situations.
You cannot know with absolute certainty which charset is used in the general case. I found this to be a good read.
http://illegalargumentexception.blogspot.co.uk/2009/05/java-rough-guide-to-character-encoding.html
Especially the section Automatic detection of encoding.
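One way to make such a guess in Java is to decode the bytes strictly as UTF-8 and see whether the decoder complains. This is only a rough sketch: the class and method names (`Utf8Check`, `isValidUtf8`) are made up for illustration, and a successful decode shows only that the bytes are well-formed UTF-8, not that the author actually used UTF-8 (pure ASCII text, for example, always passes).

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any library.
final class Utf8Check {
    private Utf8Check() {}

    // Returns true if the bytes form a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        // By default some decoding paths silently replace bad input;
        // REPORT makes the decoder throw instead.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

For a file, the bytes could be read with `Files.readAllBytes(Paths.get("some.txt"))` (the file name is a placeholder) and passed to `Utf8Check.isValidUtf8`. For large files, reading everything into memory may not be what you want.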
Uhm, theoretically, how would you know if it is Unicode?
This is the real question. Truthfully, you cannot know, but you can make a decent guess.
See: Java : How to determine the correct charset encoding of a stream for more details. :)
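As a sketch of what a "decent guess" can look like: if the file starts with a byte order mark (BOM), that is a strong hint. The code below assumes that "Unicode" in the question means UTF-16, which is how some Windows editors label it; the names `BomSniffer` and `charsetFromBom` are invented for this example. Many UTF-8 files have no BOM at all, so an empty result tells you nothing.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical helper, not part of any library.
final class BomSniffer {
    private BomSniffer() {}

    // Looks at the first bytes of a file and returns the charset implied
    // by a UTF-8 or UTF-16 BOM, or empty if no such BOM is present.
    // UTF-32 BOMs are not handled here; note that FF FE 00 00 (UTF-32LE)
    // would be reported as UTF-16LE by this sketch.
    static Optional<Charset> charsetFromBom(byte[] head) {
        if (head.length >= 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF) {
            return Optional.of(StandardCharsets.UTF_8);
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) {
            return Optional.of(StandardCharsets.UTF_16BE);
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) {
            return Optional.of(StandardCharsets.UTF_16LE);
        }
        return Optional.empty();
    }
}
```

When no BOM is found, you are back to guessing from the content, which is where the linked question comes in.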