
Read from keyboard in UTF-8

Tags:

java

I need to read input from the user, and I want to have support for non-latin letters, such as Å, Ä and Ö.

BufferedReader keyboard = new BufferedReader(new InputStreamReader(System.in));
PrintWriter out = new PrintWriter(new OutputStreamWriter(System.out, "UTF-8"), true);
out.println(keyboard.readLine());
out.println("Read with charset: " + Charset.defaultCharset().name());

When I run this code and enter a Latin letter, it works as expected (I type something, press enter, and it prints what I entered). But if I try with å, I get this:

å

�
Read with charset: UTF-8

I have to hit enter twice if the text ends with a non-Latin letter, and even then the letter is not displayed correctly. I have tried this in the NetBeans console and in the Windows command prompt, and neither gives the expected result.


I could not find a solution with UTF-8, but went with ISO-8859-1 instead. It worked with my Netbeans console (which should definitely be UTF-8) and in CMD when I first ran chcp 28591, changed the font (it was necessary in my case) and ran my program.

Dan Lindqvist asked Nov 10 '22 17:11


1 Answer

The code sample does not handle encodings consistently. It reads from the console using the system default charset and then writes the result out as UTF-8. Your system default may not be UTF-8, and to complicate things, your console's encoding may not match your system default.
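To illustrate the mismatch, here is a minimal sketch of what can happen when bytes produced in one charset are decoded with another. The class name CharsetMismatchDemo and the helper transcode are made up for this example, and the choice of ISO-8859-1 as the console side is an assumption taken from the question's follow-up, not something the code detects.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {
    // Hypothetical helper: encode text with one charset, then decode
    // the resulting bytes with another, as a mismatched reader would.
    static String transcode(String text, Charset encodeWith, Charset decodeWith) {
        byte[] raw = text.getBytes(encodeWith);
        return new String(raw, decodeWith);
    }

    public static void main(String[] args) {
        String input = "\u00e5"; // the letter a-ring, written as an escape

        // Console sends ISO-8859-1 bytes, program decodes them as UTF-8.
        String mismatched = transcode(input,
                StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);

        // Both sides agree on the charset.
        String matched = transcode(input,
                StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1);

        System.out.println("mismatched became U+FFFD: " + mismatched.equals("\uFFFD"));
        System.out.println("matched round-trips: " + matched.equals(input));
    }
}
```

In ISO-8859-1 the letter is the single byte 0xE5, which is not a complete UTF-8 sequence on its own, so the UTF-8 decoder substitutes the replacement character. This may be what produces the � in the question's output, although the actual console encoding there is not known.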

To do this correctly in the console, you would need to read in using your console encoding, and write out using your console encoding. If you are just testing this and need to write out to a file, for example, write it as UTF-8 and make sure you open it with a text-editor as UTF-8.
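As a rough sketch of reading and writing with the same encoding, the example below passes one charset to both the reader and the writer. The class name ConsoleEcho, the method echoLine, and taking the charset name from the first command-line argument are all assumptions made for illustration; you would need to supply whatever encoding your console actually uses (for example ISO-8859-1 after chcp 28591).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.Charset;

public class ConsoleEcho {
    // Hypothetical helper: reads one line from 'in' and writes it to 'out',
    // decoding and encoding with the same charset on both sides.
    static void echoLine(InputStream in, OutputStream out, Charset consoleCharset)
            throws IOException {
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, consoleCharset));
        // autoflush = true so println pushes the bytes out immediately
        PrintWriter writer =
                new PrintWriter(new OutputStreamWriter(out, consoleCharset), true);
        writer.println(reader.readLine());
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the console's charset name is given as the first
        // argument; otherwise fall back to the JVM default.
        Charset consoleCharset = args.length > 0
                ? Charset.forName(args[0])
                : Charset.defaultCharset();
        echoLine(System.in, System.out, consoleCharset);
    }
}
```

Run it as, for instance, java ConsoleEcho ISO-8859-1. Because the same charset is used in both directions, the bytes the console sends are the bytes it gets back, whichever encoding that turns out to be.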

Necreaux answered Nov 15 '22 06:11