
How do I make eclipse print out weird characters in unicode?

So I'm trying to make my program output a text file with a list of names. Some of the names have weird characters, such as Åström.

I grabbed this list of names from a web page that is encoded in UTF-8, or at least I'm pretty sure it is, because the page source says

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

This is what I've tried so far:

public static void write(List<String> list) throws IOException {
    Writer out = new OutputStreamWriter(new FileOutputStream("test.txt"), "UTF-8");
    try {
        for (int i = 0; i < list.size(); i++) {
            try {
                byte[] utf8Bytes = list.get(i).getBytes("UTF-8");
                out.write(new String(utf8Bytes, "UTF-8"));
            } catch (UnsupportedEncodingException e) {
                e.printStackTrace();
            }

            out.write(System.getProperty("line.separator"));
        }
    } finally {
        out.close();
    }
}

and I'm a little confused as to why it's not working. The output I get is "Ã…strÃ¶m", which is very weird.

Can someone please point me in the right direction? Thanks!

And on another unrelated note, is there an easier way to write a new line to a text file besides the clunky

out.write(System.getProperty("line.separator"));

that I have? I saw that online somewhere and it works, but I was just wondering if there was a cleaner way.

wynnch asked Jun 04 '11 00:06



2 Answers

Set your Eclipse > Preferences > General > Workspace > Text file encoding to UTF-8.

trashgod answered Oct 08 '22 19:10


The content is indeed in UTF-8, and it appears OK if printed to the console. What may be causing the problem is the encoding and decoding of the string, which is unnecessary. Instead of an OutputStreamWriter, try using a java.io.PrintWriter. It has println methods that print the string followed by the system line separator. It would look something like:

printWriter.println(list.get(i));
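
To make that a bit more concrete, here is a minimal sketch of the whole method written around a PrintWriter. The class name NameWriter and the Path parameter are my own choices for illustration, not something from the question. It assumes Java 7 or later, and it passes the charset explicitly because a PrintWriter opened without one falls back to the platform default encoding. The sample name is written with Unicode escapes so the example does not depend on how the source file itself is encoded.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; the name is an assumption for this sketch.
public class NameWriter {

    // Writes one name per line, encoding the file as UTF-8
    // regardless of the platform default charset.
    public static void write(List<String> names, Path target) throws IOException {
        try (PrintWriter out = new PrintWriter(
                Files.newBufferedWriter(target, StandardCharsets.UTF_8))) {
            for (String name : names) {
                // println appends the system line separator for us.
                out.println(name);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // "\u00C5str\u00F6m" is Åström spelled with Unicode escapes.
        List<String> names = Arrays.asList("\u00C5str\u00F6m", "Smith");
        write(names, Paths.get("test.txt"));
    }
}
```

Note that the strings are written as they are, with no getBytes/new String round trip. One caveat: PrintWriter does not throw on write errors, so call out.checkError() if you need to detect them. If you would rather keep a plain Writer, BufferedWriter.newLine() is another way to avoid the System.getProperty("line.separator") call.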

Also, when opening the file to check it, try using a browser. Browsers let you choose the encoding after opening the file, so you can quickly try several encodings and see which one is really being used.
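
The same check can be sketched in code: take the bytes that would end up in the file and decode them with two different charsets, which is roughly what a viewer does when you switch its encoding. EncodingCheck and decodeAs are made-up names for this example, and ISO-8859-1 stands in for whatever single-byte encoding the viewer might be assuming.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are assumptions for this sketch.
public class EncodingCheck {

    // Decodes raw bytes with the given charset, the way a viewer
    // set to that encoding would interpret the file.
    public static String decodeAs(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "\u00C5str\u00F6m" is Åström spelled with Unicode escapes.
        String name = "\u00C5str\u00F6m";
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);

        String right = decodeAs(utf8, StandardCharsets.UTF_8);
        String wrong = decodeAs(utf8, StandardCharsets.ISO_8859_1);

        // 6 characters become 8 bytes, because the two non-ASCII
        // letters take two bytes each in UTF-8.
        System.out.println("bytes written: " + utf8.length);
        System.out.println("UTF-8 view intact: " + right.equals(name));
        // A single-byte charset shows every byte as its own character,
        // so the name comes back longer and garbled.
        System.out.println("ISO-8859-1 view intact: " + wrong.equals(name));
        System.out.println("ISO-8859-1 view length: " + wrong.length());
    }
}
```

If the file looks right when read as UTF-8 and garbled otherwise, the writing code is fine and the viewer's encoding setting is the thing to change.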

Javier C answered Oct 08 '22 21:10