
Reading a text file with UTF-8 encoding using Java

I have a problem reading a text file with UTF-8 encoding. I'm using Java with the NetBeans 7.2.1 platform.

I already configured the Java project to handle UTF-8: Java project ==> right click ==> Properties ==> Source ==> UTF-8

But I still get unknown characters in the output: ����� �������� ���� �

The code:

File fileDirs = new File("C:\\file.txt");

BufferedReader in = new BufferedReader(
new InputStreamReader(new FileInputStream(fileDirs), "UTF-8"));

String str;

while ((str = in.readLine()) != null) {
    System.out.println(str);
}

Any other ideas?

Thanks

Abrial asked Feb 17 '13 05:02



4 Answers

Use:

    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.io.UnsupportedEncodingException;

    public class test {
        public static void main(String[] args) {
            File fileDir = new File("PATH_TO_FILE");

            // try-with-resources closes the reader even if reading fails
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(new FileInputStream(fileDir), "UTF-8"))) {
                String str;

                // Decode the file as UTF-8 and print it line by line
                while ((str = in.readLine()) != null) {
                    System.out.println(str);
                }
            } catch (UnsupportedEncodingException e) {
                // Thrown if the charset name is not supported
                System.out.println(e.getMessage());
            } catch (IOException e) {
                // Thrown if the file cannot be opened or read
                System.out.println(e.getMessage());
            }
        }
    }

The charset name is passed as a string, so you need to put UTF-8 in quotes.

Shobhit Sharma answered Oct 20 '22 07:10



You need to specify the encoding of the InputStreamReader using the Charset parameter.

// fis is the FileInputStream opened on the file;
// pass the charset the file was actually saved in
Charset inputCharset = Charset.forName("ISO-8859-1");
InputStreamReader isr = new InputStreamReader(fis, inputCharset);

This works for me. I hope it helps you.
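
As a rough sketch of the same idea for a file that really is UTF-8: on Java 7 or later, the charset can be passed as the constant `StandardCharsets.UTF_8`, and `Files.newBufferedReader` does the stream wrapping. The class name `CharsetRead`, the helper `readLines`, and the fallback path `file.txt` are made up for illustration and are not from the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class CharsetRead {
    // Reads all lines of a file, decoding its bytes with the given charset.
    public static List<String> readLines(Path path, Charset charset) throws IOException {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader even if decoding fails
        try (BufferedReader reader = Files.newBufferedReader(path, charset)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical path: pass the real file location as the first argument.
        Path path = Paths.get(args.length > 0 ? args[0] : "file.txt");
        for (String line : readLines(path, StandardCharsets.UTF_8)) {
            System.out.println(line);
        }
    }
}
```

One difference worth knowing: `InputStreamReader` silently substitutes � for bytes it cannot decode, while a reader from `Files.newBufferedReader` throws an `IOException` (a `MalformedInputException`) when the file is not valid in the given charset. That makes it a quick way to check whether the file is really UTF-8.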

jinkal answered Oct 20 '22 07:10



You are reading the file correctly, but the problem seems to be the default encoding of System.out. Try this to print the UTF-8 string:

PrintStream out = new PrintStream(System.out, true, "UTF-8");
out.println(str);
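
To see why the output encoding matters, here is a small sketch that captures the bytes a `PrintStream` would send to the console for the same string under different charsets. The names `ConsoleEncoding` and `printedBytes` are hypothetical, and the sample characters are only for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.util.Arrays;

public class ConsoleEncoding {
    // Returns the bytes a PrintStream using charsetName would emit for text.
    public static byte[] printedBytes(String text, String charsetName) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            PrintStream out = new PrintStream(sink, true, charsetName);
            out.print(text);
            out.flush();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        // The JVM default charset; it often differs between Windows and Linux.
        System.out.println("Default charset: " + Charset.defaultCharset());

        String arabicMeem = "\u0645"; // one Arabic letter
        // UTF-8 can encode the letter (as two bytes) ...
        System.out.println(Arrays.toString(printedBytes(arabicMeem, "UTF-8")));
        // ... while ISO-8859-1 cannot, so the stream substitutes '?'.
        System.out.println(Arrays.toString(printedBytes(arabicMeem, "ISO-8859-1")));
    }
}
```

Even with a UTF-8 `PrintStream`, the console or IDE output window still has to interpret those bytes as UTF-8, so its own encoding setting may need to match.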
MoveFast answered Oct 20 '22 06:10



I ran into the same problem: every time it finds a special character, it marks it as ��. To solve this, I tried using the ISO-8859-1 encoding:

String line;
BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream("txtPath"), "ISO-8859-1"));

while ((line = br.readLine()) != null) {
    // process each line here
}

I hope this can help anyone who sees this post.
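
If switching to ISO-8859-1 fixes the output, the file was probably not saved as UTF-8 in the first place. A minimal sketch for checking that, assuming the file's bytes are already in memory: it decodes them strictly and reports whether they form valid UTF-8. The names `EncodingCheck` and `isValidUtf8` are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class EncodingCheck {
    // Returns true only if the bytes decode as UTF-8 without any errors.
    public static boolean isValidUtf8(byte[] bytes) {
        // REPORT makes the decoder throw instead of substituting U+FFFD.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String word = "caf\u00e9"; // ends with an accented e
        byte[] asUtf8 = word.getBytes(StandardCharsets.UTF_8);
        byte[] asLatin1 = word.getBytes(StandardCharsets.ISO_8859_1);

        // The same text yields different bytes depending on how it was saved.
        System.out.println("UTF-8 bytes valid as UTF-8:      " + isValidUtf8(asUtf8));
        System.out.println("ISO-8859-1 bytes valid as UTF-8: " + isValidUtf8(asLatin1));
    }
}
```

To run it against a real file, the bytes could come from `Files.readAllBytes`.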

Joshua Joel Cleveland answered Oct 20 '22 07:10
