I have a file which has characters like: " Joh 1:1 ஆதியிலே வார்த்தை இருந்தது, அந்த வார்த்தை தேவனிடத்திலிருந்தது, அந்த வார்த்தை தேவனாயிருந்தது. "
www.unicode.org/charts/PDF/U0B80.pdf
When I use the following code:
bufferedWriter = new BufferedWriter (new OutputStreamWriter(System.out, "UTF8"));
The output is boxes and other weird characters like this:
"�P�^����O֛���;�<�aYՠ؛"
Can anyone help?
Here is the complete code:
File f = new File("E:\\bible.docx");
Reader decoded = new InputStreamReader(new FileInputStream(f), StandardCharsets.UTF_8);
bufferedWriter = new BufferedWriter(new OutputStreamWriter(System.out, StandardCharsets.UTF_8));
char[] buffer = new char[1024];
int n;
StringBuilder build = new StringBuilder();
while (true) {
    n = decoded.read(buffer);
    if (n < 0) { break; }
    build.append(buffer, 0, n);
    bufferedWriter.write(buffer, 0, n);   // write only the n chars actually read
}
bufferedWriter.flush();
The StringBuilder value shows the UTF-8 characters, but when it is displayed in the window it shows up as boxes.
Found the answer to the problem! The encoding is correct (i.e. UTF-8): Java reads the file as UTF-8 and the String characters are correct. The problem is that there is no font to display them in NetBeans' output panel. After changing the font for the output panel (NetBeans -> Tools -> Options -> Miscellaneous -> Output tab) I got the expected result. The same applies when the text is displayed in a JTextArea (the font needs to be changed); see the sketch below. The font of the Windows cmd prompt, however, can't be changed in the same way.
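For the JTextArea case, a minimal sketch of the font change could look like the following; the font name "Latha" is only an assumption (any installed font that covers the Tamil block works):

import javax.swing.JFrame;
import javax.swing.JScrollPane;
import javax.swing.JTextArea;
import java.awt.Font;

public class TamilTextDemo {
    public static void main(String[] args) {
        JTextArea area = new JTextArea();
        // "Latha" is just an example; pick any installed font with Tamil glyphs (U+0B80–U+0BFF)
        area.setFont(new Font("Latha", Font.PLAIN, 18));
        area.setText("ஆதியிலே வார்த்தை இருந்தது, அந்த வார்த்தை தேவனிடத்திலிருந்தது.");

        JFrame frame = new JFrame("Tamil demo");
        frame.add(new JScrollPane(area));
        frame.setSize(500, 200);
        frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        frame.setVisible(true);
    }
}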
UTF-8 supports every Unicode character, which pragmatically means any natural language (Coptic, Sinhala, Phoenician, Cherokee, etc.), as well as many non-spoken notations (music notation, mathematical symbols, APL).
There is no difference between "utf8" and "utf-8"; they are simply two names for UTF-8, the most common Unicode encoding.
UTF (Unicode Transformation Format) is the family of standard encodings for representing a great variety of characters from any language. To overcome the limitation of ASCII, which covers only 128 characters, the UTF encodings were developed to represent the characters of every language.
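As a rough illustration for Tamil (my own sketch, not from the question): each character in the Tamil block becomes three bytes in UTF-8, which the snippet below prints out.

import java.nio.charset.StandardCharsets;

public class Utf8Bytes {
    public static void main(String[] args) {
        String letter = "ஆ";  // U+0B86, TAMIL LETTER AA
        byte[] bytes = letter.getBytes(StandardCharsets.UTF_8);
        // Prints: E0 AE 86 — three bytes for one code point
        for (byte b : bytes) {
            System.out.printf("%02X ", b & 0xFF);
        }
        System.out.println();
    }
}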
Because your output is encoded in UTF-8 but still contains the replacement character (U+FFFD, �), I believe the problem occurs when you read the data.
Make sure that you know what encoding your input stream uses, and set the encoding for the InputStreamReader accordingly. If the text is Tamil, I would guess it's probably UTF-8. I don't know if Java supports TACE-16. It would look something like this…
StringBuilder text = new StringBuilder();
try (InputStream encoded = ...) {
    Reader decoded = new InputStreamReader(encoded, StandardCharsets.UTF_8);
    char[] buffer = new char[1024];
    while (true) {
        int n = decoded.read(buffer);
        if (n < 0)
            break;
        text.append(buffer, 0, n);
    }
}
String verse = text.toString();
System.out is too close to the operating system to be versatile enough. In your case, the NetBeans console is probably using the operating-system encoding and an IDE-picked font. Re-wrapping standard out, as sketched below, at least pins down the encoding.
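A sketch of that re-wrapping (assuming the console can actually render Tamil; the font and code page still have to cooperate):

import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.PrintStream;

public class Utf8Console {
    public static void main(String[] args) throws Exception {
        // Replace the default, platform-encoded stream with an explicit UTF-8 one
        PrintStream utf8Out = new PrintStream(new FileOutputStream(FileDescriptor.out), true, "UTF-8");
        utf8Out.println("ஆதியிலே வார்த்தை இருந்தது");
    }
}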
Write to a file first. If you make it HTML, you can even double-click it and specify the right encoding internally. Mind using "UTF-8" then, as "UTF8" is Java-specific ("UTF-8" can be used in Java too). Maybe with Desktop.getDesktop().open(new File("... .html"));.
A small JFrame with a JTextPane would do too.
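A sketch of the file-first route described above (file name and verse text are placeholders): write the text to an HTML file that declares its own charset, then hand it to the desktop to open in the default browser.

import java.awt.Desktop;
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class HtmlPreview {
    public static void main(String[] args) throws Exception {
        String verse = "ஆதியிலே வார்த்தை இருந்தது";   // the text read earlier
        File out = new File("verse.html");              // placeholder output path
        try (Writer w = new OutputStreamWriter(new FileOutputStream(out), StandardCharsets.UTF_8)) {
            // The meta charset tells the browser how to decode the bytes we just wrote
            w.write("<!DOCTYPE html><html><head><meta charset=\"UTF-8\"></head><body>");
            w.write(verse);
            w.write("</body></html>");
        }
        if (Desktop.isDesktopSupported()) {
            Desktop.getDesktop().open(out);
        }
    }
}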