I am trying to read Unicode characters from a text file saved as UTF-8, using Java. My text file is as follows:
अ, अदेबानि ,अन, अनसुला, अनसुलि, अनफावरि, अनजालु, अनद्ला, अमा, अर, अरगा, अरगे, अरन, अराय, अलखद, असे, अहा, अहिंसा, अग्रं, अन्थाइ, अफ्रि, बियन, खियन, फियन, बन, गन, थन, हर, हम, जम, गल, गथ, दरसे, दरनै, थनै, थथाम, सथाम, खफ, गल, गथ, मिख, जथ, जाथ, थाथ, दद, देख, न, नेथ, बर, बुंथ, बिथ, बिख, बेल, मम, आ, आइ, आउ, आगदा, आगसिर
I have tried the following code:
import java.io.*;

public class UcharRead
{
    public static void main(String args[])
    {
        try
        {
            String str;
            // Decode the bytes of the file as UTF-8 while reading
            BufferedReader bufReader = new BufferedReader(new InputStreamReader(new FileInputStream("research_words.txt"), "UTF-8"));
            while ((str = bufReader.readLine()) != null)
            {
                System.out.println(str);
            }
            bufReader.close();
        }
        catch (Exception e)
        {
            // Report the failure instead of silently swallowing it
            e.printStackTrace();
        }
    }
}
The output I get is ????????????????????????. Can anyone help me?
You are (most likely) reading the text correctly, but when you write it out, you also need to use UTF-8 for the output. Otherwise every character that cannot be represented in your default encoding will be turned into a question mark.
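To see where the question marks come from, here is a small sketch (not the asker's code; the class name QuestionMarkDemo and the helper roundTrip are made up for illustration). It encodes one word from the sample file in a charset that has no Devanagari letters and decodes it again, which is roughly the trip the text makes on its way to a non-UTF-8 console:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class QuestionMarkDemo {
    // Hypothetical helper: encode the text in the given charset, then decode
    // it again with the same charset.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // One word from the sample file, written with escapes so this source
        // file compiles the same way regardless of its own encoding
        String word = "\u0905\u092E\u093E";

        // The charset the JVM falls back to when none is specified
        System.out.println("Default charset: " + Charset.defaultCharset());

        // US-ASCII cannot represent these characters, so each one is replaced
        // by '?' during encoding; this prints ???
        System.out.println(roundTrip(word, StandardCharsets.US_ASCII));

        // UTF-8 can represent them, so the text survives; this prints true
        System.out.println(roundTrip(word, StandardCharsets.UTF_8).equals(word));
    }
}
```

If the first line printed on your machine is not UTF-8, that is a strong hint the problem is on the output side rather than in the reading code.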
Try writing it to a File instead of System.out (and specify the proper encoding):
Writer w = new OutputStreamWriter(
        new FileOutputStream("x.txt"), "UTF-8");
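For completeness, a minimal sketch of the whole read-then-write round trip, assuming the file names from the question and the snippet above. The class name Utf8Copy and the method copy are made up for the example, and it uses java.nio.file.Files rather than the stream classes in the question, which is just one way to do it:

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class Utf8Copy {
    // Hypothetical helper: copies a text file line by line, decoding the
    // input and encoding the output as UTF-8.
    static void copy(Path in, Path out) throws IOException {
        // try-with-resources closes (and flushes) both files, even on error
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line);
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // File names taken from the question and the snippet above
        copy(Paths.get("research_words.txt"), Paths.get("x.txt"));
    }
}
```

Open x.txt in an editor that understands UTF-8. If the words show up correctly there, the reading side was fine all along and only the console output is the problem.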
If you are reading the text correctly as UTF-8, make sure that your console also supports UTF-8. If you are using Eclipse, you can enable UTF-8 encoding for your console under:
Run Configurations -> Common -> Encoding -> select UTF-8
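Once the console itself accepts UTF-8, you may still need to tell Java to send UTF-8 to it, because on many setups System.out encodes with the platform default charset. A minimal sketch, assuming a UTF-8 capable console (the class name Utf8Console and the helper utf8 are made up for the example):

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class Utf8Console {
    // Hypothetical helper: wraps any OutputStream in a PrintStream that
    // encodes as UTF-8 and flushes automatically on println.
    static PrintStream utf8(OutputStream target) throws UnsupportedEncodingException {
        return new PrintStream(target, true, "UTF-8");
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        PrintStream out = utf8(System.out);
        // One word from the sample file, written with escapes so this source
        // file compiles the same way regardless of its own encoding
        out.println("\u0905\u092E\u093E");
    }
}
```

In the program from the question, that would mean printing each line with out.println(str) instead of System.out.println(str). If the console is not really UTF-8, this produces garbled bytes rather than question marks, so it only helps in combination with the console setting above.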