How can I convert an international (e.g. Russian) String to \u numbers (Unicode escapes), e.g. \u041e\u041a for the Cyrillic "ОК"?
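To convert a String to UTF-8 in Java, use the getBytes() method. It encodes the String into a sequence of bytes and returns a byte array; the charset-taking overload is declared as public byte[] getBytes(Charset charset). Pass the charset explicitly, e.g. getBytes(StandardCharsets.UTF_8), because the no-argument overload uses the platform default charset. Note that this gives you raw UTF-8 bytes, not \u escapes.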
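To make the getBytes() behaviour concrete, here is a minimal sketch that encodes the two Cyrillic letters from the question and prints the resulting bytes as hex. The class name Utf8BytesDemo and the helper toHex are made up for this illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any library.
final class Utf8BytesDemo {

    // Formats each byte as two lowercase hex digits, separated by spaces.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x ", b & 0xff));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // U+041E and U+041A, written as escapes so the source file stays ASCII.
        byte[] utf8 = "\u041e\u041a".getBytes(StandardCharsets.UTF_8);
        System.out.println(toHex(utf8)); // prints: d0 9e d0 9a
    }
}
```

Each Cyrillic letter takes two bytes in UTF-8, which is why two characters become four bytes.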
According to section 3.3 of the Java Language Specification (JLS), a Unicode escape consists of a backslash character (\) followed by one or more 'u' characters and four hexadecimal digits. The compiler translates these escapes before any other processing, so for example \u000A is treated as a line feed wherever it appears in the source.
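Given that format, the conversion asked about can be done with plain JDK code. The following is only a sketch: the class name UnicodeEscaper and the method toUnicodeEscapes are invented here, and it assumes you want printable ASCII left untouched and everything else escaped per UTF-16 code unit.

```java
// Hypothetical helper, not part of the JDK.
final class UnicodeEscaper {

    private UnicodeEscaper() {}

    // Leaves printable ASCII as-is and writes every other char as a
    // backslash, 'u', and four lowercase hex digits.
    static String toUnicodeEscapes(String input) {
        StringBuilder sb = new StringBuilder(input.length() * 6);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c >= 0x20 && c < 0x7f) {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04x", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The literal below is the Cyrillic pair from the question.
        System.out.println(toUnicodeEscapes("\u041e\u041a")); // prints: \u041e\u041a
    }
}
```

Characters outside the Basic Multilingual Plane come out as two escapes (a surrogate pair), which is also how Java source represents them.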
Internally, Java keeps all strings in Unicode (as a sequence of UTF-16 code units). Since not all text received from users or the outside world is Unicode-encoded, your application may have to convert it from a non-Unicode encoding to Unicode.
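As a small illustration of that conversion, the sketch below decodes bytes that are assumed to be ISO-8859-1 (Latin-1) into a Java String. The class DecodeDemo and the method decodeLatin1 are hypothetical names; in a real application you would use whatever charset the incoming data actually has.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any library.
final class DecodeDemo {

    // Interprets the raw bytes as ISO-8859-1 and returns a Unicode String.
    static String decodeLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "cafe" with an e-acute as the last letter, encoded in ISO-8859-1.
        byte[] raw = {(byte) 0x63, (byte) 0x61, (byte) 0x66, (byte) 0xE9};
        String text = decodeLatin1(raw);
        System.out.println(text.length());        // prints: 4
        System.out.println((int) text.charAt(3)); // prints: 233 (U+00E9)
    }
}
```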
The JDK ships a command-line tool for this, native2ascii (it was removed in JDK 9, so this needs JDK 8 or earlier). Run it as follows:
native2ascii -encoding utf8 src.txt output.txt
Example:
src.txt
بسم الله الرحمن الرحيم
output.txt
\u0628\u0633\u0645 \u0627\u0644\u0644\u0647 \u0627\u0644\u0631\u062d\u0645\u0646 \u0627\u0644\u0631\u062d\u064a\u0645
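If you want to use it from your Java application, you can wrap the command like this:
    import java.io.File;

    String pathSrc = "./tmp/src.txt";
    String pathOut = "./tmp/output.txt";
    // Pass each argument separately so paths containing spaces still work.
    Process process = new ProcessBuilder("native2ascii", "-encoding", "utf8",
            new File(pathSrc).getAbsolutePath(),
            new File(pathOut).getAbsolutePath()).inheritIO().start();
    // Wait for the tool to finish, otherwise output.txt may not be complete yet.
    process.waitFor();
    System.out.println("THE END");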
Then read the content of the new file.
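Reading the result back could look roughly like this. It is a sketch only: EscapedFileReader and readEscaped are names invented here, and it assumes the output file contains nothing but ASCII, which is what native2ascii is meant to produce.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the JDK.
final class EscapedFileReader {

    // Reads the whole file as ASCII text; works on Java 7 and later.
    static String readEscaped(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        // Same output path as in the snippet above.
        System.out.println(readEscaped(Paths.get("./tmp/output.txt")));
    }
}
```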
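You could use escapeJavaStyleString from org.apache.commons.lang.StringEscapeUtils. As far as I know, the public entry point for it is StringEscapeUtils.escapeJava(String), so check the version of Commons Lang you have.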