I have a string "\\u003c" that is UTF-8 encoded. I am unable to decode it to Unicode because of the double backslash. How do I get "\u003c" from "\\u003c"? I am using Java.
I tried
myString.replace("\\\\", "\\");
but could not achieve what I wanted.
This is my code:
String myString = FileUtils.readFileToString(file);
String a = myString.replace("\\\\", "\\");
byte[] utf8 = a.getBytes();
// Convert from UTF-8 to Unicode
a = new String(utf8, "UTF-8");
System.out.println("Converted string is:"+a);
and the content of the file is
\u003c
You can use String#replaceAll:
String str = "\\\\u003c";
str= str.replaceAll("\\\\\\\\", "\\\\");
System.out.println(str);
It looks weird because the first argument is a string defining a regular expression, and \ is a special character both in string literals and in regular expressions. To actually put a \ in our search string, we need to escape it (\\) in the literal. But to actually put a \ in the regular expression, we have to escape it at the regular expression level as well. So to literally get \\ in a string, we need to write \\\\ in the string literal; and to get two literal backslashes to the regular expression engine, we need to escape those as well, so we end up with \\\\\\\\. That is:
String Literal    String                       Meaning to Regex
----------------  ---------------------------  -------------------------
\                 Escape the next character    Would depend on next char
\\                \                            Escape the next character
\\\\              \\                           Literal \
\\\\\\\\          \\\\                         Literal \\
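The levels in the table can be checked directly. This is only a small sketch: the class name BackslashLevels and the constant names are made up for illustration, and it relies on nothing beyond java.util.regex.Pattern.

```java
import java.util.regex.Pattern;

class BackslashLevels {
    // Each pair of backslashes in a string literal becomes one backslash in the string.
    static final String ONE = "\\";        // 1-character string
    static final String TWO = "\\\\";      // 2-character string
    static final String FOUR = "\\\\\\\\"; // 4-character string

    public static void main(String[] args) {
        System.out.println(ONE.length());  // 1
        System.out.println(TWO.length());  // 2
        System.out.println(FOUR.length()); // 4

        // Used as a regex, the 2-character string matches exactly one backslash.
        System.out.println(Pattern.matches(TWO, ONE));  // true
        // Used as a regex, the 4-character string matches exactly two backslashes.
        System.out.println(Pattern.matches(FOUR, TWO)); // true
    }
}
```

The first three lines show the string-literal level of escaping; the last two show the regex level applied on top of it.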
In the replacement parameter, even though it's not a regex, replaceAll still treats \ and $ specially, so we have to escape them in the replacement as well. So to get one backslash in the replacement, we need four in that string literal.
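If the doubled escaping is hard to read, two standard alternatives may help. String#replace treats both arguments literally, and Pattern.quote with Matcher.quoteReplacement does the regex-level escaping for you. The sketch below compares all three; the class and method names (CollapseBackslashes, withReplaceAll, withReplace, withQuoting) are hypothetical. Note that none of them decodes the escape sequence into the character it names; they only collapse the two backslashes into one.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CollapseBackslashes {
    // Regex version from the answer: pattern and replacement are both escaped by hand.
    static String withReplaceAll(String s) {
        return s.replaceAll("\\\\\\\\", "\\\\");
    }

    // Literal version: String#replace does no regex or replacement processing.
    static String withReplace(String s) {
        return s.replace("\\\\", "\\");
    }

    // Regex version where the JDK helpers add the regex-level escaping.
    static String withQuoting(String s) {
        return s.replaceAll(Pattern.quote("\\\\"), Matcher.quoteReplacement("\\"));
    }

    public static void main(String[] args) {
        // A 7-character string: two backslashes followed by u003c.
        String input = "\\\\u003c";
        // Each call prints the same 6-character result: one backslash, then u003c.
        System.out.println(withReplaceAll(input));
        System.out.println(withReplace(input));
        System.out.println(withQuoting(input));
    }
}
```

The literal version needs only the string-literal level of escaping, which is why its arguments have half as many backslashes as the replaceAll call.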