I need to clear my string of the following substrings:
\n
\uXXXX (X being a digit or a character)
e.g. "OR\n\nThe Central Site Engineering\u2019s \u201cfrontend\u201d, where developers turn to"
-> "OR The Central Site Engineering frontend , where developers turn to"
I tried using the String method replaceAll, but I don't know how to handle the \uXXXX part, and it didn't work for the \n either:
String s = "\\n";
data=data.replaceAll(s," ");
What would this regex look like in Java?
Thanks for the help.
The problem with string.replaceAll("\\n", " "); is that replaceAll expects a regular expression, and \ in regex is a special character used, for instance, to create character classes like \d (which represents digits) or to escape regex special characters like +.
So if you want to match a literal \ in Java's regex you need to escape it twice: once for the regex (\\) and once more for the string literal ("\\\\"), like replaceAll("\\\\n"," ").
You can also let the regex engine do the escaping for you and use the replace method, like
replace("\\n"," ")
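As a minimal sketch of the difference, the class below compares the two calls on a string that contains a literal backslash followed by n. The class and method names (BackslashNDemo, withReplaceAll, withReplace) are made up for illustration.

```java
public class BackslashNDemo {
    // Regex version: the pattern is backslash-backslash-n, which matches
    // a literal backslash followed by the letter n.
    static String withReplaceAll(String data) {
        return data.replaceAll("\\\\n", " ");
    }

    // Non-regex version: replace() treats its first argument as literal text,
    // so only the string-literal escaping is needed.
    static String withReplace(String data) {
        return data.replace("\\n", " ");
    }

    public static void main(String[] args) {
        // Contains backslash + n twice, not real newline characters.
        String data = "OR\\n\\nThe Central";
        System.out.println(withReplaceAll(data)); // OR  The Central
        System.out.println(withReplace(data));    // OR  The Central
        // The regex \n matches a real newline, so nothing is replaced here.
        System.out.println(data.replaceAll("\\n", " ").equals(data)); // true
    }
}
```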
Now, to remove \uXXXX we can use
replaceAll("\\\\u[0-9a-fA-F]{4}","")
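A short sketch of that call in isolation, assuming the escapes appear in the data as a literal backslash, the letter u and four hex digits. The names UnicodeEscapeDemo and stripUnicodeEscapes are hypothetical.

```java
public class UnicodeEscapeDemo {
    // Removes every literal backslash + 'u' + four hex digits sequence.
    static String stripUnicodeEscapes(String data) {
        return data.replaceAll("\\\\u[0-9a-fA-F]{4}", "");
    }

    public static void main(String[] args) {
        // The doubled backslashes keep these as plain text in the string,
        // not as characters decoded by the compiler.
        String data = "Engineering\\u2019s \\u201cfrontend\\u201d";
        System.out.println(stripUnicodeEscapes(data)); // Engineerings frontend
    }
}
```

Note that the escape is deleted rather than replaced with a space, so the text on either side of it is joined together.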
Also remember that Strings are immutable, so a str.replace(..) call doesn't change the value of str; it creates a new String. So if you want to store that new string in str
you will need to use
str = str.replace(..)
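To illustrate the point about immutability, here is a small sketch with two hypothetical helpers: one discards the result of replace, the other assigns it back.

```java
public class ImmutableStringDemo {
    // Calls replace() but ignores the returned String, so str is unchanged.
    static String discardResult(String str) {
        str.replace("\\n", " ");
        return str;
    }

    // Assigns the returned String back to the variable.
    static String keepResult(String str) {
        str = str.replace("\\n", " ");
        return str;
    }

    public static void main(String[] args) {
        String input = "a\\nb"; // literal backslash + n between a and b
        System.out.println(discardResult(input).equals(input)); // true
        System.out.println(keepResult(input));                  // a b
    }
}
```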
So your solution can look like
String text = "\"OR\\n\\nThe Central Site Engineering\\u2019s \\u201cfrontend\\u201d, where developers turn to\"";
text = text.replaceAll("(\\\\n)+"," ")
.replaceAll("\\\\u[0-9A-Fa-f]{4}", "");