I want to extract the plain text from a given piece of HTML. I tried using a regex and came up with:
String target = val.replaceAll("<a.*</a>", "");
My main requirement is to remove everything between <a> and </a> (including the link name). With the code above, all the other content gets removed as well.
<a href="www.google.com">Google</a>
This is a Google Link
<a href="www.yahoo.com">Yahoo</a>
This is a Yahoo Link
Here I want to remove everything between <a> and </a>.
The final output should be:
This is a Google Link This is a Yahoo Link
Use a non-greedy quantifier (*?). For example, to remove the link entirely:
String target = val.replaceAll("<a.*?</a>", "");
Or to replace the link with just the link tag's contents:
String target = val.replaceAll("<a[^>]*>(.*?)</a>", "$1");
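To see the difference between the greedy and non-greedy forms, here is a small self-contained sketch. The class name AnchorRegexDemo, the method names, and the single-line sample string are made up for illustration. The patterns are the ones discussed here, with $1 as the replacement in the last one so that only the link text is kept.

```java
class AnchorRegexDemo {
    // Hypothetical sample: the question's markup, placed on a single line.
    static final String SAMPLE =
        "<a href=\"www.google.com\">Google</a> This is a Google Link "
        + "<a href=\"www.yahoo.com\">Yahoo</a> This is a Yahoo Link";

    // Greedy: .* runs from the first <a to the LAST </a> on the line.
    static String removeGreedy(String html) {
        return html.replaceAll("<a.*</a>", "");
    }

    // Non-greedy: .*? stops at the nearest </a>, so each link is removed separately.
    static String removeNonGreedy(String html) {
        return html.replaceAll("<a.*?</a>", "");
    }

    // Keep only the link text by substituting capture group 1.
    static String unwrap(String html) {
        return html.replaceAll("<a[^>]*>(.*?)</a>", "$1");
    }

    public static void main(String[] args) {
        System.out.println("[" + removeGreedy(SAMPLE) + "]");
        // prints "[ This is a Yahoo Link]"
        System.out.println("[" + removeNonGreedy(SAMPLE).replaceAll("\\s+", " ").trim() + "]");
        // prints "[This is a Google Link This is a Yahoo Link]"
        System.out.println("[" + unwrap(SAMPLE) + "]");
        // prints "[Google This is a Google Link Yahoo This is a Yahoo Link]"
    }
}
```

By default . does not match line terminators, so the greedy pattern only swallows everything when the links sit on the same line. If a single link can span several lines, prefixing the pattern with (?s) lets . match newlines too.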
However, I would still recommend using a proper DOM manipulation API.
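The answer doesn't name a particular library. jsoup is a popular third-party choice, but as a dependency-free sketch the JDK's own javax.swing.text.html parser can do the job. Note that it is callback-based rather than a true DOM, and it only targets HTML 3.2, so treat this as an illustration of the parser approach rather than a recommendation. LinkTextStripper and textOutsideLinks are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

class LinkTextStripper {
    // Collects all text that is NOT inside an <a> element, joined by single spaces.
    static String textOutsideLinks(String html) {
        StringBuilder out = new StringBuilder();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            private int anchorDepth = 0; // > 0 while we are inside an <a> element

            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.A) anchorDepth++;
            }

            @Override
            public void handleEndTag(HTML.Tag tag, int pos) {
                if (tag == HTML.Tag.A && anchorDepth > 0) anchorDepth--;
            }

            @Override
            public void handleText(char[] data, int pos) {
                if (anchorDepth > 0) return; // skip the link name
                String chunk = new String(data).trim();
                if (chunk.isEmpty()) return;
                if (out.length() > 0) out.append(' ');
                out.append(chunk);
            }
        };
        try {
            // Third argument: ignore any charset declared inside the document.
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<a href=\"www.google.com\">Google</a> This is a Google Link "
                    + "<a href=\"www.yahoo.com\">Yahoo</a> This is a Yahoo Link";
        System.out.println(textOutsideLinks(html));
    }
}
```

Unlike the regex, this keeps working when the tag has extra attributes, unusual spacing, or line breaks inside it, because the parser decides where the element starts and ends.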