Removing <a href> tag using regex

Question

I want to extract the plain text from some given HTML code. I tried using a regex and came up with:

String target = val.replaceAll("<a.*</a>", "");

My main requirement is to remove everything between <a> and </a>, including the link name. With the code above, all the other content gets removed as well.

<a href="www.google.com">Google</a> This is a Google Link

<a href="www.yahoo.com">Yahoo</a> This is a Yahoo Link

Here I want to remove everything between <a> and </a>. The final output should be:

This is a Google Link This is a Yahoo Link

p.s.w.g · Accepted Answer

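Use a non-greedy quantifier (*?). The greedy .* matches as much as it can, so it runs from the first <a to the last </a> in the string and takes the text between the links with it; .*? stops at the first </a> it reaches. For example, to remove the link entirely: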

String target = val.replaceAll("<a.*?</a>", "");
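To see the difference side by side, here is a minimal sketch that runs both patterns over the sample from the question. The class name, method names, and the single-line sample string are assumptions made for illustration, not part of the original post.

```java
final class LinkStripDemo {
    // Greedy: .* extends to the LAST </a>, so the text between the two links is lost.
    static String stripGreedy(String html) {
        return html.replaceAll("<a.*</a>", "");
    }

    // Non-greedy: .*? stops at the FIRST </a>, so each link is removed separately.
    static String stripLazy(String html) {
        return html.replaceAll("<a.*?</a>", "");
    }

    public static void main(String[] args) {
        // Hypothetical input: both links from the question on one line.
        String val = "<a href=\"www.google.com\">Google</a> This is a Google Link "
                   + "<a href=\"www.yahoo.com\">Yahoo</a> This is a Yahoo Link";

        System.out.println(stripGreedy(val).trim());
        // Collapse the whitespace left behind where the links used to be.
        System.out.println(stripLazy(val).replaceAll("\\s+", " ").trim());
    }
}
```

Note that . does not match line terminators by default in Java, so a link that spans several lines would need the (?s) flag at the start of the pattern.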

Or to replace the link with just the link tag's contents:

String target = val.replaceAll("<a[^>]*>(.*?)</a>", "$1");
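As a rough sketch of the capture-group variant: the parentheses capture the link text, and $1 in the replacement puts that text back in place of the whole tag. The class name, method name, and sample input below are hypothetical.

```java
final class LinkTextDemo {
    // Replace each <a ...>text</a> with just "text" (capture group 1).
    static String keepLinkText(String html) {
        return html.replaceAll("<a[^>]*>(.*?)</a>", "$1");
    }

    public static void main(String[] args) {
        // Hypothetical input taken from the first sample line in the question.
        String val = "<a href=\"www.google.com\">Google</a> This is a Google Link";
        System.out.println(keepLinkText(val));
    }
}
```

Here [^>]* consumes the attributes up to the end of the opening tag, so only the text between the opening and closing tags ends up in the group.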

However, I would still recommend using a proper DOM manipulation API.


Tags:

html

regex

Sathesh S
