I am currently working on an Android project that parses data from the WordPress API. The post detail content is in HTML format, so I have to remove the HTML tags. Using the Java method Html.fromHtml().toString() I removed all the tags, but there are some image captions that I also have to delete. To delete a caption I have to find the tag by its class. How can I delete this content using the Html class?
<p class="wp-caption-text">android m marshmallow</
EDIT:
I solved my problem using a regular expression.
Put the specific HTML you want to remove into the pattern to get your regular expression.
yourHtml = yourHtml.replaceAll("Your_Regular_Expression","");
yourHtml = Html.fromHtml(yourHtml).toString();
If you want to get a match you can try this:
<(\w+).*?class="wp-caption-text".*?>[\s\S]*?<\/\1>
Regex101
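As a rough illustration of how the pattern above could be applied in plain Java, here is a minimal sketch. The class name CaptionStripper and the method stripCaptions are hypothetical, not from the original post. The sketch also swaps the two `.*?` parts inside the opening tag for `[^>]*`, which is my own variation: it keeps the match from starting at an earlier, unrelated tag such as a wrapping div. On Android you would then pass the result to Html.fromHtml(...).toString() as in the edit above.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Android SDK or the original post.
class CaptionStripper {
    // Matches an element whose opening tag contains class="wp-caption-text",
    // up to the first closing tag with the same name (backreference \1).
    // [^>]* keeps the attribute search inside a single opening tag.
    private static final Pattern CAPTION = Pattern.compile(
            "<(\\w+)[^>]*?class=\"wp-caption-text\"[^>]*>[\\s\\S]*?</\\1>");

    // Returns the HTML with every matched caption element removed.
    static String stripCaptions(String html) {
        return CAPTION.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String html = "<p>Intro</p>"
                + "<p class=\"wp-caption-text\">android m marshmallow</p>"
                + "<p>Body</p>";
        // prints <p>Intro</p><p>Body</p>
        System.out.println(stripCaptions(html));
    }
}
```

This sketch only matches the exact attribute text class="wp-caption-text". An element with several classes, single quotes, or a nested tag of the same name inside the caption would need a different pattern or a real HTML parser.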
I'd like to mention that this is not a perfect solution. Regular expressions are not well suited to parsing HTML, because the nested structures in that markup language are too complex for a regular expression to parse reliably in every case.