Remove style tag from a text file with regex

Tags: java, regex

I need to remove style tags from a text file.

I tried the following code:

String text = readFile("E:/textwithstyletags.txt");
retVal = text.replaceAll("<style(.+?)</style>", "");

It works when the style tag contains no line breaks, e.g. <style> body{ color:red; } </style>

It doesn't work when the tag spans multiple lines, like this:

<style> 
body{ 
color:red; 
} 
</style>

Mark Timothy asked Apr 27 '15 06:04


3 Answers

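By default, . does not match line terminators, so the lazy (.+?) can never reach a </style> that sits on a later line. You can use [\s\S] in place of . in your regex, since it matches any character, including newlines: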

retVal = text.replaceAll("<style([\\s\\S]+?)</style>", "");
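
As a rough sketch of the same idea, Java's Pattern.DOTALL flag makes . match line terminators, which has the same effect as swapping in [\s\S]. The class and method names here (StyleTagStripper, stripStyle) are made up for illustration, and the pattern is a bit stricter than the one above (it allows attributes and ignores case), which is an assumption about what the input looks like. Regex-based stripping can still miss unusual markup.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original question or answer.
class StyleTagStripper {
    // DOTALL: '.' also matches line terminators, so the block may span lines.
    // CASE_INSENSITIVE: also matches <STYLE> ... </STYLE>.
    // [^>]* allows attributes such as type="text/css" on the opening tag.
    private static final Pattern STYLE_BLOCK = Pattern.compile(
            "<style\\b[^>]*>.*?</style\\s*>",
            Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    // Returns the input with every <style>...</style> block removed.
    static String stripStyle(String text) {
        return STYLE_BLOCK.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        String text = "before\n<style>\nbody{\ncolor:red;\n}\n</style>\nafter";
        // The multi-line style block is removed; the surrounding lines remain.
        System.out.println(stripStyle(text));
    }
}
```

Compiling the pattern once and reusing it avoids recompiling on every call, which String.replaceAll does internally.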

karthik manchala answered Nov 15 '22 04:11


Tested on regex101.

Pattern:

<style((.|\n|\r)*?)<\/style>

Applied to your code:

String text = readFile("E:/textwithstyletags.txt");
retVal = text.replaceAll("<style((.|\\n|\\r)*?)<\\/style>", "");

Jared Rummler answered Nov 15 '22 05:11


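Try this regex. The (?i) flag makes the match case-insensitive, and the (?s) flag lets . match line breaks, which is needed for style blocks that span several lines:

retVal = text.replaceAll("(?is)<style.*?>.*?</style>", "");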

On a side note, you can look at jsoup, a Java library made for HTML manipulation.

Rahul Tripathi answered Nov 15 '22 05:11