
Fastest way to perform a lot of strings replace in Java

Tags: java, string, regex

I have to write a parser of sorts that takes a String and replaces certain sequences of characters with others. The code looks like this:

noHTMLString = noHTMLString.replaceAll("</p>", "\n");
noHTMLString = noHTMLString.replaceAll("<br/>", "\n\n");
noHTMLString = noHTMLString.replaceAll("<br />", "\n\n");
//here goes A LOT of lines like these ones

The function is very long and performs a lot of string replacements. The problem is that it takes a long time because the method is called very often, which slows down the application.

I have read some threads here about using StringBuilder as an alternative, but it lacks a replaceAll method. Also, as noted in "Does string.replaceAll() performance suffer from string immutability?", the replaceAll method in the String class works with a Pattern and a Matcher, and Matcher.replaceAll() uses a StringBuilder to store the eventually returned value, so I don't know whether switching to StringBuilder will really reduce the time needed to perform the substitutions.

Do you know a fast way to do a lot of String replacements? Do you have any advice for this problem?

Thanks.

EDIT: I have to create a report that has a few fields containing HTML text. For each row I call the method that replaces all the HTML tags and special characters inside these strings. With a full report it takes more than 3 minutes to parse all the text. The problem is that I have to invoke the method very often.

Averroes asked Nov 26 '10 11:11


1 Answer

I found that org.apache.commons.lang.StringUtils is the fastest option if you don't want to bother with a StringBuffer.

You can use it like this:
noHTMLString = StringUtils.replace(noHTMLString, "</p>", "\n");

I did performance testing, and it was faster than my custom StringBuffer solution, which was similar to the one @extraneon proposed.
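If pulling in Commons Lang is not an option, the same idea (literal matching instead of regex) can be sketched with the JDK alone. This is only a rough illustration of an indexOf plus StringBuilder loop, not the library's actual code. The names LiteralReplace, replaceLiteral, stripHtml and RULES are made up for the example, and the rule table holds just the three replacements shown in the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only.
final class LiteralReplace {

    // Replacement rules taken from the question; LinkedHashMap keeps them in insertion order.
    private static final Map<String, String> RULES = new LinkedHashMap<>();
    static {
        RULES.put("</p>", "\n");
        RULES.put("<br/>", "\n\n");
        RULES.put("<br />", "\n\n");
    }

    // Replaces every literal occurrence of search in text; no regex is compiled.
    static String replaceLiteral(String text, String search, String replacement) {
        if (text == null || text.isEmpty() || search == null || search.isEmpty() || replacement == null) {
            return text;
        }
        int start = 0;
        int end = text.indexOf(search, start);
        if (end < 0) {
            return text; // nothing to replace, so no new String is allocated
        }
        StringBuilder out = new StringBuilder(text.length());
        while (end >= 0) {
            // Copy the unmatched part, then the replacement.
            out.append(text, start, end).append(replacement);
            start = end + search.length();
            end = text.indexOf(search, start);
        }
        out.append(text, start, text.length()); // remaining tail
        return out.toString();
    }

    // Applies every rule in order to the given text.
    static String stripHtml(String text) {
        String result = text;
        for (Map.Entry<String, String> rule : RULES.entrySet()) {
            result = replaceLiteral(result, rule.getKey(), rule.getValue());
        }
        return result;
    }
}
```

You would call it as noHTMLString = LiteralReplace.stripHtml(noHTMLString);. Whether this beats the other approaches depends on your data, so measure it against your real report before switching.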

MatBanik answered Sep 30 '22 03:09