Split text after a specified length without breaking words, using Grails

Tags: java, regex, groovy

I have a long string that I need to split into a list of strings, each no longer than 50 characters. The tricky part for me is making sure the regex finds the last whitespace before the 50-character limit, so that every break falls between words and no word is cut off.

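public List<String> splitInfoText(String msg) {
    int MAX_WIDTH = 50;
    def line = [];
    String[] words;
    msg = msg.trim();
    words = msg.split(" ");
    StringBuffer s = new StringBuffer();
    words.each { word ->
        s.append(word + " ");
        // The trailing space counts toward the length, so a line of exactly MAX_WIDTH characters wraps early
        if (s.length() > MAX_WIDTH) {
            // Drop the word that overflowed, emit the line, then start the next line with that word
            s.replace(s.length() - word.length() - 1, s.length(), " ");
            line << s.toString().trim();
            s = new StringBuffer(word + " ");
        }
    }
    // Emit whatever is left in the buffer as the last line
    if (s.length() > 0)
        line << s.toString().trim();
    return line;
}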
Nimmy asked Mar 12 '12 04:03



2 Answers

Try this:

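import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// subjectString is the text to split.
// The pattern takes up to 50 characters, followed by a whitespace character or the end of the input.
List<String> matchList = new ArrayList<String>();
Pattern regex = Pattern.compile(".{1,50}(?:\\s|$)", Pattern.DOTALL);
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
    matchList.add(regexMatcher.group());
}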
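
To see how this behaves end to end, here is a minimal, self-contained sketch that wraps the snippet in a method. The class name LineSplitter, the method name split, and the sample sentence are hypothetical and exist only for illustration. The trim() call is also an addition: each raw match keeps the whitespace character it stopped on, so an untrimmed match can be 51 characters long. One caveat: if a single word is longer than 50 characters, the pattern cannot match from the start of that word, so the leading part of the word is skipped silently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class; not part of the original answer.
public class LineSplitter {

    // Up to 50 characters, then one whitespace character or the end of the input.
    private static final Pattern CHUNK =
            Pattern.compile(".{1,50}(?:\\s|$)", Pattern.DOTALL);

    public static List<String> split(String text) {
        List<String> lines = new ArrayList<>();
        Matcher matcher = CHUNK.matcher(text);
        while (matcher.find()) {
            // The raw match includes the delimiter whitespace, so trim it off
            // (an addition to the original snippet).
            String line = matcher.group().trim();
            if (!line.isEmpty()) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        String msg = "The quick brown fox jumps over the lazy dog "
                + "and keeps running through the forest";
        for (String line : split(msg)) {
            System.out.println(line.length() + ": " + line);
        }
    }
}
```

For the sample sentence this prints two lines, the first 47 characters long and ending in "and", the second holding the remaining 32 characters.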
Tim Pietzcker answered Sep 20 '22 10:09



I believe a Groovier version of Tim's answer is:

List matchList = ( subjectString =~ /(?s)(.{1,50})(?:\s|$)/ ).collect { it[ 1 ] }
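
What the one-liner changes is the capture group: it[ 1 ] returns only the text inside the parentheses, so the whitespace character the match stopped on is left out of each line. For comparison, a rough Java rendering of the same idea might look like the sketch below. The class name GroupSplitter and the sample sentence are hypothetical, and this is only an illustration of the capture-group behaviour, not a drop-in replacement.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name, used only for this illustration.
public class GroupSplitter {

    // (?s) is the inline form of Pattern.DOTALL.
    // Group 1 is the line itself, without the delimiter that follows it.
    private static final Pattern CHUNK =
            Pattern.compile("(?s)(.{1,50})(?:\\s|$)");

    public static List<String> split(String text) {
        List<String> lines = new ArrayList<>();
        Matcher matcher = CHUNK.matcher(text);
        while (matcher.find()) {
            // group(1) excludes the trailing whitespace that group() would include.
            lines.add(matcher.group(1));
        }
        return lines;
    }

    public static void main(String[] args) {
        String msg = "Grails makes it easy to build web applications "
                + "on top of the Groovy language";
        for (String line : split(msg)) {
            System.out.println("[" + line + "]");
        }
    }
}
```

With single spaces between words, no trimming is needed. If the text contains runs of several whitespace characters, a line can still start or end with a space, so a trim() may still be wanted.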
tim_yates answered Sep 18 '22 10:09
