
How to replace multiple spaces and newlines with one blank line

How can I remove multiple spaces and newlines in a string, but preserve one blank line for each group of blank lines?

For example, change:

"This      is



a        string.




Something."

to

"This is

a string.

Something."

I'm using .trim() to strip whitespace from the beginning and end of a string, but I couldn't find anything for removing multiple spaces and newlines in a string.

I would like to keep just one space from each run of spaces and one blank line from each run of newlines.

user3051755 asked Jan 16 '14 15:01


3 Answers

A one-line solution that removes repeated spaces and newlines, but keeps one blank line for each run of blank lines:

str = str.replaceAll("(?m)(^ *| +(?= |$))", "").replaceAll("(?m)^$([\r\n]+?)(^$[\r\n]+?^)+", "$1");

Each individual line is trimmed too.


Here's some test code:

String str = "   This       is\r\n    " + 
        "\r\n" + 
        "   \r\n   " + 
        " \r    \n   \n  " +
        "\r\n" + 
        "                a        string.   ";
str = str.trim().replaceAll("(?m)(^ *| +(?= |$))", "").replaceAll("(?m)^$([\r\n]+?)(^$[\r\n]+?^)+", "$1");
System.out.println(str);

Output:

This is

a string.
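
To make the one-liner easier to follow, here is a sketch that splits it into its two regex steps and runs them on the input from the question. The class and method names (`OneLinerSteps`, `squeezeSpaces`, `squeezeBlankLines`) are made up for illustration, and the input is assumed to use `\n` line endings only.

```java
public class OneLinerSteps {

    // Step 1: in multiline mode, delete leading spaces on each line (^ *)
    // and any spaces that are followed by another space or by the end of
    // the line ( +(?= |$)). One space survives between words.
    static String squeezeSpaces(String s) {
        return s.replaceAll("(?m)(^ *| +(?= |$))", "");
    }

    // Step 2: where two or more empty lines follow each other, keep only
    // the first line terminator (group 1), so one empty line remains.
    static String squeezeBlankLines(String s) {
        return s.replaceAll("(?m)^$([\r\n]+?)(^$[\r\n]+?^)+", "$1");
    }

    public static void main(String[] args) {
        String input = "This      is\n\n\n\na        string.\n\n\n\n\nSomething.";
        String afterSpaces = squeezeSpaces(input);
        String result = squeezeBlankLines(afterSpaces);
        System.out.println(result);
    }
}
```

For this input the first step leaves the newlines untouched and the second step reduces each run of blank lines to one, giving "This is", a blank line, "a string.", a blank line, and "Something.".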
Bohemian answered Sep 29 '22 01:09


The previous advice trims all whitespace, including the linefeeds, and replaces each run with a single space. To keep the linefeeds:

text.replaceAll("\\n\\s*\\n", "\n").replaceAll("[ \\t\\x0B\\f]+", " ").trim();

First, it replaces any sequence of linefeeds that has only whitespace between them with a single linefeed. Then it collapses every remaining run of spaces, tabs, vertical tabs and form feeds into a single space, leaving the linefeeds alone.
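
Here is a runnable sketch of this two-step replacement. The replacement is written as `"\n"` (a real linefeed), because in a Java replacement string a backslash escapes the next character, so `"\\n"` would insert the letter `n`. The `keepBlankLine` flag and the names `CollapseWhitespace` and `collapse` are my own additions for illustration: passing `true` inserts two linefeeds instead of one, on the assumption that the question wants an empty line kept between paragraphs.

```java
public class CollapseWhitespace {

    // keepBlankLine = false: each group of blank lines becomes one linefeed.
    // keepBlankLine = true:  each group becomes two linefeeds (one empty line).
    static String collapse(String text, boolean keepBlankLine) {
        String lineBreak = keepBlankLine ? "\n\n" : "\n";
        return text
                // linefeeds separated only by whitespace -> lineBreak
                .replaceAll("\\n\\s*\\n", lineBreak)
                // runs of space, tab, vertical tab, form feed -> one space
                .replaceAll("[ \\t\\x0B\\f]+", " ")
                .trim();
    }

    public static void main(String[] args) {
        String input = "This      is\n\n\n\na        string.\n\n\n\n\nSomething.";
        System.out.println(collapse(input, false));
        System.out.println("---");
        System.out.println(collapse(input, true));
    }
}
```

With `false` the three sentences end up on consecutive lines; with `true` they are separated by one empty line each. Note that this approach does not trim individual lines, so a line that starts with spaces after a blank line keeps one leading space.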

Tim B answered Sep 29 '22 01:09


Here is what I came up with after a bit of testing...

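import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Collapses each run of whitespace: spaces and tabs become one space,
// any run that contains a line terminator becomes one blank line.
public String keepOneWS(String str) {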
    Pattern p = Pattern.compile("(\\s+)");
    Matcher m = p.matcher(str);

    Pattern pBlank = Pattern.compile("[ \t]+");
    String newLineReplacement = System.getProperty("line.separator") + 
            System.getProperty("line.separator");

    StringBuffer sb = new StringBuffer();
    while (m.find()) {
        if(pBlank.matcher(m.group(1)).matches()) {
            m.appendReplacement(sb, " ");   
        } else {
            m.appendReplacement(sb, newLineReplacement);
        }
    }
    m.appendTail(sb);

    return sb.toString().trim();
}

// Requires JUnit: import static org.junit.Assert.assertEquals;
public void testKeepOneWS()  {
    String str = "   This   \t    is\r\n    " + 
            "\r\n" + 
            "   \r\n   " + 
            " \r    \n  \t  \n  " +
            "\r\n" + 
            "                a   \t     string.   \t ";
    String expected = "This is" + System.getProperty("line.separator")+ 
            System.getProperty("line.separator") + "a string.";
    String actual = keepOneWS(str);
    System.out.println("'" + actual + "'");
    assertEquals(expected, actual);
}

After a group of whitespace is captured, it is checked whether it consists only of spaces and tabs. If so, the group is replaced by a single space. Otherwise the group contains at least one line terminator, and it is replaced by two line separators, which leaves exactly one blank line.

The output is:

'This is

a string.' 
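
One consequence of the method above is that a run containing a single line break also becomes a blank line. If that is not wanted, a possible variant is to count the line breaks in each run and only emit a blank line when there are two or more. This is a sketch, not the code from this answer: the names `KeepOneWhitespace`, `normalize` and `countLineBreaks` are hypothetical, it emits `\n` rather than the platform line separator, and it relies on the `\R` linebreak matcher available since Java 8.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KeepOneWhitespace {

    private static final Pattern WHITESPACE_RUN = Pattern.compile("\\s+");
    // \R matches any linebreak sequence and treats \r\n as a single break.
    private static final Pattern LINE_BREAK = Pattern.compile("\\R");

    static String normalize(String str) {
        Matcher m = WHITESPACE_RUN.matcher(str);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            int breaks = countLineBreaks(m.group());
            String replacement;
            if (breaks == 0) {
                replacement = " ";      // only spaces/tabs: one space
            } else if (breaks == 1) {
                replacement = "\n";     // ordinary line break: keep it single
            } else {
                replacement = "\n\n";   // blank line(s): keep exactly one
            }
            m.appendReplacement(sb, replacement);
        }
        m.appendTail(sb);
        return sb.toString().trim();
    }

    static int countLineBreaks(String run) {
        Matcher m = LINE_BREAK.matcher(run);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("'" + normalize("  This   \t is\r\n \r\n  a   string.  ") + "'");
        System.out.println("'" + normalize("one\n  two") + "'");
    }
}
```

The first call keeps one empty line between "This is" and "a string."; the second keeps "one" and "two" on consecutive lines without inserting a blank line.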
A4L answered Sep 29 '22 00:09