Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Best way to split string into lines with maximum length, without breaking words

I want to break a string up into lines of a specified maximum length, without splitting any words, if possible (if there is a word that exceeds the maximum line length, then it will have to be split).

As always, I am acutely aware that strings are immutable and that one should preferably use the StringBuilder class. I have seen examples where the string is split into words and the lines are then built up using the StringBuilder class, but the code below seems "neater" to me.

I mentioned "best" in the description and not "most efficient" as I am also interested in the "eloquence" of the code. The strings will never be huge, generally splitting into 2 or three lines, and it won't be happening for thousands of lines.

Is the following code really bad?

private static IEnumerable<string> SplitToLines(string stringToSplit, int maximumLineLength)
{
    stringToSplit = stringToSplit.Trim();
    var lines = new List<string>();

    while (stringToSplit.Length > 0)
    {
        if (stringToSplit.Length <= maximumLineLength)
        {
            lines.Add(stringToSplit);
            break;
        }

        var indexOfLastSpaceInLine = stringToSplit.Substring(0, maximumLineLength).LastIndexOf(' ');
        lines.Add(stringToSplit.Substring(0, indexOfLastSpaceInLine >= 0 ? indexOfLastSpaceInLine : maximumLineLength).Trim());
        stringToSplit = stringToSplit.Substring(indexOfLastSpaceInLine >= 0 ? indexOfLastSpaceInLine + 1 : maximumLineLength);
    }

    return lines.ToArray();
}
like image 646
Mark Farr Avatar asked Mar 13 '14 03:03

Mark Farr


People also ask

How do you split a long string?

Use triple quotes to create a multiline string It is the simplest method to let a long string split into different lines. You will need to enclose it with a pair of Triple quotes, one at the start and second in the end. Anything inside the enclosing Triple quotes will become part of one multiline string.

How do you split a string by line?

Split String at Newline Then use splitlines to split the string at the newline character. Create a string in which two lines of text are separated by \n . You can use + to concatenate text onto the end of a string. Convert \n into an actual newline character.

How do you split a string into characters?

(which means "any character" in regex), use either backslash \ to escape the individual special character like so split("\\.") , or use character class [] to represent literal character(s) like so split("[.]") , or use Pattern#quote() to escape the entire string like so split(Pattern.


2 Answers

Even when this post is 3 years old I wanted to give a better solution using Regex to accomplish the same:

If you want the string to be splitted and then use the text to be displayed you can use this:

public string SplitToLines(string stringToSplit, int maximumLineLength)
{
    return Regex.Replace(stringToSplit, @"(.{1," + maximumLineLength +@"})(?:\s|$)", "$1\n");
}

If on the other hand you need a collection you can use this:

public MatchCollection SplitToLines(string stringToSplit, int maximumLineLength)
{
    return Regex.Matches(stringToSplit, @"(.{1," + maximumLineLength +@"})(?:\s|$)");
}

NOTES

Remember to import regex (using System.Text.RegularExpressions;)

You can use string interpolation on the match:
$@"(.{{1,{maximumLineLength}}})(?:\s|$)"

The MatchCollection works almost like an Array

Matching example with explanation here

like image 197
3vts Avatar answered Sep 18 '22 22:09

3vts


How about this as a solution:

IEnumerable<string> SplitToLines(string stringToSplit, int maximumLineLength)
{
    var words = stringToSplit.Split(' ').Concat(new [] { "" });
    return
        words
            .Skip(1)
            .Aggregate(
                words.Take(1).ToList(),
                (a, w) =>
                {
                    var last = a.Last();
                    while (last.Length > maximumLineLength)
                    {
                        a[a.Count() - 1] = last.Substring(0, maximumLineLength);
                        last = last.Substring(maximumLineLength);
                        a.Add(last);
                    }
                    var test = last + " " + w;
                    if (test.Length > maximumLineLength)
                    {
                        a.Add(w);
                    }
                    else
                    {
                        a[a.Count() - 1] = test;
                    }
                    return a;
                });
}

I reworked this as prefer this:

IEnumerable<string> SplitToLines(string stringToSplit, int maximumLineLength)
{
    var words = stringToSplit.Split(' ');
    var line = words.First();
    foreach (var word in words.Skip(1))
    {
        var test = $"{line} {word}";
        if (test.Length > maximumLineLength)
        {
            yield return line;
            line = word;
        }
        else
        {
            line = test;
        }
    }
    yield return line;
}
like image 40
Enigmativity Avatar answered Sep 17 '22 22:09

Enigmativity