Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Best way to split string into lines

People also ask

How do you split a string into a line in Python?

To split the line in Python, use the String split() method. The split() is an inbuilt method that returns a list of lines after breaking the given string by the specified separator. In this tutorial, the line is equal to the string because there is no concept of a line in Python. So you can think of a line as a string.

How do you split a line break?

To split a string by newline, call the split() method passing it the following regular expression as parameter - /\r?\ n/ . The split method will split the string on each occurrence of a newline character and return an array containing the substrings. Copied!


  • If it looks ugly, just remove the unnecessary ToCharArray call.

  • If you want to split by either \n or \r, you've got two options:

    • Use an array literal – but this will give you empty lines for Windows-style line endings \r\n:

      var result = text.Split(new [] { '\r', '\n' });
      
    • Use a regular expression, as indicated by Bart:

      var result = Regex.Split(text, "\r\n|\r|\n");
      
  • If you want to preserve empty lines, why do you explicitly tell C# to throw them away? (StringSplitOptions parameter) – use StringSplitOptions.None instead.


using (StringReader sr = new StringReader(text)) {
    string line;
    while ((line = sr.ReadLine()) != null) {
        // do something
    }
}

Update: See here for an alternative/async solution.


This works great and is faster than Regex:

input.Split(new[] {"\r\n", "\r", "\n"}, StringSplitOptions.None)

It is important to have "\r\n" first in the array so that it's taken as one line break. The above gives the same results as either of these Regex solutions:

Regex.Split(input, "\r\n|\r|\n")

Regex.Split(input, "\r?\n|\r")

Except that Regex turns out to be about 10 times slower. Here's my test:

Action<Action> measure = (Action func) => {
    var start = DateTime.Now;
    for (int i = 0; i < 100000; i++) {
        func();
    }
    var duration = DateTime.Now - start;
    Console.WriteLine(duration);
};

var input = "";
for (int i = 0; i < 100; i++)
{
    input += "1 \r2\r\n3\n4\n\r5 \r\n\r\n 6\r7\r 8\r\n";
}

measure(() =>
    input.Split(new[] {"\r\n", "\r", "\n"}, StringSplitOptions.None)
);

measure(() =>
    Regex.Split(input, "\r\n|\r|\n")
);

measure(() =>
    Regex.Split(input, "\r?\n|\r")
);

Output:

00:00:03.8527616

00:00:31.8017726

00:00:32.5557128

and here's the Extension Method:

public static class StringExtensionMethods
{
    public static IEnumerable<string> GetLines(this string str, bool removeEmptyLines = false)
    {
        return str.Split(new[] { "\r\n", "\r", "\n" },
            removeEmptyLines ? StringSplitOptions.RemoveEmptyEntries : StringSplitOptions.None);
    }
}

Usage:

input.GetLines()      // keeps empty lines

input.GetLines(true)  // removes empty lines

You could use Regex.Split:

string[] tokens = Regex.Split(input, @"\r?\n|\r");

Edit: added |\r to account for (older) Mac line terminators.


If you want to keep empty lines just remove the StringSplitOptions.

var result = input.Split(System.Environment.NewLine.ToCharArray());