 

C# split string but keep separators

Tags: string, c#

There already exist similar questions, but all of them use regexen. The code I'm using (that strips the separators):

string[] sentences = s.Split(new string[] { ". ", "? ", "! ", "... " }, StringSplitOptions.None);

I would like to split a block of text on sentence breaks and keep the sentence terminators. I'd like to avoid using regexen for performance. Is it possible?

Isaac G. asked Nov 20 '25 22:11


1 Answer

I don't believe there is an existing function that does this. However, you can use the following extension method.

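using System.Collections.Generic;
using System.Linq;
using System.Text;
public static class StringExtensions {
  public static IEnumerable<string> SplitAndKeepSeparators(this string source, string[] separators) {
    var builder = new StringBuilder();
    foreach (var cur in source) {
      builder.Append(cur);
      // Flush the piece once the buffer ends with any of the separators.
      if (separators.Any(sep => EndsWith(builder, sep))) {
        yield return builder.ToString();
        builder.Length = 0;
      }
    }
    if (builder.Length > 0) {
      yield return builder.ToString();
    }
  }
  // True when the buffer's last characters equal sep (no per-check string allocation).
  private static bool EndsWith(StringBuilder builder, string sep) {
    if (sep.Length == 0 || sep.Length > builder.Length) return false;
    for (int i = 0; i < sep.Length; i++)
      if (builder[builder.Length - sep.Length + i] != sep[i]) return false;
    return true;
  }
}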
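
For comparison, here is a rough sketch of a different regex-free approach: rather than walking the text one character at a time, it asks `String.IndexOf` for the next occurrence of each separator and cuts just after whichever starts earliest. The class name `SentenceSplitDemo` and the method name `SplitKeepingTerminators` are made up for this illustration, and it assumes ordinal (culture-insensitive) matching is what you want.

```csharp
using System;
using System.Collections.Generic;

public static class SentenceSplitDemo
{
    // Hypothetical helper: returns each piece with its trailing separator attached.
    public static List<string> SplitKeepingTerminators(string text, string[] separators)
    {
        var pieces = new List<string>();
        int start = 0;
        while (start < text.Length)
        {
            // Find the separator occurrence that starts earliest at or after 'start'.
            int bestIndex = -1, bestLength = 0;
            foreach (string sep in separators)
            {
                if (sep.Length == 0) continue; // an empty separator would never advance
                int i = text.IndexOf(sep, start, StringComparison.Ordinal);
                if (i < 0) continue;
                // Earliest wins; on a tie, prefer the longer separator.
                if (bestIndex < 0 || i < bestIndex || (i == bestIndex && sep.Length > bestLength))
                {
                    bestIndex = i;
                    bestLength = sep.Length;
                }
            }
            if (bestIndex < 0) break; // no separator left: the rest is the last piece

            int end = bestIndex + bestLength;
            pieces.Add(text.Substring(start, end - start));
            start = end;
        }
        // Keep any trailing text that does not end with a separator.
        if (start < text.Length) pieces.Add(text.Substring(start));
        return pieces;
    }
}
```

Called with the separators from the question on the text "Is it possible? Yes! Wait... Done.", this should give four pieces: "Is it possible? ", "Yes! ", "Wait... " and "Done.". The trailing spaces are kept because they are part of the separators. Note that this sketch re-searches for every separator after each cut, so for very long inputs with rarely occurring separators it may do more work than a single pass; measure before assuming it is faster.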
JaredPar answered Nov 23 '25 10:11


