
Split a string based on the last N delimiters

Tags:

c#

I need help developing logic to split a string, but only on the last 2 delimiters in the string.

Example inputs:

string s1 = @"Dog \ Cat \ Bird \ Cow";

string s2 = @"Hello \ World \ How \ Are \ You";

string s3 = @"I \ am \ Peter";

Expected Outputs:

string[] newS1 = { "Dog Cat", "Bird", "Cow" };
string[] newS2 = { "Hello World How", "Are", "You" };
string[] newS3 = { "I", "am", "Peter" };

So, as you can see, I only want to split the string on the last 2 "\" delimiters; everything before them should be concatenated into one string.

I tried the .Split method, but it splits on every "\" in the string.

Edit: If the string has fewer than 2 "\" delimiters, it should just be split on whatever delimiters it has.

Update: Wow, these are some interesting solutions! Thanks a lot!

C.J. asked Jan 23 '26 22:01


2 Answers

Try this:

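// Requires: using System; using System.Linq;
var parts = s1.Split(new[] { " \\ " }, StringSplitOptions.None);
// With fewer than two delimiters there is nothing to merge, so keep the plain split.
var result = parts.Length <= 2
    ? parts
    : new[] { string.Join(" ", parts.Take(parts.Length - 2)) }.Concat(parts.Skip(parts.Length - 2)).ToArray();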
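
Since the title asks about the last N delimiters, the same split-and-join idea can be wrapped in a reusable method. This is only a sketch: the class name LastDelimiterSplitter, the method SplitOnLast, and the joiner parameter are hypothetical names introduced for illustration, and it assumes the delimiter is the exact string " \ " (with the surrounding spaces). When the input has n or fewer delimiters, it falls back to a plain split, as the question's edit requests.

```csharp
using System;
using System.Linq;

public static class LastDelimiterSplitter
{
    // Hypothetical helper: splits only on the last n occurrences of delimiter
    // and joins everything before them with joiner.
    public static string[] SplitOnLast(string input, string delimiter, int n, string joiner = " ")
    {
        if (input == null) throw new ArgumentNullException(nameof(input));
        if (string.IsNullOrEmpty(delimiter)) throw new ArgumentException("Delimiter must not be empty.", nameof(delimiter));
        if (n < 0) throw new ArgumentOutOfRangeException(nameof(n));

        string[] parts = input.Split(new[] { delimiter }, StringSplitOptions.None);

        // n delimiters separate n + 1 parts; with that many or fewer, nothing needs merging.
        if (parts.Length <= n + 1)
        {
            return parts;
        }

        // Merge the leading parts into one element, then append the last n parts unchanged.
        int headCount = parts.Length - n;
        string head = string.Join(joiner, parts.Take(headCount));
        return new[] { head }.Concat(parts.Skip(headCount)).ToArray();
    }
}
```

Calling LastDelimiterSplitter.SplitOnLast(s1, " \\ ", 2) on the first example input should give "Dog Cat", "Bird", "Cow".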
John Gibb answered Jan 25 '26 13:01


Offering a regex solution:

var output = Regex.Split(input, @"\s*\\\s*([^\\]*?)\s*\\\s*(?=[^\\]*$)");

This pattern matches the second-to-last element together with the delimiters on either side of it, so the split happens around that whole match. Because the element is captured in a group, Regex.Split includes it in the output array.

For input "Dog \ Cat \ Bird \ Cow", this will produce { "Dog \ Cat", "Bird", "Cow" }. If you also need to strip the \ out of the first element, that can be done with a simple replace:

output[0] = output[0].Replace(" \\", "");
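
Putting the split and the replace together, a minimal sketch might look like the following. The class name RegexLastTwoSplitter and its Split method are hypothetical names used only for illustration. The sketch assumes the input uses " \ " delimiters as in the question, and it only strips backslashes from the first element when the pattern actually matched, since this basic pattern needs at least two delimiters to split anything.

```csharp
using System.Text.RegularExpressions;

public static class RegexLastTwoSplitter
{
    // Matches "\ <second-to-last element> \" near the end of the string and
    // captures the element so that Regex.Split keeps it in the result.
    private static readonly Regex LastTwo =
        new Regex(@"\s*\\\s*([^\\]*?)\s*\\\s*(?=[^\\]*$)");

    public static string[] Split(string input)
    {
        string[] output = LastTwo.Split(input);

        // Only clean up the first element when a split happened; otherwise
        // the input is returned as a single unchanged element.
        if (output.Length > 1)
        {
            output[0] = output[0].Replace(" \\", "");
        }

        return output;
    }
}
```

For the question's three example inputs, RegexLastTwoSplitter.Split should return the expected outputs listed in the question.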

Update: This version will correctly handle strings with only one delimiter:

var output = Regex.Split(str, @"\s*\\\s*([^\\]*?)\s*\\\s*(?=[^\\]*$)|(?<=^[^\\\s]*)\s*\\\s*(?=[^\\\s]*$)");

Update: And to match other delimiters like whitespace, "~", and "%", you can use a character class:

var output = Regex.Split(str, @"(?:[%~\s\\]+([^%~\s\\]+?)[%~\s\\]+|(?<=^[^%~\s\\]+)[%~\s\\]+)(?=[^%~\s\\]+$)");

The structure of this regex is slightly simpler than the previous one, since it treats any sequence of one or more characters in the class [%~\s\\] as a delimiter and any sequence of one or more characters in the negated class [^%~\s\\] as a segment. Note that \s matches any whitespace character.
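
As a rough illustration of the character-class version, here is a small sketch that applies it to an input mixing several delimiters. The class name MixedDelimiterSplitter, the method SplitLastTwo, and the sample input "Dog~Cat%Bird Cow" are made up for this example. Note that the first element keeps whatever delimiters it originally contained.

```csharp
using System.Text.RegularExpressions;

public static class MixedDelimiterSplitter
{
    // Character-class pattern from the answer: any run of '%', '~', whitespace,
    // or '\' counts as one delimiter, and any run of other characters is a segment.
    private static readonly Regex LastTwo = new Regex(
        @"(?:[%~\s\\]+([^%~\s\\]+?)[%~\s\\]+|(?<=^[^%~\s\\]+)[%~\s\\]+)(?=[^%~\s\\]+$)");

    // Splits around the second-to-last segment; the captured segment is kept
    // in the result because Regex.Split includes captured groups.
    public static string[] SplitLastTwo(string input)
    {
        return LastTwo.Split(input);
    }
}
```

For example, MixedDelimiterSplitter.SplitLastTwo("Dog~Cat%Bird Cow") should yield "Dog~Cat", "Bird", "Cow".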

You might also be able to simplify this further using:

var output = Regex.Split(str, @"(?:\W+(\w+)\W+|(?<=^\w+)\W+)(?=\w+$)");

Where \w matches any 'word' character (letters, digits, or underscores) and \W matches any 'non-word' character.

p.s.w.g answered Jan 25 '26 13:01


