I need help developing logic to split a string, but only on its last two delimiters.
Example inputs:
string s1 = @"Dog \ Cat \ Bird \ Cow";
string s2 = @"Hello \ World \ How \ Are \ You";
string s3 = @"I \ am \ Peter";
Expected Outputs:
string[] newS1 = { "Dog Cat", "Bird", "Cow" };
string[] newS2 = { "Hello World How", "Are", "You" };
string[] newS3 = { "I", "am", "Peter" };
So, as you can see, I only want to split the string on the last two "\" delimiters; everything before them should be concatenated into one string.
I tried the .Split method, but it splits on every "\" in the string.
Edit: If the string has fewer than two "\" delimiters, it should simply be split on whatever delimiters it has.
Updates: Wow, these are a bunch of interesting solutions! Thank you a lot!
Try this:
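var parts = s1.Split(new[] { " \\ " }, StringSplitOptions.None);
var headCount = parts.Length - 2;
// With fewer than three parts there is nothing to merge, so keep the plain split.
var result = headCount > 0
    ? new[] { string.Join(" ", parts.Take(headCount)) }.Concat(parts.Skip(headCount)).ToArray()
    : parts;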
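The approach above can be wrapped in a reusable method. This is only a minimal sketch: the names TailSplitter and SplitKeepingTail and the delimiter and tailCount parameters are hypothetical and do not appear in the original answer. It returns the plain split when there are too few parts to merge.

```csharp
using System;
using System.Linq;

public static class TailSplitter
{
    // Hypothetical helper: splits on the delimiter, keeps the last `tailCount`
    // parts separate, and joins everything before them with a single space.
    public static string[] SplitKeepingTail(string input, string delimiter = " \\ ", int tailCount = 2)
    {
        var parts = input.Split(new[] { delimiter }, StringSplitOptions.None);
        var headCount = parts.Length - tailCount;

        // Too few parts to merge anything: return the plain split unchanged.
        if (headCount <= 0)
            return parts;

        var head = string.Join(" ", parts.Take(headCount));
        return new[] { head }.Concat(parts.Skip(headCount)).ToArray();
    }
}
```

Calling TailSplitter.SplitKeepingTail(s1) with the question's s1 should give "Dog Cat", "Bird", "Cow", and a string with a single delimiter such as "I \ am" should come back as just "I", "am".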
Offering a regex solution:
var output = Regex.Split(input, @"\s*\\\s*([^\\]*?)\s*\\\s*(?=[^\\]*$)");
The pattern matches the second-to-last element together with the delimiters on either side of it, and captures the element in a group so that Regex.Split includes it in the output array.
For input "Dog \ Cat \ Bird \ Cow", this will produce { "Dog \ Cat", "Bird", "Cow" }. If you also need to strip the \ out of the first element, that can be done with a simple replace:
output[0] = output[0].Replace(" \\", "");
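Putting the split and the replace together, a minimal sketch might look like the following. The class and method names (RegexTailSplit, SplitLastTwo) are hypothetical; the pattern and the Replace call are the ones shown above. This sketch assumes the input contains at least two "\" delimiters.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexTailSplit
{
    // Matches the second-to-last element plus the delimiters around it;
    // the lookahead ensures no further backslash follows the match.
    private static readonly Regex LastTwo =
        new Regex(@"\s*\\\s*([^\\]*?)\s*\\\s*(?=[^\\]*$)");

    // Hypothetical wrapper: the captured element is included in the result
    // because Regex.Split emits the text of capturing groups.
    public static string[] SplitLastTwo(string input)
    {
        var output = LastTwo.Split(input);

        // Strip the delimiters that remain inside the first element.
        output[0] = output[0].Replace(" \\", "");
        return output;
    }
}
```

For the question's s2, RegexTailSplit.SplitLastTwo(s2) should yield "Hello World How", "Are", "You".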
Update: This version will correctly handle strings with only one delimiter:
var output = Regex.Split(str, @"\s*\\\s*([^\\]*?)\s*\\\s*(?=[^\\]*$)|(?<=^[^\\\s]*)\s*\\\s*(?=[^\\\s]*$)");
Update: And to match other delimiters like whitespace, "~", and "%", you can use a character class:
var output = Regex.Split(str, @"(?:[%~\s\\]+([^%~\s\\]+?)[%~\s\\]+|(?<=^[^%~\s\\]+)[%~\s\\]+)(?=[^%~\s\\]+$)");
The structure of this regex is slightly simpler than the previous one, since it treats any sequence of one or more characters in the class [%~\s\\] as a delimiter and any sequence of one or more characters in the negated class [^%~\s\\] as a segment. Note that \s matches any whitespace character.
You might also be able to simplify this further using:
var output = Regex.Split(str, @"(?:\W+(\w+)\W+|(?<=^\w+)\W+)(?=\w+$)");
Here, \w matches any 'word' character (letters, digits, or underscores) and \W matches any 'non-word' character.