Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there an elegant LINQ solution for SomeButNotAll()?

Here is what I'm trying to do overall. Just to be clear, this isn't homework or for a contest or anything. Hopefully, I've made the wording clear enough:

Problem

Given a set of strings in the same format, but where some end in a lowercase letter and some do not, return a set of one of each string that does not end in a lowercase letter, but that has at least one identical string ending in a lowercase letter.

Example

To keep it simple, let's say the string format is \d+[a-z]?, where the common part is the number. Given {1, 4, 3a, 1b, 3, 6c}, I should receive a permutation of {1, 3} because 1 and 3 have both an element with and without a lowercase letter at the end.

Alternate Solution

You can view this solution here.

One way I thought of doing this was to partition the set into elements with and without a lowercase letter suffix ({1, 4, 3} and {3a, 1b, 6c}), and then return withoutSuffix.Where(x => withSuffix.Any(y => y.StartsWith(x))).

I have two problems with this:

  1. I don't see a good way to partition into the two sets, with the predicate Regex.IsMatch(input, "[a-z]$"). The two I thought of were two similarly-defined variables each using a Where clause and doing the regex matching twice per element, or transforming the set to store the regex match results and then forming the two variables from that. group...by doesn't seem to play well when you need to access both sets like this, but I could be wrong there.

  2. Although the size is small enough not to care about performance, going through withSuffix once per withoutSuffix element seems inelegant.

Relevant Solution

You can view this solution here.

The other way that came to mind was to grab the common prefix and the optional suffix: {1 => {"", b}, 3 => {a, ""}, 4 => {""}, 6 => {c}}. This is easily accomplished by capturing the prefix and the suffix with a regex ((\d+)([a-z])?) and grouping the suffix by the prefix into grouped.

From here, it would be great to do:

where grouped.SomeButNotAll(x => x == string.Empty)
select grouped.Key

Or even:

where grouped.ContainsSomeButNotAll(string.Empty)
select grouped.Key

I could certainly create either of these, but unfortunately, the best I can see in LINQ is:

where grouped.Contains(string.Empty) && grouped.Any(x => x != string.Empty)
select grouped.Key

It just feels super redundant. Is there anything better than this in LINQ already?

P.S. I'm open to better approaches to solving the overall problem instead of making this an XY problem. Elegance is desired much more than performance, but (maybe it's just me) being plain wasteful still seems inelegant.

like image 359
chris Avatar asked Apr 24 '15 20:04

chris


1 Answers

I don't think you really need regexen for this. Here's how I would do it:

var withEndings = new HashSet<string>();
var withoutEndings = new HashSet<string>();

foreach (var s in input)
    if(char.IsLower(s[s.Length - 1])) 
        withEndings.Add(s.Substring(0, s.Length - 1));
    else
        withoutEndings.Add(s);

var result = withEndings.Intersect(withoutEndings);
like image 167
Asad Saeeduddin Avatar answered Sep 19 '22 14:09

Asad Saeeduddin