
Trying to merge two string arrays based on comparison

Tags: c#, linq

Below is my class :

public class Regions
    {
        public int Id { get; set; }
        public string[] ParentName { get; set; }
    }

Now I have two instances of the above Regions class, containing some data:

var region1 = new Regions();
var region2 = new Regions();

ParentName contains the following data for region1:

[0] : Abc.mp3,Pqr.mp3
[1] : Xxx.mp3
[2] : kkk.mp3
[3] : ppp.mp3,zzz.mp3,rrr.mp3,ddd.mp3

ParentName contains the following data for region2:

[0] : Abc.mp3,Pqr.mp3,lmn.mp3
[1] : rrr.mp3,ggg.mp3,yyy.mp3

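I am trying to merge ParentName of region2 into region1. After splitting each entry by comma, if any part of a region1 entry matches a part of a region2 entry, the non-matching parts of that region2 entry should be appended to the region1 entry, like below: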

[0] : Abc.mp3,Pqr.mp3,lmn.mp3
[1] : Xxx.mp3
[2] : kkk.mp3
[3] : ppp.mp3,zzz.mp3,rrr.mp3,ddd.mp3,ggg.mp3,yyy.mp3

In the expected output above, Abc.mp3 and Pqr.mp3 match in both region1 and region2. Only lmn.mp3 does not match, so it is appended at the end of that region1 entry.

For the last entries of region1 and region2, rrr.mp3 matches (a single match is enough), so the non-matching names from region2, i.e. ggg.mp3 and yyy.mp3, are appended at the end of the region1 entry.

Output I am getting in Region1:

[0] : Abc.mp3,Pqr.mp3
[1] : Xxx.mp3
[2] : kkk.mp3
[3] : ppp.mp3,zzz.mp3,rrr.mp3,ddd.mp3
[4] : Abc.mp3,Pqr.mp3,lmn.mp3
[5] : rrr.mp3,ggg.mp3,yyy.mp3

Code :

region1.ParentName = region1.ParentName.Concat(region2.ParentName).Distinct().ToArray();


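// Extension method: returns a new array holding the elements of x followed by those of y.
public static T[] Concat<T>(this T[] x, T[] y)
{
    if (x == null) throw new ArgumentNullException(nameof(x));
    if (y == null) throw new ArgumentNullException(nameof(y));
    int oldLen = x.Length;
    // Array.Resize allocates a new array and reassigns the local x; the caller's array is untouched.
    Array.Resize<T>(ref x, x.Length + y.Length);
    Array.Copy(y, 0, x, oldLen, y.Length);
    return x;
}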
Pratik Soni asked May 28 '26 16:05


2 Answers

It's unclear if your names contain duplicates and how they should be handled, but here is the LINQ solution which produces the desired result with the specified inputs:

var e2Sets = region2.ParentName.Select(e2 => e2.Split(',')).ToList();
var result =
    from e1 in region1.ParentName
    let e1Set = e1.Split(',')
    let e2AppendSet = (
       from e2Set in e2Sets
       where e1Set.Intersect(e2Set).Any()
       from e2New in e2Set.Except(e1Set)
       select e2New
    ).Distinct()
    select string.Join(",", e1Set.Concat(e2AppendSet));

result.ToArray() will give you the desired new region1.ParentName.

How it works:

Since we basically need the Cartesian product of the two input sequences, we start by preparing a list of the split-string arrays of the second sequence, in order to avoid calling string.Split repeatedly inside the inner loop.

Then, for each element of the first sequence, we split it into an array of strings. For each split array in the second sequence which has a match (determined with the Intersect method), we select the unmatched strings using the Except method. Then we flatten all the unmatched strings, apply Distinct to remove potential duplicates, concatenate the two sets and use string.Join to produce the new comma-delimited string.
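
The same steps can also be written with explicit loops, which may be easier to step through in a debugger. This is only a sketch under a few assumptions: the class name RegionMergeSketch and the method name MergeAllMatches are made up for illustration, and names are compared case-sensitively, as the default comparer in the query does.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class RegionMergeSketch
{
    // Hypothetical helper: loop-based equivalent of the LINQ query above.
    public static string[] MergeAllMatches(string[] first, string[] second)
    {
        // Split each entry of the second array once, up front.
        List<string[]> secondSets = second.Select(s => s.Split(',')).ToList();
        var merged = new List<string>();

        foreach (string entry in first)
        {
            string[] firstSet = entry.Split(',');
            var toAppend = new List<string>();

            foreach (string[] secondSet in secondSets)
            {
                // A single shared name is enough to count as a match.
                if (!firstSet.Intersect(secondSet).Any())
                    continue;

                // Collect the names not already in firstSet, skipping repeats
                // (this plays the role of Distinct in the query).
                foreach (string name in secondSet.Except(firstSet))
                {
                    if (!toAppend.Contains(name))
                        toAppend.Add(name);
                }
            }

            merged.Add(string.Join(",", firstSet.Concat(toAppend)));
        }

        return merged.ToArray();
    }
}
```

Calling RegionMergeSketch.MergeAllMatches(region1.ParentName, region2.ParentName) with the data from the question should give the same four entries as the expected output, with lmn.mp3 appended to the first entry and ggg.mp3,yyy.mp3 appended to the last.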

Ivan Stoev answered May 30 '26 05:05


You could do the following:

public static void Merge(Regions first, Regions second)
{
    if (ReferenceEquals(first, null))
        throw new ArgumentNullException(nameof(first));

    if (ReferenceEquals(second, null))
        throw new ArgumentNullException(nameof(second));

    first.ParentName = first.ParentName.Merge(second.ParentName).ToArray();
}

private static IEnumerable<string> Merge(this IEnumerable<string> first, IEnumerable<string> second)
{
    if (ReferenceEquals(first, null))
        throw new ArgumentNullException(nameof(first));

    if (ReferenceEquals(second, null))
        throw new ArgumentNullException(nameof(second));

    foreach (var f in first)
    {
        yield return f.Merge(second, ',');
    }
}

private static string Merge(this string first, IEnumerable<string> second, char separator)
{
    Debug.Assert(first != null);
    Debug.Assert(second != null);

    var firstSplitted = first.Split(separator);

    foreach (var s in second)
    {
        var sSplitted = s.Split(separator);

        if (firstSplitted.Intersect(sSplitted).Any())
            return string.Join(separator.ToString(), firstSplitted.Union(sSplitted));
    }

    return first;
}

Note that this will merge on the first match it finds; if duplicate values exist, it will only merge the first time the match is encountered.
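
If more than one entry of the second array can match the same entry of the first, and you want all of them merged, the string-level Merge could be adapted roughly as follows. This is a sketch, not part of the code above: MergeAllSketch and MergeAll are hypothetical names, and it assumes that matching should be done against the original names only.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class MergeAllSketch
{
    // Hypothetical variant of the string-level Merge: instead of returning
    // on the first match, it folds in every matching entry of 'second'.
    public static string MergeAll(string first, IEnumerable<string> second, char separator)
    {
        string[] original = first.Split(separator);
        IEnumerable<string> merged = original;

        foreach (string s in second)
        {
            string[] parts = s.Split(separator);

            // Match against the original names only, so a name appended from
            // one entry does not cause a later entry to match.
            if (original.Intersect(parts).Any())
                merged = merged.Union(parts);
        }

        return string.Join(separator.ToString(), merged);
    }
}
```

For example, with first = "a.mp3,b.mp3" and second = { "a.mp3,x.mp3", "b.mp3,y.mp3" }, the first-match version stops after the first entry and returns "a.mp3,b.mp3,x.mp3", while this variant returns "a.mp3,b.mp3,x.mp3,y.mp3".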

The secret here is divide and conquer. If you are having trouble implementing a certain piece of logic, break it down into simpler steps and implement a method for each baby step. Once it's working, if you really need to, you can refactor your code to make it more concise or performant.

If you run this:

var first = new Regions();
var second = new Regions();
first.ParentName = new[] { "Abc.mp3,Pqr.mp3", "Xxx.mp3", "kkk.mp3", "ppp.mp3,zzz.mp3,rrr.mp3,ddd.mp3" };
second.ParentName = new[] { "Abc.mp3,Pqr.mp3,lmn.mp3", "rrr.mp3,ggg.mp3,yyy.mp3" };
Merge(first, second);

You will get the expected result. first.ParentName will be:

[0]: "Abc.mp3,Pqr.mp3,lmn.mp3"
[1]: "Xxx.mp3"
[2]: "kkk.mp3"
[3]: "ppp.mp3,zzz.mp3,rrr.mp3,ddd.mp3,ggg.mp3,yyy.mp3"
InBetween answered May 30 '26 05:05


