Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

LINQ Query to identify fragments in a series

Tags:

c#

linq

Suppose I have the following array (my sequences are all sorted in ascending order, and contain positive integers)

var mySequence = new [] {1, 2, 3, 7, 8, 9, 15, 16, 17};

I want to write a linq query to select the continuous numbers in a series treated as a group. So, in above example I would get { [1, 2, 3], [7, 8, 9], [15, 16, 17] }.

I could write a foreach() sequence, run through each element and see where the sequence is taking a hop and yield a group there. But is there any LINQ-only way of doing it? I might be able to move my foreach() code into a new extension method so my code still looks LINQy, but I am wondering if there is anything available in System.Linq already for that.

EDIT: I had created my own extension (as follows), but Me.Name came up with something very smart in his Answer.

internal class Sequence
{
    public int Start { get; set; }
    public int End { get; set; }
}

internal static class EnumerableMixins
{
    public static IEnumerable<Sequence> GroupFragments(this IEnumerable<int> sequence)
    {
        if (sequence.Any())
        {
            var lastNumber = sequence.First();
            var firstNumber = lastNumber;

            foreach(var number in sequence.Skip(1))
            {
                if (Math.Abs(number - lastNumber) > 1)
                {
                    yield return new Sequence() { Start = firstNumber, End = lastNumber };

                    firstNumber = lastNumber = number;
                }
                else
                {
                    lastNumber = number;
                }
            }


            yield return new Sequence() { Start = firstNumber, End = lastNumber };
        }
    }
}
like image 658
fahadash Avatar asked Feb 22 '16 21:02

fahadash


People also ask

Can we use LINQ to query against a DataTable?

Can we use linq to query against a DataTable? Explanation: We cannot use query against the DataTable's Rows collection, since DataRowCollection doesn't implement IEnumerable<T>. We need to use the AsEnumerable() extension for DataTable.

What is AsEnumerable in C#?

AsEnumerable() in C# It is an extension method. The following is our array − int[] arr = new int[5]; arr[0] = 10; arr[1] = 20; arr[2] = 30; arr[3] = 40; arr[4] = 50; Now, get the IEnumerable equivalent.

Is LINQ multithreaded?

By default, only one thread is used to execute a LINQ query. Parallel LINQ (PLINQ) is an easy way to enable multiple threads to execute a query. To see it in action, we will start with some code that only uses a single thread to double 200 million integers.

How can you load data into a DataSet so that it can be queried using LINQ?

Data sources that implement the IEnumerable<T> generic interface can be queried through LINQ. Calling AsEnumerable on a DataTable returns an object which implements the generic IEnumerable<T> interface, which serves as the data source for LINQ to DataSet queries.


2 Answers

An old trick to find these kinds of islands is to subtract the index and the numeric value. The result will represent unique groups. Using the select overload including the index:

var mySequence = new [] {1, 2, 3, 7, 8, 9, 15, 16, 17};

var groups = mySequence
    .Select((val,ind) => new{val, group = val - ind})
    .GroupBy(v=>v.group, v=>v.val)
    .Select(v=> v.ToList()).ToList();

(Used ToList here, but of course ToArray can be used if arrays are preferred)

like image 77
Me.Name Avatar answered Sep 28 '22 08:09

Me.Name


I little late but anyway I want to share the idea. You could also create an iterator like this:

public static class Extensions
{
    public static IEnumerable<IEnumerable<int>> ContinuousNumbers(this IEnumerable<int> source)
    {
        using (var e = source.GetEnumerator())
        {
            if (e.MoveNext())
            {
                var list = new List<int> { e.Current};
                int temp = e.Current;
                while (e.MoveNext())
                {
                    if (e.Current == temp+1)
                    {
                        list.Add(e.Current);
                        temp++;
                    }
                    else
                    {
                        yield return list;
                        list = new List<int> { e.Current};
                        temp = e.Current;
                    }
                }

                if (list.Count > 0)
                {
                    yield return list;
                }
            }
        }
    }
}

Another variant could be using the Aggregate extension method:

var result = mySequence.Aggregate(new List<List<int>>(),
                                   (r, current) =>
                                   {
                                       if ( r.Count==0  ||  (r.Last().Count>0 && r.Last().Last() != current-1))
                                           r.Add(new List<int> { current});
                                       else
                                           r.Last().Add(current);
                                       return r;
                                   });
like image 36
octavioccl Avatar answered Sep 28 '22 07:09

octavioccl