How to find the index of a sublist in a list?

Tags: c#, .net

I am looking for an efficient way (in .NET) to find whether a sequence of bytes occurs in a list of bytes and, if it does, the index where its first occurrence starts.

For example, let's say I have:

var sequence = new List<byte> { 5, 10, 2 };
var listOne = new List<byte> { 1, 3, 10, 5, 10, 2, 8, 9 };
var listTwo = new List<byte> { 1, 3, 10, 5, 2, 10, 8, 9 };

The result should be index 3 for listOne and index -1 (i.e. the sequence is not there) for listTwo.

Of course I can loop through the list element by element and, from every index, check whether the following elements match my sequence, but is there a more efficient way (for example, using extension methods)?

asked Aug 20 '10 09:08

Lukáš Rubeš

2 Answers

This is essentially the same problem as substring searching (indeed, a list where order is significant is a generalisation of "string").

Luckily, computer science has studied this problem extensively for a long time, so you get to stand on the shoulders of giants.

Take a look at the literature. Some reasonable starting points are:

http://en.wikipedia.org/wiki/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm

http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm

http://en.wikipedia.org/wiki/Rabin-karp

Even just the pseudocode in the Wikipedia articles is enough to port to C# quite easily. Look at the descriptions of performance in different cases and decide which cases your code is most likely to encounter. (I'm thinking the first, Knuth-Morris-Pratt, given what you say about the search-key list being short.)
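As a rough illustration of the first of those links, here is a minimal sketch of Knuth-Morris-Pratt adapted from strings to IList<T>. It is not taken from any library: the names KmpSearch, IndexOfSequence and BuildFailureTable are made up for this example, elements are assumed to be comparable with EqualityComparer<T>.Default, and an empty pattern is assumed to match at index 0.

```csharp
using System.Collections.Generic;

public static class KmpSearch
{
    // Hypothetical helper: returns the index of the first occurrence of
    // pattern in source, or -1 if it does not occur.
    public static int IndexOfSequence<T>(IList<T> source, IList<T> pattern)
    {
        if (pattern.Count == 0) return 0; // assumption: empty pattern matches at 0
        var cmp = EqualityComparer<T>.Default;
        int[] fail = BuildFailureTable(pattern, cmp);

        int matched = 0; // number of pattern elements matched so far
        for (int i = 0; i < source.Count; i++)
        {
            // On a mismatch, fall back to the longest border of the matched
            // prefix instead of restarting the comparison from scratch.
            while (matched > 0 && !cmp.Equals(source[i], pattern[matched]))
                matched = fail[matched - 1];
            if (cmp.Equals(source[i], pattern[matched]))
                matched++;
            if (matched == pattern.Count)
                return i - pattern.Count + 1;
        }
        return -1;
    }

    // fail[k] is the length of the longest proper prefix of pattern[0..k]
    // that is also a suffix of pattern[0..k].
    private static int[] BuildFailureTable<T>(IList<T> pattern, IEqualityComparer<T> cmp)
    {
        var fail = new int[pattern.Count];
        int k = 0;
        for (int i = 1; i < pattern.Count; i++)
        {
            while (k > 0 && !cmp.Equals(pattern[i], pattern[k]))
                k = fail[k - 1];
            if (cmp.Equals(pattern[i], pattern[k]))
                k++;
            fail[i] = k;
        }
        return fail;
    }
}
```

With the lists from the question, KmpSearch.IndexOfSequence(listOne, sequence) gives 3 and KmpSearch.IndexOfSequence(listTwo, sequence) gives -1. The source list is scanned once without ever moving backwards, so the worst case is linear in the combined length of the two lists. For a three-element key the gain over the naive loop is likely to be small.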

answered Nov 13 '22 21:11

Jon Hanna

I think the cleanest way is to create a generic extension method like this:

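// Requires "using System.Collections.Generic;" and must be declared inside a static class.
public static int SubListIndex<T>(this IList<T> list, int start, IList<T> sublist)
{
    // EqualityComparer<T>.Default copes with null elements, unlike calling Equals on an element directly.
    var comparer = EqualityComparer<T>.Default;
    // The last index where the sublist can still fit is list.Count - sublist.Count.
    for (int listIndex = start; listIndex < list.Count - sublist.Count + 1; listIndex++)
    {
        // Count how many consecutive elements match, starting at listIndex.
        int count = 0;
        while (count < sublist.Count && comparer.Equals(sublist[count], list[listIndex + count]))
            count++;
        if (count == sublist.Count)
            return listIndex;
    }
    return -1;
}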

You can call it like this:

var indexOne = listOne.SubListIndex(0, sequence);
var indexTwo = listTwo.SubListIndex(0, sequence);

P.S. You can also start from a given index if you need to search for further occurrences of the sublist.
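To illustrate the start parameter, here is a small sketch that collects every occurrence by restarting the search one position after each hit. AllSubListIndexes and the class name SubListExtensions are hypothetical names for this example. SubListIndex is repeated (with EqualityComparer<T>.Default doing the comparison) only so that the snippet compiles on its own, and an empty sublist is assumed to have no occurrences.

```csharp
using System.Collections.Generic;

public static class SubListExtensions
{
    // The naive search from the answer, repeated so this sketch is self-contained.
    public static int SubListIndex<T>(this IList<T> list, int start, IList<T> sublist)
    {
        var comparer = EqualityComparer<T>.Default;
        for (int listIndex = start; listIndex < list.Count - sublist.Count + 1; listIndex++)
        {
            int count = 0;
            while (count < sublist.Count && comparer.Equals(sublist[count], list[listIndex + count]))
                count++;
            if (count == sublist.Count)
                return listIndex;
        }
        return -1;
    }

    // Hypothetical helper: yields the start index of every occurrence,
    // including occurrences that overlap each other.
    public static IEnumerable<int> AllSubListIndexes<T>(this IList<T> list, IList<T> sublist)
    {
        if (sublist.Count == 0) yield break; // assumption: empty sublist has no occurrences
        int index = list.SubListIndex(0, sublist);
        while (index != -1)
        {
            yield return index;
            // Resume one element after the previous hit so overlaps are found too.
            index = list.SubListIndex(index + 1, sublist);
        }
    }
}
```

For example, searching { 1, 2, 1, 2, 1 } for { 1, 2, 1 } yields 0 and then 2. If overlapping matches are not wanted, resuming at index + sublist.Count instead of index + 1 would skip them.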

answered Nov 13 '22 20:11

digEmAll
