Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Search for an Array or List in a List

Tags:

c#

.net

list

linq

Have

List<byte> lbyte 

Have

byte[] searchBytes

How can I search lbyte for not just a single byte but for the index of the searchBytes?
E.G.

Int32 index = lbyte.FirstIndexOf(searchBytes);

Here is the brute force I came up with.
Not the performance I am looking for.

public static Int32 ListIndexOfArray(List<byte> lb, byte[] sbs)
{
    if (sbs == null) return -1;
    if (sbs.Length == 0) return -1;
    if (sbs.Length > 8) return -1;
    if (sbs.Length == 1) return lb.FirstOrDefault(x => x == sbs[0]);
    Int32 sbsLen = sbs.Length;
    Int32 sbsCurMatch = 0;
    for (int i = 0; i < lb.Count; i++)
    {
        if (lb[i] == sbs[sbsCurMatch])
        {
            sbsCurMatch++;
            if (sbsCurMatch == sbsLen)
            {
                //int index = lb.FindIndex(e => sbs.All(f => f.Equals(e)));  // fails to find a match
                IndexOfArray = i - sbsLen + 1;
                return;
            }
        }
        else 
        {
            sbsCurMatch = 0;
        }
    }
    return -1;
}
like image 591
paparazzo Avatar asked Apr 20 '13 00:04

paparazzo


2 Answers

Brute force is always an option. Although slow in comparison to some other methods, in practice it's usually not too bad. It's easy to implement and quite acceptable if lbyte isn't huge and doesn't have pathological data.

It's the same concept as brute force string searching.

like image 151
Jim Mischel Avatar answered Sep 28 '22 19:09

Jim Mischel


You may find Boyer-Moore algorithm useful here. Convert your list to an array and search. The algorithm code is taken from this post.

static int SimpleBoyerMooreSearch(byte[] haystack, byte[] needle)
{
    int[] lookup = new int[256];
    for (int i = 0; i < lookup.Length; i++) { lookup[i] = needle.Length; }

    for (int i = 0; i < needle.Length; i++)
    {
        lookup[needle[i]] = needle.Length - i - 1;
    }

    int index = needle.Length - 1;
    var lastByte = needle.Last();
    while (index < haystack.Length)
    {
        var checkByte = haystack[index];
        if (haystack[index] == lastByte)
        {
            bool found = true;
            for (int j = needle.Length - 2; j >= 0; j--)
            {
                if (haystack[index - needle.Length + j + 1] != needle[j])
                {
                    found = false;
                    break;
                }
            }

            if (found)
                return index - needle.Length + 1;
            else
                index++;
        }
        else
        {
            index += lookup[checkByte];
        }
    }
    return -1;
}

You can then search like this. If lbyte will remain constant after a certain time, you can just convert it to an array once and pass that.

//index is returned, or -1 if 'searchBytes' is not found
int startIndex = SimpleBoyerMooreSearch(lbyte.ToArray(), searchBytes);

Update based on comment. Here's the IList implementation which means that arrays and lists (and anything else that implements IList can be passed)

 static int SimpleBoyerMooreSearch(IList<byte> haystack, IList<byte> needle)
 {
    int[] lookup = new int[256];
    for (int i = 0; i < lookup.Length; i++) { lookup[i] = needle.Count; }

    for (int i = 0; i < needle.Count; i++)
    {
        lookup[needle[i]] = needle.Count - i - 1;
    }

    int index = needle.Count - 1;
    var lastByte = needle[index];
    while (index < haystack.Count)
    {
        var checkByte = haystack[index];
        if (haystack[index] == lastByte)
        {
            bool found = true;
            for (int j = needle.Count - 2; j >= 0; j--)
            {
                if (haystack[index - needle.Count + j + 1] != needle[j])
                {
                    found = false;
                    break;
                }
            }

            if (found)
                return index - needle.Count + 1;
            else
                index++;
        }
        else
        {
            index += lookup[checkByte];
        }
    }
    return -1;
}

Since arrays and lists implement IList, there's no conversion necessary when calling it in your case.

int startIndex = SimpleBoyerMooreSearch(lbyte, searchBytes);
like image 42
keyboardP Avatar answered Sep 28 '22 17:09

keyboardP