For example, if I have the following 2 arrays:
string[] userSelect = new string[] {"the", "quick", "brown", "dog", "jumps", "over"};
string[] original = new string[] {"the", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog"};
I'm trying to compare the userSelect array against the original array and get all consecutive matches based on index. The userSelect array will always be made up of strings from the original array. So the output would be like the following:
int[] match0 = new int[] {0, 1, 2}; // indices for "the quick brown"
int[] match2 = new int[] {4, 5}; // indices for "jumps over"
int[] match1 = new int[] {3}; // index for "dog"
The userSelect array length will never exceed the original array length, however it can be shorter and the words can be in any order. How would I go about doing this?
Here's what I came up with
var matches =
(from l in userSelect.Select((s, i) => new { s, i })
join r in original.Select((s, i) => new { s, i })
on l.s equals r.s
group l by r.i - l.i into g
from m in g.Select((l, j) => new { l.i, j = l.i - j, k = g.Key })
group m by new { m.j, m.k } into h
select h.Select(t => t.i).ToArray())
.ToArray();
This will output
matches[0] // { 0, 1, 2 } the quick brown
matches[1] // { 4, 5 } jumps over
matches[2] // { 0 } the
matches[3] // { 3 } dog
Using the input {"the", "quick", "brown", "the", "lazy", "dog"}
yields:
matches[0] // { 0, 1, 2 } the quick brown
matches[1] // { 0 } the
matches[2] // { 3 } the
matches[3] // { 3, 4, 5 } the lazy dog
Note that the calls to ToArray
are optional. If you don't actually need the results in an array you can leave that out and save a little processing time.
To filter out any sequences that are completely contained with other larger sequences, you can run this code (note the orderby
in the modified query):
var matches =
(from l in userSelect.Select((s, i) => new { s, i })
join r in original.Select((s, i) => new { s, i })
on l.s equals r.s
group l by r.i - l.i into g
from m in g.Select((l, j) => new { l.i, j = l.i - j, k = g.Key })
group m by new { m.j, m.k } into h
orderby h.Count() descending
select h.Select(t => t.i).ToArray());
int take = 0;
var filtered = matches.Where(m => !matches.Take(take++)
.Any(n => m.All(i => n.Contains(i))))
.ToArray();
This would be easier if words couldn't be repeated . . .
The general idea is to create a Dictionary<string, List<int>>
from the original word list. That will tell you which words are used at what positions. The dictionary for your sample would be:
key="the", value={0, 6}
key="quick", value={1}
key="brown", value={2}
... etc
Now, when you get the user's input, you step through it sequentially, looking up the words in your dictionary to get the list of positions.
So you look up a word and it's in the dictionary. You save the position(s) returned from the dictionary. Look up the next word. There are three conditions you need to handle:
I hope you get the idea.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With