Please, now that I've re-written the question, and before it suffers from further fast-gun answers or premature closure by eager editors let me point out that this is not a duplicate of this question. I know how to remove duplicates from an array.
This question is about removing sequences from an array, not duplicates in the strict sense.
Consider this sequence of elements in an array;
[0] a
[1] a
[2] b
[3] c
[4] c
[5] a
[6] c
[7] d
[8] c
[9] d
In this example I want to obtain the following...
[0] a
[1] b
[2] c
[3] a
[4] c
[5] d
Notice that duplicate elements are retained but that sequences of the same element have been reduced to a single instance of that element.
Further, notice that when two lines repeat they should be reduced to one set (of two lines).
[0] c
[1] d
[2] c
[3] d
...reduces to...
[0] c
[1] d
I'm coding in C# but algorithms in any language appreciated.
EDIT: made some changes and new suggestions
What about a sliding window...
REMOVE LENGTH 2: (no other length has other matches)
//the lower case letters are the matches
ABCBAbabaBBCbcbcbVbvBCbcbcAB
__ABCBABABABBCBCBCBVBVBCBCBCAB
REMOVE LENGTH 1 (duplicate characters):
//* denote that a string was removed to prevent continual contraction
//of the string, unless this is what you want.
ABCBA*BbC*V*BC*AB
_ABCBA*BBC*V*BC*AB
RESULT:
ABCBA*B*C*V*BC*AB == ABCBABCVBCAB
This is of course starting with length=2, increase it to L/2 and iterate down.
I'm also thinking of two other approaches:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With