Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Algorithm of Minimum steps to transform a list to the desired array. (Using InsertAt and DeleteAt Only)

Situation

To begin with, you have an array/list A, then you want to transform it to an expected array/list B given. The only actions that you can apply on array are InsertAt and DeleteAt where they are able to insert and delete an element at certain index from list.

note: Array B is always sorted while Array A may not be.

For instance, you have an array A of [1, 4, 3, 6, 7]

and you want it to become [2, 3, 4, 5, 6, 6, 7, 8]

One way of doing that is let A undergo following actions:

    deleteAt(0); // which will delete element 1, arrayA now [4, 3, 6, 7]
    deleteAt(0); // delete element 4 which now at index 0
                 // array A now [3, 6, 7]
    insertAt(0, 2); // Insert value to at index 0 of array A
                    // array A now [2, 3, 6, 7]
    insertAt(2, 4); // array now [2, 3, 4, 6, 7]
    insertAt(3, 5); // array Now [2, 3, 4, 5, 6, 7]
    insertAt(5, 6); // array Now [2, 3, 4, 5, 6, 6, 7]
    insertAt(7, 8); // array Now [2, 3, 4, 5, 6, 6, 7, 8]

On the above example, 7 operations were done on array A to transform it to array we wanted.

Hence, how do we find the what are the steps to transform A to B, as well as the minimum steps? Thanks!

btw, solution of deleting all element at A then add everything from B to A is only applicable when A & B have nothing in common.

My thoughts

What I have done so far:

  1. Compare the array A and array B, then save delete all the elements in Array A that can't be found in array B.
  2. Find the longest increasing subsequence from the common list of A and B.
  3. delete All elements that are not in Longest increasing sub sequence.
  4. compare what is left with B, then add elements accordingly.

However, I'm struggling from implementing that..

Change log

  1. fixed a typo of missing out element 7, now least operation is 7.
  2. Added MY THOUGHTS section
  3. There was a answer that elaborated on Levenshtein distance (AKA min edit distance), somehow it disappeared.. But I found that really useful after reading git/git levenshtein.c file, it seems to be a faster algorithm then what I already have. However, I'm not sure will that algorithm give me the detailed steps, or it is only capable of giving min num of steps.
like image 373
billz Avatar asked Oct 18 '22 20:10

billz


2 Answers

I have a python program that seems to work, but it is not very short

__version__ = '0.2.0'

class Impossible(RuntimeError): pass

deleteAt = 'deleteAt'
insertAt = 'insertAt' 
incOffset = 'incOffset'

def remove_all(size):
    return [(deleteAt, i, None) for i in range(size-1, -1, -1)]

def index_not(seq, val):
    for i, x in enumerate(seq):
        if x != val:
            return i
    return len(seq)

def cnt_checked(func):
    """Helper function to check some function's contract"""
    from functools import wraps
    @wraps(func)
    def wrapper(src, dst, maxsteps):
        nsteps, steps = func(src, dst, maxsteps)
        if nsteps > maxsteps:
            raise RuntimeError(('cnt_checked() failed', maxsteps, nsteps))
        return nsteps, steps
    return wrapper

@cnt_checked
def strategy_1(src, dst, maxsteps):
    # get dst's first value from src
    val = dst[0]
    try:
        index = src.index(val)
    except ValueError:
        raise Impossible

    # remove all items in src before val's first occurrence
    left_steps = remove_all(index)
    src = src[index:]
    n = min(index_not(src, val), index_not(dst, val))
    score = len(left_steps)
    assert n > 0
    left_steps.append([incOffset, n, None])
    right_steps = [[incOffset, -n, None]]
    nsteps, steps = rec_find_path(src[n:], dst[n:], maxsteps - score)
    return (score + nsteps, (left_steps + steps + right_steps))

@cnt_checked
def strategy_2(src, dst, maxsteps):
    # do not get dst's first value from src
    val = dst[0]
    left_steps = []
    src = list(src)
    for i in range(len(src)-1, -1, -1):
        if src[i] == val:
            left_steps.append((deleteAt, i, None))
            del src[i]
    n = index_not(dst, val)
    right_steps = [(insertAt, 0, val) for i in range(n)]
    dst = dst[n:]
    score = len(left_steps) + len(right_steps)
    nsteps, steps = rec_find_path(src, dst, maxsteps - score)
    return (score + nsteps, (left_steps + steps + right_steps))

@cnt_checked
def rec_find_path(src, dst, maxsteps):

    if maxsteps <= 0:
        if (maxsteps == 0) and (src == dst):
            return (0, [])
        else:
            raise Impossible

    # if destination is empty, clear source
    if not dst:
        if len(src) > maxsteps:
            raise Impossible
        steps = remove_all(len(src))
        return (len(steps), steps)

    found = False
    try:
        nsteps_1, steps_1 = strategy_1(src, dst, maxsteps)
    except Impossible:
        pass
    else:
        found = True
        maxsteps = nsteps_1 - 1
    try:
        nsteps_2, steps_2 = strategy_2(src, dst, maxsteps)
    except Impossible:
        if found:
            return (nsteps_1, steps_1)
        else:
            raise
    else:
        return (nsteps_2, steps_2)

def find_path(A, B):
    assert B == list(sorted(B))
    maxsteps = len(A) + len(B)
    nsteps, steps = rec_find_path(A, B, maxsteps)
    result = []
    offset = 0
    for a, b, c in steps:
        if a == incOffset:
            offset += b
        else:
            result.append((a, b + offset, c))
    return result

def check(start, target, ops):
    """Helper function to check correctness of solution"""
    L = list(start)
    for a, b, c in ops:
        print(L)
        if a == insertAt:
            L.insert(b, c)
        elif a == deleteAt:
            del L[b]
        else:
            raise RuntimeError(('Unexpected op:', a))
    print(L)
    if L != target:
        raise RuntimeError(('Result check failed, expected', target, 'got:', L))

start = [1, 4, 3, 6, 7]
target = [2, 3, 4, 5, 6, 6, 7, 8]

ops = find_path(start, target)
print(ops)

check(start, target, ops)

After some tests with this code, it is now obvious that the result is a two phases operation. There is a first phase where items are deleted from the initial list, all but a increasing sequence of items all belonging to the target list (with repetition). Missing items are then added to list until the target list is built.

The temporary conclusion is that if we find an algorithm to determine the longest subsequence of items from the target list initially present in the first list, in the same order but not necessarily contiguous, then it gives the shortest path. This is a new and potentially simpler problem. This is probably what you meant above, but it is much clearer from the program's output.

It seems almost obvious that this problem can be reduced to the problem of the longest increasing subsequence

like image 54
Gribouillis Avatar answered Oct 21 '22 05:10

Gribouillis


We can prove quite easily that the problem reduces to the longest non-decreasing subsequence by noting that if there were a collection of elements in A that did not merit deletion and was greater in number than the longest non-decreasing subsequence in A of elements also in B, then all the elements of this collection must exist in the same order in B, which means it's a longer non-decreasing subsequence, and contradicts our assertion. Additionally, any smaller collection in A that does not merit deletion, necessarily exists in B in the same order and is therefore part of the longest non-decreasing subsequence in A of elements also in B.

The algorithm then is reduced to the longest increasing subsequence problem, O(n log n + m):

(1) Find the longest non-decreasing subsequence of elements
    in A that have at least the same count in B.

(2) All other items in A are to be deleted and the 
    missing elements from B added.
like image 25
גלעד ברקן Avatar answered Oct 21 '22 05:10

גלעד ברקן