My company's cat-herding application tracks a convoy of cats. Periodically, it needs to compare previousOrder
to currentOrder
(each is an ArrayList<Cat>
) and notify the cat-wranglers of any changes.
Each cat is unique and can only appear once in each list (or not at all). Most of the time, the previousOrder
and currentOrder
lists have the same contents, in the same order, but any of the following can happen (from more frequent to less frequent):
This appears like an edit distance problem to me. Ideally, I am looking for an algorithm that determines the steps required to make previousOrder
match currentOrder
:
Fluffy
to position 12
Snuggles
at position 37
Mr. Chubbs
The algorithm should also recognize scenario #1, in which case the new order is communicated in its entirety.
What's the best approach for this?
(This post and that post pose similar questions, but they are both dealing with sorted lists. Mine are ordered, but unsorted.)
EDIT
The Levenshtein algorithm is a great suggestion, but I'm concerned about the time/space requirement of creating a matrix. My main goal is to determine and communicate the changes as quickly as possible. Something that is faster than finding the additions and sending message along the lines of "Here are the new cats, and here is the current order."
Java provides a method for comparing two Array List. The ArrayList. equals() is the method used for comparing two Array List. It compares the Array lists as, both Array lists should have the same size, and all corresponding pairs of elements in the two Array lists are equal.
Java equals() method of List interface compares the specified object with the list for equality. It overrides the equals() method of Object class. This method accepts an object to be compared for equality with the list. It returns true if the specified object is equal to the list, else returns false.
Here's an algorithm I put together to merge two lists, old
and new
. It's not the most elegant or efficient, but it seems to work okay for the data I'm using it for.
new
is the most updated list of data, and old
is the out-of-date list that needs to get transformed into new
. The algorithm performs its operations on the old
list - removing, moving, and inserting items accordingly.
for(item in old)
if (new does not contain item)
remove item from old
for(item in new)
if (item exists in old)
if (position(item, old) == position(item, new))
continue // next loop iteration
else
move old item to position(item, new)
else
insert new item into old at position(item, new)
The deletions are all done up front to make the positions of the items more predictable in the second loop.
The driving force behind this was to sync a list of data from the server with <table>
rows in a browser DOM (using javascript). It was needed because we didn't want to redraw the entire table whenever the data changed; the differences between the lists were likely to be small and only affect one or two rows. It may not be the algorithm you're looking for for your data. If not, let me know and I'll delete this.
There are probably some optimizations that could be made for this. But it is performant and predictable enough for me and the data I'm working with.
Levenshtein distance metric.
http://www.levenshtein.net/
An efficient way to solve this is by using dynamic programming. Wikipedia has pseudo-code for a closely related problem: Computing Levenshtein distance.
Keeping track of the actual operations and incorporating the "scramble" operation shouldn't be too difficult.
I know the questioner was seeking a Java solution, but I came across this question whilst seeking an algorithm to implement in C#.
Here's my solution, which generates an enumeration of simple IListDifference values: either ItemAddedDifference, ItemRemovedDifference or ItemMovedDifference.
It uses a working copy of the source list to establish, item by item, what modifications are necessary to transform it to match the target list.
public class ListComparer<T>
{
public IEnumerable<IListDifference> Compare(IEnumerable<T> source, IEnumerable<T> target)
{
var copy = new List<T>(source);
for (var i = 0; i < target.Count(); i++)
{
var currentItemsMatch = false;
while (!currentItemsMatch)
{
if (i < copy.Count && copy[i].Equals(target.ElementAt(i)))
{
currentItemsMatch = true;
}
else if (i == copy.Count())
{
// the target item's index is at the end of the source list
copy.Add(target.ElementAt(i));
yield return new ItemAddedDifference { Index = i };
}
else if (!target.Skip(i).Contains(copy[i]))
{
// the source item cannot be found in the remainder of the target, therefore
// the item in the source has been removed
copy.RemoveAt(i);
yield return new ItemRemovedDifference { Index = i };
}
else if (!copy.Skip(i).Contains(target.ElementAt(i)))
{
// the target item cannot be found in the remainder of the source, therefore
// the item in the source has been displaced by a new item
copy.Insert(i, target.ElementAt(i));
yield return new ItemAddedDifference { Index = i };
}
else
{
// the item in the source has been displaced by an existing item
var sourceIndex = i + copy.Skip(i).IndexOf(target.ElementAt(i));
copy.Insert(i, copy.ElementAt(sourceIndex));
copy.RemoveAt(sourceIndex + 1);
yield return new ItemMovedDifference { FromIndex = sourceIndex, ToIndex = i };
}
}
}
// Remove anything remaining in the source list
for (var i = target.Count(); i < copy.Count; i++)
{
copy.RemoveAt(i);
yield return new ItemRemovedDifference { Index = i };
}
}
}
Just noticed this makes use of a custom extension method on IEnumerable - 'IndexOf':
public static class EnumerableExtensions
{
public static int IndexOf<T>(this IEnumerable<T> list, T item)
{
for (var i = 0; i < list.Count(); i++)
{
if (list.ElementAt(i).Equals(item))
{
return i;
}
}
return -1;
}
}
I recently had to do this, except items could exist multiple times. This complicated things, but I was able to do it using look-ahead counters and some other craziness. It looks a lot like Rob's solution, so thanks to him for getting me started!
First off, let's assume that we want to return the list of operations that will transform the first list into the second:
public interface Operation {
/**
* Apply the operation to the given list.
*/
void apply(List<String> keys);
}
and we have some helper methods to construct operations. You actually don't need a "move" operation, and you could even have a "swap" as well (or instead), but this is what I went with:
Operation delete(int index) { ... }
Operation insert(int index, String key) { ... }
Operation move(int from, int to) { ... }
Now we'll define a special class to hold our look-ahead counts:
class Counter {
private Map<String, Integer> counts;
Counter(List<String> keys) {
counts = new HashMap<>();
for (String key : keys) {
if (counts.containsKey(key)) {
counts.put(key, counts.get(key) + 1);
} else {
counts.put(key, 1);
}
}
}
public int get(String key) {
if (!counts.containsKey(key)) {
return 0;
}
return counts.get(key);
}
public void dec(String key) {
counts.put(key, counts.get(key) - 1);
}
}
And a helper method to get the index of the next key in the list:
int next(List<String> list, int start, String key) {
for (int i = start; i < list.size(); i++) {
if (list.get(i).equals(key)) {
return i;
}
}
throw new RuntimeException("next index not found for " + key);
}
Now we're ready to do the transform:
List<Operation> transform(List<String> from, List<String> to) {
List<Operation> operations = new ArrayList<>();
// make our own copy of the first, that we can mutate
from = new ArrayList<>(from);
// maintain lookahead counts
Counter fromCounts = new Counter(from);
Counter toCounts = new Counter(to);
// do all our deletes first
for (int i = 0; i < from.size(); i++) {
String current = from.get(i);
if (fromCounts.get(current) > toCounts.get(current)) {
Operation op = delete(i);
operations.add(op);
op.apply(from);
fromCounts.dec(current);
i--;
}
}
// then one more iteration for the inserts and moves
for (int i = 0; i < to.size(); i++) {
String current = to.get(i);
if (from.size() > i && from.get(i).equals(current)) {
fromCounts.dec(current);
continue;
}
if (fromCounts.get(current) > 0) {
Operation op = move(next(from, i + 1, current), i);
operations.add(op);
op.apply(from);
fromCounts.dec(current);
} else {
Operation op = insert(i, current);
operations.add(op);
op.apply(from);
}
}
return operations;
}
It's a bit tricky to get your head around, but basically you do the deletes so that you know for every key you are either inserting or moving. Then you run through the list again and if there's enough, you move one from the part of the list your haven't seen yet, otherwise insert. By the time you get to the end, it all lines up.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With