Given 3 lists which are arbitrarily sorted by the same but unknown sort order. Is there an algorithm that merges these lists into one which is then still sorted by the same order?
Example:
List1: a b c f h
List2: b c e h
List3: c d e f
Assume these lists are sorted but the sort order used is not known. I want to combine these lists to a result that does not contain duplicates but still maintains the sort order: a b c d e f h
As said above: It is known, that the given lists are sorted but it is not known by which order, but the requirement is that the merged list is still sorted by the same (unknown) order.
In the example above, I know that the element "f" is positioned between "e" and "h" because from List1 I know that
"c" < "f" < "h",
from List2 I know that
"c" < "e" < "h"
and from List3 I know that
"e" < "f" and "c" < "e"
which combines to:
"c" < "e" < "f" < "h"
If the sort order cannot be determined by any of the given lists, it is permissible to just append the element to the end of the result list. Also, if the sort order of a sequence of elements cannot be determined it is permissible to insert them in any order into the list as long as they are in the right place (e.g. if I know that "b" and "c" must be inserted between "a" and "d" but I don't know if it should be a b c d or a c b d, then both are permissible.)
Of course this is just an example. The real lists are longer (but contain less than 100 elements), contain not single but multiple character elements and the sort order is not alphabetic. Also, I have got up to 5 lists.
I need to implement this algorithm in Delphi (and no: This is not homework but a real life problem), but I take an algorithm in an language provided it does not contain too much compiler magic or complex library functions.
Performance is not much of an issue because this is done once.
In computer science, k-way merge algorithms or multiway merges are a specific type of sequence merge algorithms that specialize in taking in k sorted lists and merging them into a single sorted list. These merge algorithms generally refer to merge algorithms that take in a number of sorted lists greater than two.
A merge sort uses a technique called divide and conquer. The list is repeatedly divided into two until all the elements are separated individually. Pairs of elements are then compared, placed into order and combined. The process is then repeated until the list is recompiled as a whole.
Conceptually, a merge sort works as follows: Divide the unsorted list into n sublists, each containing one element (a list of one element is considered sorted). Repeatedly merge sublists to produce new sorted sublists until there is only one sublist remaining. This will be the sorted list.
Your input lists define a partial order of your items. According to an answer at Math.SE, what you want is a topological sort. Algorithms are described on Wikipedia.
Nice question. Although topological sort may be the most recommended method, you would have to parse the input first to build up the dependencies list. I thought of a more directed approach, based on finding the items that occur in multiple lists for setting up the order definition.
I am unable to predict anything on time complexity, but since you do not care about performance and especially considering the total number of items being 500 at maximum, I think this algorithm should work nicely.
type
TSorterStringList = class(TStringList)
protected
Id: Integer;
KeyId: Integer;
function Current: String;
public
constructor Create;
end;
TSorterStringLists = class(TObjectList)
private
function GetItem(Index: Integer): TSorterStringList;
public
property Items[Index: Integer]: TSorterStringList read GetItem; default;
end;
TSorter = class(TObject)
private
FInput: TSorterStringLists;
FKeys: TStringList;
procedure GenerateKeys;
function IsKey(const S: String): Boolean;
public
constructor Create;
destructor Destroy; override;
procedure Sort(Output: TStrings);
property Input: TSorterStringLists read FInput;
end;
{ TSorterStringList }
constructor TSorterStringList.Create;
begin
inherited Create;
KeyId := -1;
end;
function TSorterStringList.Current: String;
begin
Result := Strings[Id];
end;
{ TSorterStringLists }
function TSorterStringLists.GetItem(Index: Integer): TSorterStringList;
begin
if Index >= Count then
Count := Index + 1;
if inherited Items[Index] = nil then
inherited Items[Index] := TSorterStringList.Create;
Result := TSorterStringList(inherited Items[Index]);
end;
{ TSorter }
constructor TSorter.Create;
begin
inherited Create;
FInput := TSorterStringLists.Create(True);
FKeys := TStringList.Create;
end;
destructor TSorter.Destroy;
begin
FKeys.Free;
FInput.Free;
inherited Destroy;
end;
threadvar
CurrentSorter: TSorter;
function CompareKeys(List: TStringList; Index1, Index2: Integer): Integer;
var
Input: TSorterStringLists;
I: Integer;
J: Integer;
K: Integer;
begin
Result := 0;
Input := CurrentSorter.Input;
for I := 0 to Input.Count - 1 do
begin
J := Input[I].IndexOf(List[Index1]);
K := Input[I].IndexOf(List[Index2]);
if (J > - 1) and (K > -1) then
begin
Result := J - K;
Break;
end;
end;
end;
procedure TSorter.GenerateKeys;
var
All: TStringList;
I: Integer;
begin
All := TStringList.Create;
try
All.Sorted := True;
All.Duplicates := dupAccept;
for I := 0 to FInput.Count - 1 do
All.AddStrings(FInput[I]);
for I := 0 to All.Count - 2 do
if (All[I] = All[I + 1]) then
if (FKeys.Count = 0) or (FKeys[FKeys.Count - 1] <> All[I]) then
FKeys.Add(All[I]);
finally
All.Free;
end;
CurrentSorter := Self;
FKeys.CustomSort(CompareKeys);
end;
function TSorter.IsKey(const S: String): Boolean;
begin
Result := FKeys.IndexOf(S) > -1;
end;
procedure TSorter.Sort(Output: TStrings);
var
KeyId: Integer;
I: Integer;
List: TSorterStringList;
begin
if FInput.Count = 0 then
Exit;
Output.BeginUpdate;
try
GenerateKeys;
for KeyId := -1 to FKeys.Count - 1 do
begin
for I := 0 to FInput.Count - 1 do
begin
List := FInput[I];
if List.KeyId <= KeyId then
while (List.Id < List.Count) and not IsKey(List.Current) do
begin
Output.Add(List.Current);
Inc(List.Id);
end;
while (List.Id < List.Count) and IsKey(List.Current) do
begin
List.KeyId := FKeys.IndexOf(List.Current);
Inc(List.Id);
end;
end;
if KeyId + 1 < FKeys.Count then
Output.Add(FKeys[KeyId + 1]);
end;
finally
Output.EndUpdate;
end;
end;
procedure TForm1.Button1Click(Sender: TObject);
var
Sorter: TSorter;
begin
Sorter := TSorter.Create;
try
Sorter.Input[0].CommaText := '1, 2, 4, 9, 10, 11, 22, 46, 48, 51, 70, 72';
Sorter.Input[1].CommaText := '3, 9, 23, 43, 44, 45, 47, 48, 51, 71, 90, 91';
Sorter.Input[2].CommaText := '0, 3, 4, 7, 8, 11, 23, 50, 51, 52, 55, 70';
Sorter.Input[3].CommaText := '2, 6, 14, 15, 36, 37, 38, 39, 51, 65, 66, 77';
Sorter.Input[4].CommaText := '5, 27, 120, 130';
ListBox1.Items.Assign(Sorter.Input[0]);
ListBox2.Items.Assign(Sorter.Input[1]);
ListBox3.Items.Assign(Sorter.Input[2]);
ListBox4.Items.Assign(Sorter.Input[3]);
ListBox5.Items.Assign(Sorter.Input[4]);
Sorter.Sort(ListBox6.Items);
// Results in:
// 1, 0, 5, 27, 120, 130, 3, 2, 6, 14, 15, 36, 37, 38, 39, 4, 7, 8, 9, 10,
// 11, 22, 46, 23, 43, 44, 45, 47, 50, 48, 51, 71, 90, 91, 52, 55, 65, 66,
// 77, 70, 72
finally
Sorter.Free;
end;
end;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With