I have two lists of values of the same sortable type, which are sorted in ascending order, but (i) they don't have the same length, and (ii) entries present in one list can be missing from the other and vice versa. However I know that the majority of values in one list are present in the other, and that there are no duplicates in any list.
So we may have this situation:
list1 = [value1-0, value1-1, value1-2, value1-3]
list2 = [value2-0, value2-1, value2-2]
If it happens that the order of the values from both lists is:
value1-0 < (value1-1 = value2-0) < value2-1 < value1-2 < value1-3 < value2-2
we can give combined sorted value names to the values across the two lists, e.g.:
valueA < valueB < valueC < valueD < valueE < valueF
so that the two lists can be written as:
list1 = [valueA, valueB, valueD, valueE]
list2 = [valueB, valueC, valueF]
Given this I want the lists to become:
new_list1 = [valueA, valueB, "MISSING", valueD, valueE, "MISSING"]
new_list2 = ["MISSING", valueB, valueC, "MISSING", "MISSING", valueF ]
Can anyone help?
EDIT: The original question referred to datetime
objects in particular (hence the comments specific to datetime
s), but has been generalized to any sortable type.
How about something like this:
set1 = set(list1)
set2 = set(list2)
total = sorted(set1|set2)
new_list1 = [x if x in set1 else "MISSING" for x in total]
new_list2 = [x if x in set2 else "MISSING" for x in total]
This problem piqued my interest, so I wrote an overly generic solution.
Here's a function that
align_iterables(*inputs, missing_value=None)
if I didn't care about historical Python versions)import itertools
def align_iterables(inputs, missing=None):
"""Align sorted iterables
Yields tuples with values from the respective `inputs`, placing
`missing` if the value does not exist in the corresponding
iterable.
Example: align_generator('bc', 'bf', '', 'abf') yields:
(None, None, None, 'a')
('b', 'b', None, 'b')
('c', None, None, None)
(None, 'f', None, 'f')
"""
End = object()
iterators = [itertools.chain(i, [End]) for i in inputs]
values = [next(i) for i in iterators]
while not all(v is End for v in values):
smallest = min(v for v in values if v is not End)
yield tuple(v if v == smallest else missing for v in values)
values = [next(i) if v == smallest else v
for i, v in zip(iterators, values)]
# An adapter for this question's problem:
def align_two_lists(list1, list2, missing="MISSING"):
value = list(zip(*list(align_iterables([list1, list2], missing=missing))))
if not value:
return [[], []]
else:
a, b = value
return [list(a), list(b)]
# A set of tests for the question's problem:
if __name__ == '__main__':
assert align_two_lists('abcef', 'abcdef', '_') == [['a', 'b', 'c', '_', 'e', 'f'], ['a', 'b', 'c', 'd', 'e', 'f']]
assert align_two_lists('a', 'abcdef', '_') == [['a', '_', '_', '_', '_', '_'], ['a', 'b', 'c', 'd', 'e', 'f']]
assert align_two_lists('abcdef', 'a', '_') == [['a', 'b', 'c', 'd', 'e', 'f'], ['a', '_', '_', '_', '_', '_']]
assert align_two_lists('', 'abcdef', '_') == [['_', '_', '_', '_', '_', '_'], ['a', 'b', 'c', 'd', 'e', 'f']]
assert align_two_lists('abcdef', '', '_') == [['a', 'b', 'c', 'd', 'e', 'f'], ['_', '_', '_', '_', '_', '_']]
assert align_two_lists('ace', 'abcdef', '_') == [['a', '_', 'c', '_', 'e', '_'], ['a', 'b', 'c', 'd', 'e', 'f']]
assert align_two_lists('bdf', 'ace', '_') == [['_', 'b', '_', 'd', '_', 'f'], ['a', '_', 'c', '_', 'e', '_']]
assert align_two_lists('ace', 'bdf', '_') == [['a', '_', 'c', '_', 'e', '_'], ['_', 'b', '_', 'd', '_', 'f']]
assert align_two_lists('aaacd', 'acd', '_') == [['a', 'a', 'a', 'c', 'd'], ['a', '_', '_', 'c', 'd']]
assert align_two_lists('acd', 'aaacd', '_') == [['a', '_', '_', 'c', 'd'], ['a', 'a', 'a', 'c', 'd']]
assert align_two_lists('', '', '_') == [[], []]
list1 = ["datetimeA", "datetimeB", "datetimeD", "datetimeE"]
list2 = ["datetimeB", "datetimeC", "datetimeD", "datetimeF"]
new_list1 = ["datetimeA", "datetimeB", "MISSING", "datetimeD", "datetimeE", "MISSING"]
new_list2 = ["MISSING", "datetimeB", "datetimeC", "datetimeD", "MISSING", "datetimeF"]
assert align_two_lists(list1, list2) == [new_list1, new_list2]
# And some extra tests:
# Also test multiple generators
for expected, got in zip(
[(None, None, None, 'a'),
('b', 'b', None, 'b'),
('c', None, None, None),
(None, 'f', None, 'f')],
align_iterables(['bc', 'bf', '', 'abf'])):
assert expected == got
assert list(align_iterables([])) == []
# And an infinite generator
for expected, got in zip(
[(0, 0),
('X', 1),
(2, 2),
('X', 3),
(4, 4)],
align_iterables([itertools.count(step=2), itertools.count()], missing='X')):
assert expected == got
You could try:
new_list1=[]
new_list2=[]
i=j=0
while True:
print '1 ' + str(new_list1) +' '+str(i)
print '2 ' + str(new_list2) +' '+str(j)
if list1[i]==list2[j]:
new_list1 += [list1[i]]
new_list2 += [list2[j]]
i=i+1
j=j+1
elif list1[i]>list2[j]:
new_list1 += ["MISSING"]
new_list2 += [list2[j]]
j=j+1
else: # list1[i]<list2[j]
new_list1 += [list1[i]]
new_list2 += ["MISSING"]
i=i+1
if i>=len(list1) or j>=len(list2):
break
while i<len(list1):
new_list1 += [list1[i]]
new_list2 += ["MISSING"]
i=i+1
while j<len(list2):
new_list1 += ["MISSING"]
new_list2 += [list2[j]]
j=j+1
It looks like a lot of code but it should work and loops through the lists just once.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With