This is my sample input and output:
l = [
['random_str0', 'random_str', 'random_str'],
['random_str1', '', 'random_str'],
['random_str2', '', ''],
['random_str3', 'random_str', 'random_str'],
['random_str4', '', ''],
['random_str5', '', ''],
['random_str6', 'random_str', ''],
['random_str7', 'random_str', 'random_str'],
['random_str8', '', ''],
['random_str9', '', ''],
['random_str10', '', ''],
['random_str11', '', ''],
]
out = [ # something like this. data structure and type and order are not important
['random_str0', 'random_str', 'random_str'],
[
['random_str1', '', 'random_str'],
['random_str2', '', '']
],
[
['random_str3', 'random_str', 'random_str'],
['random_str4', '', ''],
['random_str5', '', '']
],
['random_str6', 'random_str', ''],
[
['random_str7', 'random_str', 'random_str'],
['random_str8', '', ''],
['random_str9', '', ''],
['random_str10', '', ''],
['random_str11', '', '']
]
]
The idea: if an inner list has a value at index 1 or index 2 and is followed by one or more lists whose index 1 and index 2 values are both missing, they form a group. (My actual code is more complex and has other conditions as well, but they are omitted for brevity as they are not part of the actual question.)
This is what I tried:
for n in reversed(range(1, 5)):
    for i in range(len(l)-n):
        group = [l[i+j] for j in range(n+1)]
        if (
            (group[0][1] or group[0][2]) and
            all([not (g[1] and g[2]) for g in group[1:]])
        ):
            print(group)
Out: # not desired as it is overlapping.
[
['random_str7', 'random_str', 'random_str'],
['random_str8', '', ''],
['random_str9', '', ''],
['random_str10', '', ''],
['random_str11', '', '']
]
[
['random_str7', 'random_str', 'random_str'],
['random_str8', '', ''],
['random_str9', '', ''],
['random_str10', '', '']
]
[
['random_str3', 'random_str', 'random_str'],
['random_str4', '', ''],
['random_str5', '', '']
]
[
['random_str7', 'random_str', 'random_str'],
['random_str8', '', ''],
['random_str9', '', '']
]
[
['random_str1', '', 'random_str'],
['random_str2', '', '']
]
[
['random_str3', 'random_str', 'random_str'],
['random_str4', '', '']
]
[
['random_str7', 'random_str', 'random_str'],
['random_str8', '', '']
]
The question is: how can I keep track so that the groups don't overlap? I think recursion might help, but I don't know how to accomplish that.
The final data structure does not need to be a list. I tried it with dicts, but the code became more complicated.
For more clarification, I created a step-by-step pastebin: https://pastebin.com/qeWbxheK
With a single loop:
import pprint

res = []
for sub_l in lst:  # lst is your initial list
    if sub_l[1] or sub_l[2]:
        res.append(sub_l)  # add as a base item of the group
    elif not sub_l[1] and not sub_l[2] and res:
        # check if last item is not a 2-dimensional list yet
        if not isinstance(res[-1][0], list):
            res[-1] = [res[-1]]
        res[-1].append(sub_l)
pprint.pprint(res)
The output:
[['random_str0', 'random_str', 'random_str'],
[['random_str1', '', 'random_str'], ['random_str2', '', '']],
[['random_str3', 'random_str', 'random_str'],
['random_str4', '', ''],
['random_str5', '', '']],
['random_str6', 'random_str', ''],
[['random_str7', 'random_str', 'random_str'],
['random_str8', '', ''],
['random_str9', '', ''],
['random_str10', '', ''],
['random_str11', '', '']]]
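Since the question says the data structure is not important, here is a variation on the loop above, as a sketch only. It makes every group a list of rows, including groups with a single row, so the isinstance check is not needed and the result has a uniform shape. The name group_rows and the short sample data are made up for illustration. It also assumes that a leading row with both values missing should start its own group rather than be dropped, which differs from the loop above.

```python
def group_rows(rows):
    """Split rows into consecutive, non-overlapping groups in one pass."""
    groups = []
    for row in rows:
        if row[1] or row[2] or not groups:
            # A row with a value at index 1 or 2 starts a new group.
            # Assumption: a leading "empty" row also starts one
            # instead of being discarded.
            groups.append([row])
        else:
            # Both values are missing: attach to the most recent group.
            groups[-1].append(row)
    return groups


# Hypothetical sample data, shortened from the question's list.
sample = [
    ['a', 'x', 'y'],
    ['b', '', 'y'],
    ['c', '', ''],
    ['d', 'x', ''],
    ['e', '', ''],
    ['f', '', ''],
]

for group in group_rows(sample):
    print(group)
```

Because each row is appended exactly once, the groups cannot overlap, and len(group) == 1 tells you a row had no followers.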