
Group list while recursive looping

This is my sample input and output:

l = [
    ['random_str0', 'random_str', 'random_str'],
    ['random_str1', '', 'random_str'],
    ['random_str2', '', ''],
    ['random_str3', 'random_str', 'random_str'],
    ['random_str4', '', ''],
    ['random_str5', '', ''],
    ['random_str6', 'random_str', ''],
    ['random_str7', 'random_str', 'random_str'],
    ['random_str8', '', ''],
    ['random_str9', '', ''],
    ['random_str10', '', ''],
    ['random_str11', '', ''],
]


out = [  # something like this; the exact data structure, types, and order are not important
    ['random_str0', 'random_str', 'random_str'],
    [
        ['random_str1', '', 'random_str'],
        ['random_str2', '', '']
    ],
    [
        ['random_str3', 'random_str', 'random_str'],
        ['random_str4', '', ''],
        ['random_str5', '', '']
    ],
    ['random_str6', 'random_str', ''],
    [
        ['random_str7', 'random_str', 'random_str'],
        ['random_str8', '', ''],
        ['random_str9', '', ''],
        ['random_str10', '', ''],
        ['random_str11', '', '']
    ]
]

The idea is that if an inner list has a value at index 1 or index 2 and is followed by one or more lists whose index 1 and index 2 values are both empty, they form a group. (My actual code is more complex and has other conditions as well, but they are omitted for brevity since they are not part of the question.)
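
To make the rule concrete, it can be written as two small predicates. This is only a sketch of the simplified condition described above: the names `is_head` and `is_follower` are hypothetical and not taken from the real code, and the extra conditions mentioned are not covered.

```python
def is_head(row):
    """A row can start a group if index 1 or index 2 is non-empty."""
    return bool(row[1] or row[2])


def is_follower(row):
    """A row joins the preceding group if index 1 and index 2 are both empty."""
    return not (row[1] or row[2])


print(is_head(['random_str1', '', 'random_str']))       # True
print(is_follower(['random_str2', '', '']))             # True
print(is_follower(['random_str6', 'random_str', '']))   # False
```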

This is what I tried:

for n in reversed(range(1, 5)):
    for i in range(len(l)-n):
        group = [l[i+j] for j in range(n+1)]
        if (
            (group[0][1] or group[0][2]) and
            all([not (g[1] and g[2]) for g in group[1:]])
        ):
            print(group)

Out: # not desired as it is overlapping.
[
    ['random_str7', 'random_str', 'random_str'],
    ['random_str8', '', ''],
    ['random_str9', '', ''],
    ['random_str10', '', ''],
    ['random_str11', '', '']
]
[
    ['random_str7', 'random_str', 'random_str'],
    ['random_str8', '', ''],
    ['random_str9', '', ''],
    ['random_str10', '', '']
]
[
    ['random_str3', 'random_str', 'random_str'],
    ['random_str4', '', ''],
    ['random_str5', '', '']
]
[
    ['random_str7', 'random_str', 'random_str'],
    ['random_str8', '', ''],
    ['random_str9', '', '']
]
[
    ['random_str1', '', 'random_str'],
    ['random_str2', '', '']
]
[
    ['random_str3', 'random_str', 'random_str'],
    ['random_str4', '', '']
]
[
    ['random_str7', 'random_str', 'random_str'],
    ['random_str8', '', '']
]

The question is: how can I keep track of which rows have already been grouped so that the groups don't overlap? I think recursive looping would help, but I don't know how to accomplish that.

The final data structure does not need to be a list. I tried it with dicts, but the code became more complicated.

For more clarification, I created a step-by-step pastebin: https://pastebin.com/qeWbxheK

Rahul asked Mar 04 '23 17:03


1 Answer

With a single loop:

import pprint

res = []
for sub_l in lst:   # lst is your initial list
    if sub_l[1] or sub_l[2]:
        res.append(sub_l)   # start a new group with this row as its base item
    elif res:               # both values are empty: attach to the previous group
        # wrap the base item in a list the first time a follower is attached
        if not isinstance(res[-1][0], list):
            res[-1] = [res[-1]]
        res[-1].append(sub_l)

pprint.pprint(res)

The output:

[['random_str0', 'random_str', 'random_str'],
 [['random_str1', '', 'random_str'], ['random_str2', '', '']],
 [['random_str3', 'random_str', 'random_str'],
  ['random_str4', '', ''],
  ['random_str5', '', '']],
 ['random_str6', 'random_str', ''],
 [['random_str7', 'random_str', 'random_str'],
  ['random_str8', '', ''],
  ['random_str9', '', ''],
  ['random_str10', '', ''],
  ['random_str11', '', '']]]
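
Since the question says the output structure is not important, a possible variation is to always wrap each base item in its own list, so every group has the same shape and no `isinstance` check is needed. This is a minimal sketch, not the code above: the function name `group_rows` is hypothetical, and skipping empty rows that appear before the first base item is an assumption.

```python
def group_rows(rows):
    """Return a list of groups; each group is a list of rows.

    A row with a non-empty value at index 1 or 2 starts a new group.
    A row with both values empty is appended to the most recent group.
    Empty rows before the first base row are skipped (an assumption).
    """
    groups = []
    for row in rows:
        if row[1] or row[2]:
            groups.append([row])      # new group, even if it stays a singleton
        elif groups:
            groups[-1].append(row)    # follower of the most recent base row
    return groups


sample = [
    ['random_str0', 'random_str', 'random_str'],
    ['random_str1', '', 'random_str'],
    ['random_str2', '', ''],
    ['random_str6', 'random_str', ''],
]
for group in group_rows(sample):
    print(len(group), [row[0] for row in group])
# 1 ['random_str0']
# 2 ['random_str1', 'random_str2']
# 1 ['random_str6']
```

Because each row is visited exactly once and can only join the latest group, the groups cannot overlap.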
RomanPerekhrest answered Mar 07 '23 02:03