Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Parsing nested indented text into lists

Parsing nested indented text into lists

Hi,

maybe someone can give me a start help.

I have nested indented txt similar to this. I should parse that into a nested list structure like

TXT = r"""
Test1
    NeedHelp
        GotStuck
            Sometime
            NoLuck
    NeedHelp2
        StillStuck
        GoodLuck
"""

Nested_Lists = ['Test1', 
    ['NeedHelp', 
        ['GotStuck', 
            ['Sometime', 
            'NoLuck']]], 
    ['NeedHelp2', 
        ['StillStuck', 
        'GoodLuck']]
]

Nested_Lists = ['Test1', ['NeedHelp', ['GotStuck', ['Sometime', 'NoLuck']]], ['NeedHelp2', ['StillStuck', 'GoodLuck']]]

Any help for python3 would be appriciated

like image 350
user3426681 Avatar asked Mar 22 '26 17:03

user3426681


1 Answers

You could exploit Python tokenizer to parse the indented text:

from tokenize import NAME, INDENT, DEDENT, tokenize

def parse(file):
    stack = [[]]
    lastindent = len(stack)

    def push_new_list():
        stack[-1].append([])
        stack.append(stack[-1][-1])
        return len(stack)

    for t in tokenize(file.readline):
        if t.type == NAME:
            if lastindent != len(stack):
                stack.pop()
                lastindent = push_new_list()
            stack[-1].append(t.string) # add to current list
        elif t.type == INDENT:
            lastindent = push_new_list()
        elif t.type == DEDENT:
            stack.pop()
    return stack[-1]

Example:

from io import BytesIO
from pprint import pprint
pprint(parse(BytesIO(TXT.encode('utf-8'))), width=20)

Output

['Test1',
 ['NeedHelp',
  ['GotStuck',
   ['Sometime',
    'NoLuck']]],
 ['NeedHelp2',
  ['StillStuck',
   'GoodLuck']]]
like image 169
jfs Avatar answered Mar 24 '26 09:03

jfs