Hey I am working on a very high performance file-managing/analyzing toolkit written in python. I want to create a function that gives me a list or something like that in a tree format. Something like in this question (java-related)
From:
dir/file
dir/dir2/file2
dir/file3
dir3/file4
dir3/file5
Note: the list of paths is unsorted
To:
dir/
    file
    dir2/
        file2
    file3
dir3/
    file4
    file5
[[dir, [file, [dir2, [file2]], file3]], [dir3, [file4, file5]]]
something along those lines. I've been playing around with some ideas but none of them provided the speed that I would like to have.
Note: I do already have the list of paths, so no worrying about that. The function takes paths list and gives tree list.
Thanks in Advance
Now that you clarified the question a bit more, I guess the following is what you want:
from collections import defaultdict
input_ = '''dir/file
dir/dir2/file2
dir/file3
dir2/alpha/beta/gamma/delta
dir2/alpha/beta/gamma/delta/
dir3/file4
dir3/file5'''
FILE_MARKER = '<files>'
def attach(branch, trunk):
    '''
    Insert a branch of directories on its trunk.
    '''
    parts = branch.split('/', 1)
    if len(parts) == 1:  # branch is a file
        trunk[FILE_MARKER].append(parts[0])
    else:
        node, others = parts
        if node not in trunk:
            trunk[node] = defaultdict(dict, ((FILE_MARKER, []),))
        attach(others, trunk[node])
def prettify(d, indent=0):
    '''
    Print the file tree structure with proper indentation.
    '''
    for key, value in d.iteritems():
        if key == FILE_MARKER:
            if value:
                print '  ' * indent + str(value)
        else:
            print '  ' * indent + str(key)
            if isinstance(value, dict):
                prettify(value, indent+1)
            else:
                print '  ' * (indent+1) + str(value)
main_dict = defaultdict(dict, ((FILE_MARKER, []),))
for line in input_.split('\n'):
    attach(line, main_dict)
prettify(main_dict)
It outputs:
dir3
  ['file4', 'file5']
dir2
  alpha
    beta
      gamma
        ['delta']
        delta
          ['']
dir
  dir2
    ['file2']
  ['file', 'file3']
A few thing to note:
main_dict['dir2']['alpha']['beta']....../delta and .../delta/. I thought this was helpful for you to be able to quickly differenciate between your leaf being a directory or a file.I hope this answers your question. If anything is unclear, post a comment.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With