Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Build graph of organizational structure

I'm having trouble writing an algo that can read a CSV of employee/managers, and output a directed graph featuring the employee/manager relationships.

My foobar example: given the following CSV file

john,
jill, john
tom, john
tim, jill
felisa, tom
ray, tom
bob, tim
jim, tim
pam, felisa
ben, ray
james, ray
mike, pam
rashad, ben
henry, james

How can I build a DiGraph such that the following organizational structure can be shown:

         john
        /    \
     jill    tom
     /       /  \
   tim   felisa  ray
 /  \      /     /  \
bob jim   pam  ben  james
          /     /       \
        mike   rashad    henry

Obviously this is a graph problem, but I'm having trouble deciding which structure(s) to use (e.g., would it be best to use a dict or to build a custom OrganizationalGraph object, etc). Any help is appreciated.

The language of choice isn't really important (though we can just say Python for simplicity [updated tags accordingly]), I'm more so just trying to understand the fundamentals of this type of problem (i.e., recursion vs. iteration, using set() to store a manager's direct reports vs using only abstracted data structures). Finally, no, using any package outside of the standard library is a non-starter.

like image 875
semore_1267 Avatar asked Feb 05 '19 07:02

semore_1267


People also ask

How do I make an organizational chart for free?

There are a variety of programs you can use to create your own org chart, like SmartDraw, Lucidchart and Microsoft, but Visme is the best program to use. With a variety of org chart templates and tools to choose from, it's easy to create a stunning, easy-to-follow organizational chart in minutes.

What is the easiest program to create an organizational chart?

Microsoft Excel, PowerPoint, and Outlook all use the same SmartArt tool as Word and so you can use those programs to create org charts, as well. Outside of the Microsoft Office family, you can use Visio, LucidChart, OrgPlus, OrgWeaver, Pingboard, OrgChart4U, and others.

What is the best program to create an organizational chart free?

GitMind is a free organizational chart maker that offers an efficient way to create org charts. Choose from its library of templates to instantly build an org chart. Take advantage of unlimited access to a wide range of shapes and icons for customization.


2 Answers

To build the graph we are given the following information:

  1. The root (in this case John)
  2. A list of edges in the form (child, parent)
  3. Each node has a maximum of two children (implied from your example, however the code below works for any node having an arbitrary amount of children)

Note that in your question's example csv data you seem to have mispelled felisa as felia. As a result these outputs aren't for the actual data you put up but rather the corrected version. First we parse the csv file, extracting the root and a list of edges:

import csv

with open('data.csv') as f:
    f = list(csv.reader(f, skipinitialspace=True))
    root, *edges = f
    root = root[0]

print(root)
print(edges)

Output:

john 
[['jill', 'john'], ['tom', 'john'], ['tim', 'jill'], ['felisa', 'tom'], ['ray', 'tom'], ['bob', 'tim'], ['jim', 'tim'], ['pam', 'felisa'], ['ben', 'ray'], ['james', 'ray'], ['mike', 'pam'], ['rashad', 'ben'], ['henry', 'james']]

We use a defaultdict from collections (standard library) to represent a graph. We use the key in the dictionary to represent the parent / manager, and the value to represent a list of children / employees:

from collections import defaultdict

graph = defaultdict(list)
for child, parent in edges:
    graph[parent].append(child)

print(graph)

Output:

defaultdict(<class 'list'>, {'john': ['jill', 'tom'], 'jill': ['tim'], 'tom': ['felisa', 'ray'], 'tim': ['bob', 'jim'], 'felisa': ['pam'], 'ray': ['ben', 'james'], 'pam': ['mike'], 'ben': ['rashad'], 'james': ['henry']})

This structure lets us get a list of children of a node with graph[node]. We know that the root of the tree is the node that isn't present in any of the values in any of the lists. We also saved the root earlier.

I took "how can I build a DiGraph such that the following organizational structure can be shown" quite literally. Here's an example of how we can traverse this graph structure to build a string representation:

res = ''
stack = [(root, 0)]
needed_lines = defaultdict(int)

while stack:
    node, level = stack.pop()
    prev_level = level-4
    res += '\n' + ''.join('|' if i in needed_lines else
                          ' ' if i <= level-4 else
                          '-' for i in range(level)) + node
    for child in graph[node]:
        stack.append((child, level+4))

    needed_lines[level] += len(graph[node])
    needed_lines[prev_level] -=1
    if needed_lines[prev_level] == 0: del needed_lines[prev_level]

print(res)

Output:

john
|---tom
|   |---ray
|   |   |---james
|   |   |   |---henry
|   |   |---ben
|   |       |---rashad
|   |---felisa
|       |---pam
|           |---mike
|---jill
    |---tim
        |---jim
        |---bob
like image 162
Primusa Avatar answered Oct 19 '22 08:10

Primusa


Obviously you are trying to build a tree graph. If you do not know how many direct children a node can have, then the most common representation of a tree is as a collection of node objects. (If you know an upper limit on the number of children per node and most nodes have that many children, then you can represent the tree efficiently in a simple array.)

Each node has 1 parent and a set of children, which is represented by some kind of data container (usually an Object of class Node in an object-oriented language) that contains references or pointers to the parent and children. Typically this is a single variable for the parent reference and an array for the child references. The one node that has no parent is called the root and is stored in a special variable that is used to refer to the whole tree.

You want to arrange the tree so that it is easy to find a node given a name. Going over the various options for that can be a whole Computer Science course so I won't get into it here. You might, in fact, end up storing pointers to nodes in a second, sorted data structure to facilitate finding them quickly.

Then for each input, you find the referenced parent node and add the stated child to it. For example, when processing jill, john you

  • Create a new node named jill
  • Find the node named john
  • Add jill to john's child list
  • Set jill's parent to john

Does that answer your question?

like image 1
Old Pro Avatar answered Oct 19 '22 09:10

Old Pro