I'm having trouble writing an algo that can read a CSV of employee/managers, and output a directed graph featuring the employee/manager relationships.
My foobar example: given the following CSV file
john,
jill, john
tom, john
tim, jill
felisa, tom
ray, tom
bob, tim
jim, tim
pam, felisa
ben, ray
james, ray
mike, pam
rashad, ben
henry, james
How can I build a DiGraph such that the following organizational structure can be shown:
john
/ \
jill tom
/ / \
tim felisa ray
/ \ / / \
bob jim pam ben james
/ / \
mike rashad henry
Obviously this is a graph problem, but I'm having trouble deciding which structure(s) to use (e.g., would it be best to use a dict
or to build a custom OrganizationalGraph
object, etc). Any help is appreciated.
The language of choice isn't really important (though we can just say Python for simplicity [updated tags accordingly]), I'm more so just trying to understand the fundamentals of this type of problem (i.e., recursion vs. iteration, using set()
to store a manager's direct reports vs using only abstracted data structures). Finally, no, using any package outside of the standard library is a non-starter.
There are a variety of programs you can use to create your own org chart, like SmartDraw, Lucidchart and Microsoft, but Visme is the best program to use. With a variety of org chart templates and tools to choose from, it's easy to create a stunning, easy-to-follow organizational chart in minutes.
Microsoft Excel, PowerPoint, and Outlook all use the same SmartArt tool as Word and so you can use those programs to create org charts, as well. Outside of the Microsoft Office family, you can use Visio, LucidChart, OrgPlus, OrgWeaver, Pingboard, OrgChart4U, and others.
GitMind is a free organizational chart maker that offers an efficient way to create org charts. Choose from its library of templates to instantly build an org chart. Take advantage of unlimited access to a wide range of shapes and icons for customization.
To build the graph we are given the following information:
Note that in your question's example csv
data you seem to have mispelled felisa
as felia
. As a result these outputs aren't for the actual data you put up but rather the corrected version.
First we parse the csv
file, extracting the root and a list of edges:
import csv
with open('data.csv') as f:
f = list(csv.reader(f, skipinitialspace=True))
root, *edges = f
root = root[0]
print(root)
print(edges)
Output:
john
[['jill', 'john'], ['tom', 'john'], ['tim', 'jill'], ['felisa', 'tom'], ['ray', 'tom'], ['bob', 'tim'], ['jim', 'tim'], ['pam', 'felisa'], ['ben', 'ray'], ['james', 'ray'], ['mike', 'pam'], ['rashad', 'ben'], ['henry', 'james']]
We use a defaultdict
from collections
(standard library) to represent a graph. We use the key
in the dictionary to represent the parent / manager, and the value
to represent a list of children / employees:
from collections import defaultdict
graph = defaultdict(list)
for child, parent in edges:
graph[parent].append(child)
print(graph)
Output:
defaultdict(<class 'list'>, {'john': ['jill', 'tom'], 'jill': ['tim'], 'tom': ['felisa', 'ray'], 'tim': ['bob', 'jim'], 'felisa': ['pam'], 'ray': ['ben', 'james'], 'pam': ['mike'], 'ben': ['rashad'], 'james': ['henry']})
This structure lets us get a list of children of a node with graph[node]
. We know that the root of the tree is the node that isn't present in any of the values in any of the lists. We also saved the root earlier.
I took "how can I build a DiGraph such that the following organizational structure can be shown" quite literally. Here's an example of how we can traverse this graph structure to build a string representation:
res = ''
stack = [(root, 0)]
needed_lines = defaultdict(int)
while stack:
node, level = stack.pop()
prev_level = level-4
res += '\n' + ''.join('|' if i in needed_lines else
' ' if i <= level-4 else
'-' for i in range(level)) + node
for child in graph[node]:
stack.append((child, level+4))
needed_lines[level] += len(graph[node])
needed_lines[prev_level] -=1
if needed_lines[prev_level] == 0: del needed_lines[prev_level]
print(res)
Output:
john
|---tom
| |---ray
| | |---james
| | | |---henry
| | |---ben
| | |---rashad
| |---felisa
| |---pam
| |---mike
|---jill
|---tim
|---jim
|---bob
Obviously you are trying to build a tree graph. If you do not know how many direct children a node can have, then the most common representation of a tree is as a collection of node objects. (If you know an upper limit on the number of children per node and most nodes have that many children, then you can represent the tree efficiently in a simple array.)
Each node has 1 parent and a set of children, which is represented by some kind of data container (usually an Object of class Node in an object-oriented language) that contains references or pointers to the parent and children. Typically this is a single variable for the parent reference and an array for the child references. The one node that has no parent is called the root and is stored in a special variable that is used to refer to the whole tree.
You want to arrange the tree so that it is easy to find a node given a name. Going over the various options for that can be a whole Computer Science course so I won't get into it here. You might, in fact, end up storing pointers to nodes in a second, sorted data structure to facilitate finding them quickly.
Then for each input, you find the referenced parent node and add the stated child to it. For example, when processing jill, john
you
jill
john
jill
to john
's child listjill
's parent to john
Does that answer your question?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With