Create hierarchy column in pandas

I have a DataFrame like this:

    part part_parent
0  part1         NaN
1  part2       part1
2  part3       part2
3  part4       part3
4  part5       part2

I need to add an additional column hierarchy like this:

    part part_parent                hierarchy
0  part1         NaN                    part1
1  part2       part1             part1/part2/
2  part3       part2       part1/part2/part3/
3  part4       part3  part1/part2/part3/part4
4  part5       part2        part1/part2/part5

Code to create the input and output DataFrames:

import pandas as pd
from numpy import nan

df1 = pd.DataFrame({'part': {0: 'part1', 1: 'part2', 2: 'part3', 3: 'part4', 4: 'part5'},
 'part_parent': {0: nan, 1: 'part1', 2: 'part2', 3: 'part3', 4: 'part2'}})


df2 = pd.DataFrame({'part': {0: 'part1', 1: 'part2', 2: 'part3', 3: 'part4', 4: 'part5'},
 'part_parent': {0: nan, 1: 'part1', 2: 'part2', 3: 'part3', 4: 'part2'},
 'hierarchy': {0: 'part1',
  1: 'part1/part2/',
  2: 'part1/part2/part3/',
  3: 'part1/part2/part3/part4',
  4: 'part1/part2/part5'}})

NOTE: I've seen a couple of threads that use NetworkX to solve this kind of problem, but I haven't been able to make it work.

Any help is appreciated.

How do I create a hierarchical column in pandas?

To make a column the index, use the DataFrame.set_index() method. To index by a single column, pass its name as a string. For multi-level (hierarchical) indexing, pass a list of column names.

How do I create a MultiIndex column in pandas?

To convert MultiIndex (multi-level index) levels back into columns, use DataFrame.reset_index(). With the default drop=False, the index values are kept as columns and the DataFrame gets a new default integer index starting from zero.
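
The two answers above can be sketched as a round trip. This is a minimal illustration only; the sample data and the names `indexed` and `restored` are hypothetical and do not come from the question.

```python
import pandas as pd

# Hypothetical sample data, not taken from the question.
df = pd.DataFrame({
    "region": ["north", "north", "south"],
    "city": ["Oslo", "Bergen", "Rome"],
    "sales": [10, 20, 30],
})

# Passing a list of column names builds a two-level MultiIndex.
indexed = df.set_index(["region", "city"])

# reset_index() moves the index levels back into ordinary columns
# and restores the default integer index (drop=False is the default).
restored = indexed.reset_index()
```

Neither call modifies `df` in place; both return a new DataFrame.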

What is a pandas hierarchical index?

Hierarchical indexing (MultiIndex) is a pandas feature that lets a single axis carry several index levels. pandas is a software library for the Python programming language; its name derives from "panel data", an econometrics term for data sets that include observations over multiple time periods.

How do I sort in ascending order in pandas?

To sort a DataFrame by the values in a single column, use .sort_values(). By default, this returns a new DataFrame sorted in ascending order.
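
A short sketch of that behaviour, using a hypothetical DataFrame with a made-up `qty` column:

```python
import pandas as pd

# Hypothetical sample data, not taken from the question.
df = pd.DataFrame({"part": ["part3", "part1", "part2"], "qty": [5, 9, 1]})

# Ascending is the default; the original DataFrame is left untouched.
by_qty = df.sort_values("qty")

# Pass ascending=False for descending order.
by_qty_desc = df.sort_values("qty", ascending=False)
```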

Here is a solution using networkx. It treats nan as the root node, and finds the shortest path to each node based on that.

import networkx as nx
from numpy import nan

def find_path(net, source, target):
    # Adjust this as needed (in case multiple paths are present)
    # or error handling in case a path doesn't exist
    path = nx.shortest_path(net, source, target)
    return "/".join(list(path)[1:])

net = nx.from_pandas_edgelist(df1, "part", "part_parent")
df1["hierarchy"] = [find_path(net, nan, node) for node in df1["part"]]

    part part_parent                hierarchy
0  part1         NaN                    part1
1  part2       part1              part1/part2
2  part3       part2        part1/part2/part3
3  part4       part3  part1/part2/part3/part4
4  part5       part2        part1/part2/part5

The path formatting is tailored to this example. If you need more robust error handling, or have to deal with multiple possible paths, adjust the path finder accordingly.
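
One possible adjustment, offered as a sketch rather than as part of the answer above: replace the missing parent with a sentinel string so that NaN never becomes a graph node, build a directed graph from parent to child, and return None when a part cannot be reached. The sentinel name `ROOT` and the choice to return None are assumptions made for illustration.

```python
import networkx as nx
import pandas as pd
from numpy import nan

df = pd.DataFrame({
    "part": ["part1", "part2", "part3", "part4", "part5"],
    "part_parent": [nan, "part1", "part2", "part3", "part2"],
})

ROOT = "__root__"  # hypothetical sentinel; must not collide with a real part name

# Replace missing parents with the sentinel so NaN is never used as a node.
edges = df.assign(part_parent=df["part_parent"].fillna(ROOT))

# Directed edges point from parent to child.
net = nx.from_pandas_edgelist(edges, source="part_parent", target="part",
                              create_using=nx.DiGraph)

def find_path(net, target):
    try:
        path = nx.shortest_path(net, ROOT, target)
    except (nx.NetworkXNoPath, nx.NodeNotFound):
        return None  # the part is unknown or not reachable from the root
    return "/".join(path[1:])  # drop the sentinel itself

df["hierarchy"] = [find_path(net, node) for node in df["part"]]
```

In a tree every part has exactly one path from the root, so the shortest path is the hierarchy. If a part could have several parents, this would silently pick one of the shortest routes.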

Here is a recursive approach. It uses a Series that contains the parents for each element to find a given parent and walks back to the original parent until it finds NaN. At this point it returns the hierarchy.

NB. This will not work if the data contains a circular reference (a cycle) or undefined parents (the latter can easily be fixed if needed).

import pandas as pd

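# Map each part to its parent so that lookups are by part name
parents = df1.set_index('part')['part_parent']

def hierarchy(e):
    # Wrap the starting part in a list that accumulates the path
    if not isinstance(e, list):
        return hierarchy([e])
    # Look up the parent of the current top-most ancestor
    parent = parents[e[0]]
    # A NaN parent means the root is reached: join and return the path
    if pd.isna(parent):
        return '/'.join(e)
    # Otherwise prepend the parent and keep walking up
    return hierarchy([parent]+e)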

df2 = df1.copy()
df2['hierarchy'] = df1['part'].apply(hierarchy)
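
The two limitations mentioned in the NB could be handled with an iterative variant. The following is a sketch under stated assumptions, not the answer's own code: the helper name `build_path` is hypothetical, parents are held in a plain dict, a missing value is assumed to be None or a float NaN, and a parent that has no entry of its own is treated as a root.

```python
def build_path(part, parents, sep="/"):
    """Return the root-to-part path by walking up the `parents` mapping."""
    path = [part]
    seen = {part}
    current = part
    while True:
        # An undefined parent (no entry in the mapping) yields None here.
        parent = parents.get(current)
        # Stop at the root: parent is None or NaN (NaN is never equal to itself).
        if parent is None or parent != parent:
            break
        if parent in seen:
            raise ValueError(f"cycle detected at {parent!r}")
        seen.add(parent)
        path.append(parent)
        current = parent
    return sep.join(reversed(path))

# Same parent relationships as in the question, as a plain dict.
parents = {"part1": float("nan"), "part2": "part1", "part3": "part2",
           "part4": "part3", "part5": "part2"}
paths = {p: build_path(p, parents) for p in parents}
```

With the DataFrame from the question, the mapping could be built with `dict(zip(df1['part'], df1['part_parent']))` and applied with `df1['part'].map(lambda p: build_path(p, parents))`. Because it uses a loop, this variant also avoids Python's recursion limit on very deep hierarchies.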

Tags: python, python-3.x, pandas, networkx

Shubham Sharma

2 Answers

user3483203

mozway
