Extract co-occurrence data from dataframe

Question

I have something like this:

 fromJobtitle         toJobtitle         size
0              CEO                CEO    65
1              CEO     Vice President    23
2              CEO           Employee    56
3   Vice President                CEO   112
4         Employee                CEO    20

I would like to count the number of co-occurrences so that the two directions of each pair are combined (showing only the total between the two job titles, regardless of which one is "from" and which is "to").

Example output:

0              CEO     Vice President   135
1              CEO           Employee    76
2              CEO                CEO    65

Amin Ba · Accepted Answer

import pandas as pd

df = pd.DataFrame({
    'fromJobtitle': ['CEO', 'CEO', 'CEO', 'Vice President', 'Employee'],
    'toJobtitle': ['CEO', 'Vice President', 'Employee', 'CEO', 'CEO'],
    'size': [65, 23, 56, 112, 20]
})

# Sort each pair so that (A, B) and (B, A) map to the same key
df['combination'] = df.apply(
    lambda row: tuple(sorted([row['fromJobtitle'], row['toJobtitle']])),
    axis=1,
)

then:

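# Sum only 'size'; in recent pandas versions a bare .sum() would also concatenate the string columns
df = df.groupby('combination', as_index=False)['size'].sum()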

results:

    combination             size
0   (CEO, CEO)              65
1   (CEO, Employee)         76
2   (CEO, Vice President)   135

finally:

df['from'] = df.apply(lambda row: row['combination'][0], axis=1)
df['to'] = df.apply(lambda row: row['combination'][1], axis=1)
df = df.drop('combination', axis=1)
df.head()

result:

    size    from    to
0   65      CEO     CEO
1   76      CEO     Employee
2   135     CEO     Vice President
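
The same idea can be sketched without a row-wise apply. This is only an illustrative variant, assuming the job titles are plain strings with no missing values; the names pairs, undirected and result are made up for the example. It sorts each pair with NumPy, groups on the two sorted columns, and orders the rows by size as in the question's example output.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'fromJobtitle': ['CEO', 'CEO', 'CEO', 'Vice President', 'Employee'],
    'toJobtitle': ['CEO', 'Vice President', 'Employee', 'CEO', 'CEO'],
    'size': [65, 23, 56, 112, 20],
})

# Sort the two titles within each row so (A, B) and (B, A) become identical.
pairs = np.sort(df[['fromJobtitle', 'toJobtitle']].to_numpy(), axis=1)

# Rebuild a frame with direction-free 'from'/'to' columns.
undirected = pd.DataFrame({
    'from': pairs[:, 0],
    'to': pairs[:, 1],
    'size': df['size'].to_numpy(),
})

# Sum the sizes per undirected pair, largest total first.
result = (
    undirected.groupby(['from', 'to'], as_index=False)['size'].sum()
    .sort_values('size', ascending=False, ignore_index=True)
)
print(result)
```

Grouping on two ordinary columns keeps the output flat, so no tuple column has to be unpacked afterwards.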

Riccardo Bucco · Answer

Try:

df.groupby(lambda x: tuple(sorted(df.loc[x, ['fromJobtitle', 'toJobtitle']])))[['size']].sum()

Here is the result:

                       size
(CEO, CEO)               65
(CEO, Employee)          76
(CEO, Vice President)   135
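
As a cross-check on the totals above, the same aggregation can be sketched with the standard library only. This is not part of either answer; rows and totals are hypothetical names, and the data is copied from the question.

```python
from collections import Counter

# (fromJobtitle, toJobtitle, size) rows copied from the question
rows = [
    ('CEO', 'CEO', 65),
    ('CEO', 'Vice President', 23),
    ('CEO', 'Employee', 56),
    ('Vice President', 'CEO', 112),
    ('Employee', 'CEO', 20),
]

totals = Counter()
for source, target, size in rows:
    # A sorted tuple ignores direction: (A, B) and (B, A) share one key.
    totals[tuple(sorted((source, target)))] += size

# Largest total first, matching the order in the question's example output.
for pair, size in totals.most_common():
    print(pair, size)
```

The three totals agree with the pandas results: 135, 76 and 65.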

Tags: python, pandas, dataframe, find-occurrences

darklight213
