Create adjacency matrix for two columns in pandas dataframe

Tags: python, pandas, dataframe

I have a dataframe of the form:

index  Name_A  Name_B
  0    Adam    Ben
  1    Chris   David
  2    Adam    Chris
  3    Ben     Chris

I'd like to obtain the adjacency matrix for Name_A and Name_B, i.e.:

      Adam Ben Chris David
Adam   0    1    1     0
Ben    0    0    1     0
Chris  0    0    0     1
David  0    0    0     0

What is the most Pythonic/scalable way of tackling this?

EDIT: Also, I know that if the row (Adam, Ben) is in the dataset, then the row (Ben, Adam) will also appear somewhere else in it.

asked Mar 15 '17 10:03

The Ref

1 Answer

You can use pd.crosstab to count the (Name_A, Name_B) pairs, and then reindex both axes by the union of the column and index values:

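import pandas as pd

# Rows are the Name_A values, columns are the Name_B values
ct = pd.crosstab(df.Name_A, df.Name_B)
print(ct)
Name_B  Ben  Chris  David
Name_A                   
Adam      1      1      0
Ben       0      1      0
Chris     0      0      1

This table is not square yet: Adam never appears in Name_B and David never appears in Name_A, so the Adam column and the David row are missing. Reindexing both axes by the union of all names adds them, filled with 0:

# Every name seen in either column becomes both a row and a column
idx = ct.columns.union(ct.index)
adj = ct.reindex(index=idx, columns=idx, fill_value=0)
print(adj)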
       Adam  Ben  Chris  David
Adam      0    1      1      0
Ben       0    0      1      0
Chris     0    0      0      1
David     0    0      0      0
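
For reference, the two steps above can be wrapped into a self-contained sketch that rebuilds the sample frame from the question. The helper name adjacency_matrix and its source/target parameters are made up for illustration; only pd.crosstab, Index.union and DataFrame.reindex come from the answer. The second half assumes you want to force symmetry yourself by appending the reversed pairs, which is one possible way to handle the case mentioned in the question's EDIT.

```python
import pandas as pd


def adjacency_matrix(df, source="Name_A", target="Name_B"):
    """Square matrix of source -> target pair counts (hypothetical helper)."""
    # Count how often each (source, target) pair occurs
    ct = pd.crosstab(df[source], df[target])
    # Every name seen on either side becomes both a row and a column
    idx = ct.columns.union(ct.index)
    return ct.reindex(index=idx, columns=idx, fill_value=0)


# Sample data from the question
df = pd.DataFrame({
    "Name_A": ["Adam", "Chris", "Adam", "Ben"],
    "Name_B": ["Ben", "David", "Chris", "Chris"],
})
adj = adjacency_matrix(df)
print(adj)

# Assumption: to treat the pairs as undirected, append each pair reversed
reversed_pairs = df.rename(columns={"Name_A": "Name_B", "Name_B": "Name_A"})
both = pd.concat([df, reversed_pairs], ignore_index=True)
sym = adjacency_matrix(both)
print(sym)
```

Note that crosstab counts occurrences, so a pair that appears twice yields 2 rather than 1. If a strict 0/1 matrix is needed, something like adj.clip(upper=1) should cap the counts.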

answered Oct 02 '22 19:10

jezrael
