
Boolean matrix from Python's dict of lists

I have a dict of lists, e.g.:

dictionary_test = {'A': ['hello', 'byebye', 'howdy'], 'B': ['bonjour', 'hello', 'ciao'], 'C': ['ciao', 'hello', 'byebye']}

I want to convert it into a boolean affiliation matrix for further analysis, preferably with dict keys as column names and list items as row names:

         A    B    C
  hello  1    1    1
 byebye  1    0    1
  howdy  1    0    0
bonjour  0    1    0
   ciao  0    1    1

Is it possible to do this in Python (preferably so that I could write the matrix to a .csv file)? I would imagine this is something I would have to do with numpy, correct?

An additional problem is that the size of the dictionary is unknown (both the number of keys and the number of elements in lists vary).

Zlo asked Feb 06 '23 04:02



1 Answer

You can use pandas. Here is an example.

>>> import pandas as pd
>>> dictionary_test = {'A': ['hello', 'byebye', 'howdy'], 'B': ['bonjour', 'hello', 'ciao'], 'C': ['ciao', 'hello', 'byebye']}
>>> values = list({x for y in dictionary_test.values() for x in y})
>>> data = {}
>>> for key in dictionary_test:
...     data[key] = [value in dictionary_test[key] for value in values]
...
>>> pd.DataFrame(data, index=values)
             A      B      C
ciao     False   True   True
howdy     True  False  False
bonjour  False   True  False
hello     True   True   True
byebye    True  False   True
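
The question asks for 1/0 entries and a .csv file. Here is a minimal sketch of that last step, assuming pandas is installed. The sorted row order and the 'item' column label are my own choices for illustration, not something the question requires.

```python
import pandas as pd

dictionary_test = {
    'A': ['hello', 'byebye', 'howdy'],
    'B': ['bonjour', 'hello', 'ciao'],
    'C': ['ciao', 'hello', 'byebye'],
}

# Collect every distinct list item; sorted() gives a reproducible row order.
values = sorted({item for items in dictionary_test.values() for item in items})

# One boolean column per key, one row per distinct item.
df = pd.DataFrame(
    {key: [value in items for value in values]
     for key, items in dictionary_test.items()},
    index=values,
)

# Cast True/False to 1/0, as in the table the question shows.
matrix = df.astype(int)

# With no path argument, to_csv returns the CSV text as a string.
# Pass a file name (e.g. 'matrix.csv', a placeholder) to write a file instead.
csv_text = matrix.to_csv(index_label='item')
print(csv_text)
```

Nothing here depends on how many keys or list items the dict has, so it should work for a dictionary of unknown size.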

If you want the rows in a certain order, set values manually.
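
One possible way to get a stable order without typing the list by hand is to keep items in order of first appearance. This sketch assumes Python 3.7+, where dicts preserve insertion order, and uses dict.fromkeys to drop duplicates without shuffling.

```python
import pandas as pd

dictionary_test = {
    'A': ['hello', 'byebye', 'howdy'],
    'B': ['bonjour', 'hello', 'ciao'],
    'C': ['ciao', 'hello', 'byebye'],
}

# dict.fromkeys removes duplicates but keeps first-seen order,
# unlike set(), whose iteration order is arbitrary.
values = list(dict.fromkeys(
    item for items in dictionary_test.values() for item in items
))

# Membership test per (item, key) pair gives the boolean matrix.
data = {key: [value in items for value in values]
        for key, items in dictionary_test.items()}
df = pd.DataFrame(data, index=values)
print(df)
```

For the example dict this gives the rows in the same order as the table in the question (hello, byebye, howdy, bonjour, ciao).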

greedy52 answered Feb 07 '23 16:02
