I have a DataFrame as follows:
In [23]: df = pandas.DataFrame({'Initial': ['C','A','M'], 'Sex': ['M', 'F', 'F'], 'Age': [49, 39, 19]})
df = df[['Initial', 'Sex', 'Age']]
df
Out[23]:
  Initial Sex  Age
0       C   M   49
1       A   F   39
2       M   F   19
My goal is to create a dict like this:
{'C': ('49', 'M'), 'A': ('39', 'F'), 'M': ('19', 'F')}
Currently, I'm doing it like this:
In [24]: members = df.set_index('Initial', drop=True).to_dict('index')
members
Out[24]: {'C': {'Age': '49', 'Sex': 'M'}, 'A': {'Age': '39', 'Sex': 'F'}, 'M': {'Age': '19', 'Sex': 'F'}}
Then I use a dict comprehension to convert each value from a dict to a tuple:
In [25]: members = {x: tuple(y.values()) for x, y in members.items()}
members
Out[25]: {'C': ('49', 'M'), 'A': ('39', 'F'), 'M': ('19', 'F')}
My question is: is there a way to get a dict in the format I want directly from a pandas DataFrame, without incurring the additional overhead of the dict comprehension?
To convert a pandas DataFrame to a dictionary, use the to_dict() method. Its orient parameter defaults to 'dict', which returns the data in the format {column -> {index -> value}}.
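As a rough illustration of how the orient argument changes the shape of the result, here is a minimal sketch using the DataFrame from the question. The variable names (by_column, by_row, as_lists) are made up for this example.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({'Initial': ['C', 'A', 'M'],
                   'Sex': ['M', 'F', 'F'],
                   'Age': [49, 39, 19]})

# Default orient ('dict'): {column -> {index -> value}}
by_column = df.to_dict()

# orient='index': {index -> {column -> value}}
by_row = df.set_index('Initial').to_dict('index')

# orient='list': {column -> [values]}
as_lists = df.to_dict('list')

print(by_column['Age'])
print(by_row['C'])
print(as_lists['Sex'])
```

None of these orients produces tuples directly, which is why the question ends up needing an extra conversion step.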
This should work:
df.set_index('Initial')[['Age', 'Sex']].T.apply(tuple).to_dict()
{'A': (39, 'F'), 'C': (49, 'M'), 'M': (19, 'F')}
If lists instead of tuples are okay, then you could use:
In [45]: df.set_index('Initial')[['Age','Sex']].T.to_dict('list')
Out[45]: {'A': [39, 'F'], 'C': [49, 'M'], 'M': [19, 'F']}
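If tuples are required and the apply-based one-liner does not return tuples in your pandas version, another option is to skip to_dict() and build the dict with plain zip. This is only a sketch under the assumption that the 'Initial' values are unique (duplicates would overwrite earlier entries); the name members is taken from the question.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({'Initial': ['C', 'A', 'M'],
                   'Sex': ['M', 'F', 'F'],
                   'Age': [49, 39, 19]})

# Pair each initial with an (age, sex) tuple.
# The inner zip builds the tuples; the outer zip pairs them with the keys.
members = dict(zip(df['Initial'], zip(df['Age'], df['Sex'])))

print(members)
```

Note that the ages stay integers here. If string ages are wanted, as in the question's expected output, the 'Age' column could be converted first with df['Age'].astype(str).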