type(Table)
pandas.core.frame.DataFrame
Table
======= ======= =======
Column1 Column2 Column3
======= ======= =======
0       23      1
1       5       2
1       2       3
1       19      5
2       56      1
2       22      2
3       2       4
3       14      5
4       59      1
5       44      1
5       1       2
5       87      3
======= ======= =======
For anyone familiar with pandas: how would I build a multivalue dictionary with the .groupby() method?
I would like an output to resemble this format:
{
0: [(23,1)]
1: [(5, 2), (2, 3), (19, 5)]
# etc...
}
where Col1 values are the keys and the corresponding Col2 and Col3 values are packed as tuples into a list for each Col1 key.
My syntax works when pooling only one column after the .groupby():
Table.groupby('Column1')['Column2'].apply(list).to_dict()
# Result as expected
{
0: [23],
1: [5, 2, 19],
2: [56, 22],
3: [2, 14],
4: [59],
5: [44, 1, 87]
}
However, selecting multiple columns this way returns the column names as the values instead of the data:
Table.groupby('Column1')[('Column2', 'Column3')].apply(list).to_dict()
# Result has column namespace as array value
{
0: ['Column2', 'Column3'],
1: ['Column2', 'Column3'],
2: ['Column2', 'Column3'],
3: ['Column2', 'Column3'],
4: ['Column2', 'Column3'],
5: ['Column2', 'Column3']
}
How would I return a list of tuples in the value array?
Customize the function you use in apply so it returns a list of lists for each group:
df.groupby('Column1')[['Column2', 'Column3']].apply(lambda g: g.values.tolist()).to_dict()
# {0: [[23, 1]],
# 1: [[5, 2], [2, 3], [19, 5]],
# 2: [[56, 1], [22, 2]],
# 3: [[2, 4], [14, 5]],
# 4: [[59, 1]],
# 5: [[44, 1], [1, 2], [87, 3]]}
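As a side note on why the attempt in the question returned column names: iterating over a DataFrame yields its column labels, so apply(list) on a multi-column group collects the labels rather than the rows. A minimal sketch, using a made-up frame named group that stands in for one group of the question's data:

```python
import pandas as pd

# Hypothetical frame standing in for the Column1 == 1 group of the question's data.
group = pd.DataFrame({"Column2": [5, 2, 19], "Column3": [2, 3, 5]})

# Iterating a DataFrame walks its column labels, so list() collects the names.
labels = list(group)

# Going through the underlying array yields the rows instead.
rows = group.values.tolist()

print(labels)
print(rows)
```

This is why the lambda above reaches for g.values before converting to a list.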
If you need a list of tuples explicitly, use list(map(tuple, ...)) to convert:
df.groupby('Column1')[['Column2', 'Column3']].apply(lambda g: list(map(tuple, g.values.tolist()))).to_dict()
# {0: [(23, 1)],
# 1: [(5, 2), (2, 3), (19, 5)],
# 2: [(56, 1), (22, 2)],
# 3: [(2, 4), (14, 5)],
# 4: [(59, 1)],
# 5: [(44, 1), (1, 2), (87, 3)]}
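For comparison, the same dictionary can probably be built without groupby at all by walking the rows once. The sketch below is only an alternative under stated assumptions, not a replacement for the approach above: it rebuilds the question's Table by hand from the values shown there, and the names result, key, c2 and c3 are illustrative.

```python
from collections import defaultdict

import pandas as pd

# The question's data, re-entered by hand so the example is self-contained.
Table = pd.DataFrame({
    "Column1": [0, 1, 1, 1, 2, 2, 3, 3, 4, 5, 5, 5],
    "Column2": [23, 5, 2, 19, 56, 22, 2, 14, 59, 44, 1, 87],
    "Column3": [1, 2, 3, 5, 1, 2, 4, 5, 1, 1, 2, 3],
})

# Collect (Column2, Column3) tuples under each Column1 key, in row order.
result = defaultdict(list)
columns = ["Column1", "Column2", "Column3"]
for key, c2, c3 in Table[columns].itertuples(index=False, name=None):
    result[key].append((c2, c3))

# Convert to a plain dict so missing keys raise KeyError as usual.
result = dict(result)
print(result)
```

The groupby version is shorter; the loop version may be easier to adapt when the per-row logic gets more involved.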