Say I have two dictionaries of the following form:
{'A':[1,2,3,4,5,6,7],
'B':[12,13,14,15,16,17,18} - Belongs to category "M"
{'A':[8,9,10,11,12,13,14],
'B':[18,19,20,21,22,23,24]} - Belongs to category "P"
Now the resulting data frame should be of the form --
Name . Value . Category
A . 1 . M
A . 8 . P
A . 10 . P
B . 12 . M
And so on. How can something like this be achieved?
Here is a more pandorable approach than what is suggested by user3483203. This avoids unnecessary iteration, is faster (for big enough datasets), and more idiomatic.
m = {'A':[1,2,3,4,5,6,7],
'B':[12,13,14,15,16,17,18]}
p = {'A':[8,9,10,11,12,13,14],
'B':[18,19,20,21,22,23,24]}
p_df = pd.DataFrame(p).melt(value_name='value')
m_df = pd.DataFrame(m).melt(value_name='value')
p_df['category'] = 'P'
m_df['category'] = 'M'
result = pd.concat([m_df, p_df], ignore_index=True)
m = {'A': list(range(0, 100_000)), 'B': list(range(100_000, 200_000))}
p = {'A': list(range(200_000, 300_000)), 'B': list(range(300_000, 400_000))}
Here we go:
%%timeit
p_df = pd.DataFrame(p).melt(value_name='value')
m_df = pd.DataFrame(m).melt(value_name='value')
p_df['category'] = 'P'
m_df['category'] = 'M'
result = pd.concat([m_df, p_df], ignore_index=True)
120 ms ± 3.16 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%%timeit
categories = ['M', 'P']
dcts = [m, p]
dfs = [
pd.DataFrame([[k, el, cat] for k, v in dct.items() for el in v])
for dct, cat in zip(dcts, categories)
]
cols = {'columns': {0: 'Name', 1: 'Value', 2: 'Category'}}
result = pd.concat(dfs).reset_index(drop=True).rename(**cols)
207 ms ± 8.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With