We can add the different size of list values to DataFrame.
You can convert a dictionary to Pandas Dataframe using df = pd. DataFrame. from_dict(my_dict) statement.
We can create a DataFrame from dictionary using DataFrame. from_dict() function too i.e. It accepts a dictionary and orientation too. By default orientation is columns it means keys in dictionary will be used as columns while creating DataFrame.
pandas. DataFrame. from_dict() can be used to create a pandas DataFrame from Dict (Dictionary) object. This method takes parameters data , orient , dtype , columns and returns a DataFrame.
In Python 3.x:
import pandas as pd
import numpy as np
d = dict( A = np.array([1,2]), B = np.array([1,2,3,4]) )
pd.DataFrame(dict([ (k,pd.Series(v)) for k,v in d.items() ]))
Out[7]:
A B
0 1 1
1 2 2
2 NaN 3
3 NaN 4
In Python 2.x:
replace d.items()
with d.iteritems()
.
Here's a simple way to do that:
In[20]: my_dict = dict( A = np.array([1,2]), B = np.array([1,2,3,4]) )
In[21]: df = pd.DataFrame.from_dict(my_dict, orient='index')
In[22]: df
Out[22]:
0 1 2 3
A 1 2 NaN NaN
B 1 2 3 4
In[23]: df.transpose()
Out[23]:
A B
0 1 1
1 2 2
2 NaN 3
3 NaN 4
A way of tidying up your syntax, but still do essentially the same thing as these other answers, is below:
>>> mydict = {'one': [1,2,3], 2: [4,5,6,7], 3: 8}
>>> dict_df = pd.DataFrame({ key:pd.Series(value) for key, value in mydict.items() })
>>> dict_df
one 2 3
0 1.0 4 8.0
1 2.0 5 NaN
2 3.0 6 NaN
3 NaN 7 NaN
A similar syntax exists for lists, too:
>>> mylist = [ [1,2,3], [4,5], 6 ]
>>> list_df = pd.DataFrame([ pd.Series(value) for value in mylist ])
>>> list_df
0 1 2
0 1.0 2.0 3.0
1 4.0 5.0 NaN
2 6.0 NaN NaN
Another syntax for lists is:
>>> mylist = [ [1,2,3], [4,5], 6 ]
>>> list_df = pd.DataFrame({ i:pd.Series(value) for i, value in enumerate(mylist) })
>>> list_df
0 1 2
0 1 4.0 6.0
1 2 5.0 NaN
2 3 NaN NaN
You may additionally have to transpose the result and/or change the column data types (float, integer, etc).
While this does not directly answer the OP's question. I found this to be an excellent solution for my case when I had unequal arrays and I'd like to share:
from pandas documentation
In [31]: d = {'one' : Series([1., 2., 3.], index=['a', 'b', 'c']),
....: 'two' : Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])}
....:
In [32]: df = DataFrame(d)
In [33]: df
Out[33]:
one two
a 1 1
b 2 2
c 3 3
d NaN 4
pandas.DataFrame
and pandas.concat
list
of DataFrames
with pandas.DataFrame
, from a dict
of uneven arrays
, and then concat
the arrays together in a list-comprehension.
DataFrame
of arrays
, that are not equal in length.arrays
, use df = pd.DataFrame({'x1': x1, 'x2': x2, 'x3': x3})
import pandas as pd
import numpy as np
# create the uneven arrays
mu, sigma = 200, 25
np.random.seed(365)
x1 = mu + sigma * np.random.randn(10, 1)
x2 = mu + sigma * np.random.randn(15, 1)
x3 = mu + sigma * np.random.randn(20, 1)
data = {'x1': x1, 'x2': x2, 'x3': x3}
# create the dataframe
df = pd.concat([pd.DataFrame(v, columns=[k]) for k, v in data.items()], axis=1)
pandas.DataFrame
and itertools.zip_longest
zip_longest
fills missing values with the fillvalue
.DataFrame
constructor won't unpack it.from itertools import zip_longest
# zip all the values together
zl = list(zip_longest(*data.values()))
# create dataframe
df = pd.DataFrame(zl, columns=data.keys())
df.plot(marker='o', figsize=[10, 5])
x1 x2 x3
0 232.06900 235.92577 173.19476
1 176.94349 209.26802 186.09590
2 194.18474 168.36006 194.36712
3 196.55705 238.79899 218.33316
4 249.25695 167.91326 191.62559
5 215.25377 214.85430 230.95119
6 232.68784 240.30358 196.72593
7 212.43409 201.15896 187.96484
8 188.97014 187.59007 164.78436
9 196.82937 252.67682 196.47132
10 NaN 223.32571 208.43823
11 NaN 209.50658 209.83761
12 NaN 215.27461 249.06087
13 NaN 210.52486 158.65781
14 NaN 193.53504 199.10456
15 NaN NaN 186.19700
16 NaN NaN 223.02479
17 NaN NaN 185.68525
18 NaN NaN 213.41414
19 NaN NaN 271.75376
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With