I have three lists as follows:
aa = ['aa1', 'aa2', 'aa3', 'aa4', 'aa5']
bb = ['bb1', 'bb2', 'bb3', 'bb4']
cc = ['cc1', 'cc2', 'cc3']
These are then combined into a nested list:
nest = [aa, bb, cc]
I want to create a dataframe as follows:
aa bb cc
aa1 bb1 cc1
aa2 bb2 cc2
aa3 bb3 cc3
aa4 bb4 nan
aa5 nan nan
I've tried:
pd.DataFrame(nest, columns=['aa', 'bb', 'cc'])
But the result is that each list is written as a row rather than as a column.
The zip_longest function from itertools does this:
>>> import itertools, pandas
>>> pandas.DataFrame((_ for _ in itertools.zip_longest(*nest)), columns=['aa', 'bb', 'cc'])
aa bb cc
0 aa1 bb1 cc1
1 aa2 bb2 cc2
2 aa3 bb3 cc3
3 aa4 bb4 None
4 aa5 None None
If you have an older version of pandas you may need to wrap zip_longest in a list constructor. On older Python you may need to call izip_longest instead of zip_longest.
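If you want NaN rather than None in the missing cells, as in the desired output in the question, zip_longest takes a fillvalue argument. A minimal sketch, assuming numpy is installed and reusing the lists from the question (the names rows and df are only for illustration):

```python
import itertools

import numpy as np
import pandas as pd

aa = ['aa1', 'aa2', 'aa3', 'aa4', 'aa5']
bb = ['bb1', 'bb2', 'bb3', 'bb4']
cc = ['cc1', 'cc2', 'cc3']
nest = [aa, bb, cc]

# zip_longest yields one tuple per output row and pads the shorter
# lists with fillvalue (None when fillvalue is not given).
rows = list(itertools.zip_longest(*nest, fillvalue=np.nan))

# Each tuple becomes a row, so the original lists end up as columns.
df = pd.DataFrame(rows, columns=['aa', 'bb', 'cc'])
print(df)
```

Wrapping the iterator in list() also covers the older pandas case mentioned above.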
Option 1
pd.DataFrame(nest, ['aa', 'bb', 'cc']).T
aa bb cc
0 aa1 bb1 cc1
1 aa2 bb2 cc2
2 aa3 bb3 cc3
3 aa4 bb4 None
4 aa5 None None
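Why Option 1 works, as far as I can tell: pandas accepts the ragged nested list and pads the shorter rows with missing values, the second positional argument is the index (so the list names become row labels), and .T then swaps rows and columns. A sketch of the same idea with the keyword spelled out and the intermediate frame kept (wide and df are just illustrative names):

```python
import pandas as pd

aa = ['aa1', 'aa2', 'aa3', 'aa4', 'aa5']
bb = ['bb1', 'bb2', 'bb3', 'bb4']
cc = ['cc1', 'cc2', 'cc3']
nest = [aa, bb, cc]

# One row per list; shorter rows are padded with missing values.
wide = pd.DataFrame(nest, index=['aa', 'bb', 'cc'])

# Transpose so that the row labels become the column names.
df = wide.T
print(df)
```

Passing index= by keyword makes it clearer that the names label the rows before the transpose, not the columns.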
Option 2
Homebrew zip_longest
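# Return x[n], or None when the list x is too short
f = lambda x, n: x[n] if n < len(x) else None
# n is the length of the longest list, i.e. the number of rows
n = max(map(len, nest))
pd.DataFrame(
    [[f(j, i) for j in nest] for i in range(n)],
    columns=['aa', 'bb', 'cc']
)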
aa bb cc
0 aa1 bb1 cc1
1 aa2 bb2 cc2
2 aa3 bb3 cc3
3 aa4 bb4 None
4 aa5 None None