I have a dataframe similar to this (but much larger).
>>> df = pd.DataFrame([ [ 'a', np.array([ 1, 2]) ], [ 'b', np.array([ 3, 4 ]) ] ])
   0       1
0  a  [1, 2]
1  b  [3, 4]
The last column has the shape listed as...
>>> df[1].shape
(2,)
I'd like it to be listed as (2, 2). I was able to do this via the following line, but the performance of tolist() on a large dataframe is... bad.
>>> np.array(df[1].tolist()).shape
(2, 2)
It could also be a Pandas dataframe as long as it correctly reports the shape. Any other suggestions?
This is not possible!
Pandas keeps each Series as a one-dimensional ndarray. If you try to squeeze multi-dimensional data into a single column, pandas stores it as a one-dimensional array with dtype object, where each element is itself an array. That is why df[1].shape reports (2,) rather than (2, 2).
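To make that concrete, here is a small check using the dataframe from the question. It only inspects the column; the variable name `col` is mine, not from the original post.

```python
import numpy as np
import pandas as pd

# Same toy frame as in the question.
df = pd.DataFrame([['a', np.array([1, 2])], ['b', np.array([3, 4])]])

col = df[1]
print(col.dtype)           # object: each cell holds a Python object
print(col.ndim)            # 1: a Series is always one-dimensional
print(col.shape)           # (2,)
print(type(col.iloc[0]))   # each element is its own ndarray
print(col.iloc[0].shape)   # (2,)
```

So the inner arrays keep their own shape, but the column only knows that it holds two objects.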
If you simply want to get the contents and turn them into a two-dimensional array, then I'd suggest:
np.array(df[1].values.tolist())
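If the tolist() step is the part you want to avoid, one alternative worth trying is np.stack, which joins the inner arrays directly. This is only a sketch: it assumes every row holds an array of the same length, and I have not measured whether it is actually faster on your data, so time both on a realistic sample.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([['a', np.array([1, 2])], ['b', np.array([3, 4])]])

# Stack the per-row arrays along a new first axis.
# Assumes all inner arrays have the same shape; np.stack raises otherwise.
arr = np.stack(df[1].to_numpy())

print(arr.shape)  # (2, 2)
print(arr)
```

The result is a regular numeric 2-D array rather than an object array, so the usual vectorized NumPy operations work on it.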
Otherwise, I'd suggest you keep them in two different columns.
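A rough sketch of the separate-columns approach, again assuming equal-length inner arrays. The column names `label`, `x0` and `x1` are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([['a', np.array([1, 2])], ['b', np.array([3, 4])]])

# Expand the array column into one real column per element.
values = np.stack(df[1].to_numpy())
wide = pd.DataFrame(values, index=df.index, columns=['x0', 'x1'])
wide.insert(0, 'label', df[0])  # hypothetical name for the first column

print(wide)
print(wide[['x0', 'x1']].shape)  # (2, 2)
```

Once the data lives in ordinary numeric columns, both the dataframe slice and wide[['x0', 'x1']].to_numpy() report the shape you were after, with no conversion needed each time.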