I have a pandas dataframe with a column of lists.
df:
inputs
0 [1, 2, 3]
1 [4, 5, 6]
2 [7, 8, 9]
3 [10, 11, 12]
I need the matrix
array([[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, 12]])
Is there an efficient way to do this?
Note: when I try df.inputs.as_matrix(), the output is
array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]], dtype=object)
which has shape (4,), not (4, 3) as desired.
You can convert the column to a list and then pass it to np.array. If all the lists in the column have the same length, this produces a 2D array:
import numpy as np
arr = np.array(df.inputs.tolist())
#array([[ 1, 2, 3],
# [ 4, 5, 6],
# [ 7, 8, 9],
# [10, 11, 12]])
arr.shape
# (4, 3)
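For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the frame from the question and compares the two shapes. The DataFrame construction is an assumption about how the data was created, and `to_numpy()` is used in place of `as_matrix()`, which newer pandas versions no longer provide.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (assumed construction).
df = pd.DataFrame({"inputs": [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]})

# The column itself is a 1-D object array whose elements are Python lists.
col = df["inputs"].to_numpy()
print(col.shape, col.dtype)

# Converting to a list of lists first lets NumPy infer the 2-D shape.
arr = np.array(df["inputs"].tolist())
print(arr.shape)
```

The intermediate list of lists is what lets `np.array` see the nested structure; the object array only exposes four opaque elements.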
Another option, suggested in a comment by @piRSquared, is to use .values to access the underlying NumPy array first and then convert that to a list. On the example given this is roughly twice as fast, although both calls take only microseconds:
%timeit df.inputs.values.tolist()
# 100000 loops, best of 3: 5.52 µs per loop
%timeit df.inputs.tolist()
# 100000 loops, best of 3: 11.5 µs per loop
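Since the approach only yields a 2D array when every list has the same length, it may be worth checking that up front. The sketch below wraps the `.values.tolist()` route in a small function; `lists_to_matrix` is a hypothetical name introduced here for illustration, not part of pandas or NumPy.

```python
import numpy as np
import pandas as pd


def lists_to_matrix(series):
    """Hypothetical helper: stack a Series of equal-length lists into a 2-D array."""
    rows = series.values.tolist()
    # Collect the distinct row lengths; more than one means the data is ragged.
    lengths = {len(row) for row in rows}
    if len(lengths) > 1:
        raise ValueError(f"rows have differing lengths: {sorted(lengths)}")
    return np.array(rows)


df = pd.DataFrame({"inputs": [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]})
matrix = lists_to_matrix(df["inputs"])
print(matrix.shape)
```

Raising early on ragged input is a design choice made for this sketch; depending on the NumPy version, passing ragged lists straight to `np.array` either raises or falls back to an object array.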