I have a pandas dataframe with a column of lists.
df:
    inputs
0   [1, 2, 3]
1   [4, 5, 6]
2   [7, 8, 9]
3   [10, 11, 12]
I need the matrix
array([[ 1,  2,  3],
      [ 4,  5,  6],
      [ 7,  8,  9],
      [10, 11, 12]])
Is there an efficient way to do this?
Note: When I try df.inputs.as_matrix() the output is
array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]], dtype=object)
which has shape (4,), not (4,3) as desired.
You can convert the column to a list and then pass it to np.array. If all the lists in the column have the same length, this produces a 2D array:
import numpy as np
arr = np.array(df.inputs.tolist())
#array([[ 1,  2,  3],
#       [ 4,  5,  6],
#       [ 7,  8,  9],
#       [10, 11, 12]])
arr.shape
# (4, 3)
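For completeness, here is a minimal self-contained sketch of the same conversion, rebuilding the example frame from the question. The helper name column_to_matrix is hypothetical (it is not part of pandas or NumPy), and the explicit length check is an added assumption: it makes the "same length" precondition fail loudly rather than relying on how np.array treats ragged input.

```python
import numpy as np
import pandas as pd


def column_to_matrix(series):
    """Hypothetical helper: stack a Series of equal-length lists into a 2D array."""
    rows = series.tolist()
    # Collect the distinct row lengths; more than one means the column is ragged.
    lengths = {len(row) for row in rows}
    if len(lengths) > 1:
        raise ValueError(f"rows have differing lengths: {sorted(lengths)}")
    return np.array(rows)


# Rebuild the dataframe from the question.
df = pd.DataFrame({"inputs": [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]})

arr = column_to_matrix(df["inputs"])
print(arr.shape)  # (4, 3)
```

If the column may legitimately hold lists of different lengths, a 2D array is not well defined and the rows would need padding or truncating first.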
Another option is to use .values to access the underlying NumPy array first and then convert it to a list, as suggested by @piRSquared in the comments. This is marginally faster on the example given:
%timeit df.inputs.values.tolist()
# 100000 loops, best of 3: 5.52 µs per loop
%timeit df.inputs.tolist()
# 100000 loops, best of 3: 11.5 µs per loop
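The %timeit lines above only work inside IPython or Jupyter. As a rough sketch, the same comparison can be run as a plain script with the standard-library timeit module. The loop count n is an arbitrary choice, and the timings will vary with the machine and the pandas version, so no particular result is implied.

```python
import timeit

import pandas as pd

# Same example frame as in the question.
df = pd.DataFrame({"inputs": [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]})

n = 10_000  # arbitrary number of repetitions

# Time each variant; timeit.timeit returns the total seconds for n calls.
t_values = timeit.timeit(lambda: df["inputs"].values.tolist(), number=n)
t_series = timeit.timeit(lambda: df["inputs"].tolist(), number=n)

print(f".values.tolist(): {t_values / n * 1e6:.2f} µs per call")
print(f".tolist():        {t_series / n * 1e6:.2f} µs per call")

# Both variants should yield the same list of lists.
same = df["inputs"].values.tolist() == df["inputs"].tolist()
print(same)
```

On a column this small the difference is a few microseconds either way, so it is worth measuring on data of realistic size before choosing one form over the other.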