I have a df like so:
import pandas
a=[['1/2/2014', 'a', '6', 'z1'],
   ['1/2/2014', 'a', '3', 'z1'],
   ['1/3/2014', 'c', '1', 'x3'],
   ]
df = pandas.DataFrame.from_records(a[1:],columns=a[0])
I want to flatten the df so it is one continuous list like so:
['1/2/2014', 'a', '6', 'z1', '1/2/2014', 'a', '3', 'z1','1/3/2014', 'c', '1', 'x3']
I can loop through the rows and extend to a list, but is there a much easier way to do it?
The first way to flatten a pandas DataFrame is through the NumPy package. NumPy arrays have a flatten() method (numpy.ndarray.flatten) that performs this task. First convert the DataFrame to a NumPy array with the to_numpy() method, then call flatten() on the result.
flatten() returns a copy of the array collapsed into one dimension. Its order parameter controls whether to flatten in C (row-major) order, Fortran (column-major) order, or to preserve the C/Fortran ordering of the array. The default is 'C'.
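To make the order parameter concrete, here is a minimal sketch on a small made-up array (the array and variable names are only for illustration, they are not from the question):

```python
import numpy as np

# Small 2x3 array, invented purely for illustration.
arr = np.array([[1, 2, 3],
                [4, 5, 6]])

row_major = arr.flatten()           # default order='C': row by row
col_major = arr.flatten(order='F')  # order='F': column by column

print(row_major.tolist())  # [1, 2, 3, 4, 5, 6]
print(col_major.tolist())  # [1, 4, 2, 5, 3, 6]

# flatten() returns a copy, so changing it leaves the original untouched.
scratch = arr.flatten()
scratch[0] = 99
print(arr[0, 0])  # 1
```

For the DataFrame in the question, the default row-major order is the one that gives the row-by-row list being asked for.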
Use it as flatten_col(input, 'B', 'B') in your example. The benefit of this method is that it copies along all other columns as well (unlike some other solutions).
You can use .flatten() on the DataFrame converted to a NumPy array:
df.to_numpy().flatten()
You can also add .tolist() if you want the result to be a Python list.
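Putting those pieces together, a minimal end-to-end sketch might look like the following. The column names are hypothetical (the question does not name them); they are used here so that all three rows are kept as data:

```python
import pandas as pd

rows = [
    ['1/2/2014', 'a', '6', 'z1'],
    ['1/2/2014', 'a', '3', 'z1'],
    ['1/3/2014', 'c', '1', 'x3'],
]

# Hypothetical column names, chosen only for this example.
df = pd.DataFrame(rows, columns=['date', 'label', 'count', 'code'])

# DataFrame -> 2-D NumPy array -> 1-D array (row by row) -> Python list
flat = df.to_numpy().flatten().tolist()
print(flat)
# ['1/2/2014', 'a', '6', 'z1', '1/2/2014', 'a', '3', 'z1', '1/3/2014', 'c', '1', 'x3']
```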
In previous versions of pandas, the values attribute was used instead of the .to_numpy() method, as mentioned in the comments.
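As a quick check, the sketch below compares both spellings on a tiny made-up frame (the data and column names are assumptions for illustration). It assumes a pandas version recent enough to provide to_numpy():

```python
import pandas as pd

# Tiny illustrative frame; the column names are hypothetical.
df = pd.DataFrame([['a', '6'], ['c', '1']], columns=['label', 'count'])

via_values = df.values.flatten().tolist()        # older spelling
via_to_numpy = df.to_numpy().flatten().tolist()  # newer spelling

print(via_values == via_to_numpy)  # True
```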
Maybe use stack?
df.stack().values
array(['1/2/2014', 'a', '3', 'z1', '1/3/2014', 'c', '1', 'x3'], dtype=object)
(Edit: Incidentally, the DF in the Q uses the first row as labels, which is why they're not in the output here.)
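To illustrate that point, this sketch runs stack() on the frame as built in the question, and then on a variant with explicit column names so the first row stays in the data. The column names in the second frame are hypothetical:

```python
import pandas as pd

a = [
    ['1/2/2014', 'a', '6', 'z1'],
    ['1/2/2014', 'a', '3', 'z1'],
    ['1/3/2014', 'c', '1', 'x3'],
]

# As in the question: the first row becomes the column labels,
# so only the last two rows are data.
df_q = pd.DataFrame.from_records(a[1:], columns=a[0])
two_rows = df_q.stack().tolist()
print(two_rows)
# ['1/2/2014', 'a', '3', 'z1', '1/3/2014', 'c', '1', 'x3']

# Variant with hypothetical column names: all three rows are data.
df_all = pd.DataFrame(a, columns=['date', 'label', 'count', 'code'])
three_rows = df_all.stack().tolist()
print(len(three_rows))  # 12
```

Depending on the pandas version, stack() may emit a FutureWarning about its newer implementation; the values returned for this data are the same either way.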