Suppose I have a dataframe:
   col1  col2  col3
0   1.0   5.0   2.0
1   7.0  13.0   NaN
2   9.0   1.0   NaN
3   NaN   7.0   NaN
How do I convert it to a single list, such as:
[1, 7, 9, 5, 13, 1, 7]
I have tried:
df.values.tolist()
However this returns a list of lists rather than a single list:
[[1.0, 5.0, 2.0], [7.0, 13.0, nan], [9.0, 1.0, nan], [nan, 7.0, nan]]
Note the dataframe will contain an unknown number of columns. The order of the values is not important so long as the list contains all values in the dataframe.
I imagine I could write a function to unpack the values, however I'm wondering if there is a simple built-in way of converting a dataframe to a series/list?
Following your current approach, you can flatten the array before converting it to a list. If you need to drop NaN values, you can do that after flattening as well:
import numpy as np
arr = df.to_numpy().flatten()
arr[~np.isnan(arr)].tolist()  # tolist() yields plain Python floats rather than NumPy scalars
Also, the pandas documentation recommends to_numpy() over the values attribute.
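As a self-contained sketch of the flatten-then-filter approach, the snippet below rebuilds the dataframe from the question (the construction of `df` is an assumption based on the printed output above) and produces a single flat list:

```python
import numpy as np
import pandas as pd

# Dataframe reconstructed from the question; missing cells are NaN.
df = pd.DataFrame({
    "col1": [1, 7, 9, np.nan],
    "col2": [5, 13, 1, 7],
    "col3": [2, np.nan, np.nan, np.nan],
})

# flatten() returns a 1-D copy in row-major order (row 0 first, then row 1, ...).
arr = df.to_numpy().flatten()

# Boolean mask keeps only the non-NaN entries; tolist() gives Python floats.
flat = arr[~np.isnan(arr)].tolist()
print(flat)  # [1.0, 5.0, 2.0, 7.0, 13.0, 9.0, 1.0, 7.0]
```

Note that np.isnan only works on numeric arrays, so this version assumes every column is numeric.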
An alternative, perhaps cleaner, approach is to stack the dataframe, which collapses all columns into a single Series:
df.stack().tolist()
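Whether stack drops NaN on its own depends on the pandas version: the legacy implementation drops them by default, while the newer implementation (opted into with future_stack=True, and planned as the default in pandas 3.0) keeps them. A minimal sketch that does not rely on either behaviour, using the question's dataframe (reconstructed here as an assumption), chains an explicit dropna:

```python
import numpy as np
import pandas as pd

# Dataframe reconstructed from the question; missing cells are NaN.
df = pd.DataFrame({
    "col1": [1, 7, 9, np.nan],
    "col2": [5, 13, 1, 7],
    "col3": [2, np.nan, np.nan, np.nan],
})

# stack() pivots the columns into the index, giving one long Series.
# The explicit dropna() removes NaN regardless of which stack
# implementation the installed pandas version uses.
flat = df.stack().dropna().tolist()
print(flat)
```

The explicit dropna is a no-op when stack has already removed the missing values, so it is safe to leave in.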
You can use DataFrame.stack:
In [12]: df = pd.DataFrame({"col1":[np.nan,3,4,np.nan], "col2":['test',np.nan,45,3]})
In [13]: df.stack().tolist()
Out[13]: ['test', 3.0, 4.0, 45, 3]
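For a mixed-type dataframe like this one, the NumPy route needs a small change: np.isnan raises a TypeError on object arrays, so a sketch of the flatten approach would use pd.isna for the mask instead. The variable names here are illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "col1": [np.nan, 3, 4, np.nan],
    "col2": ["test", np.nan, 45, 3],
})

# Mixing a float column with an object column yields an object-dtype array.
arr = df.to_numpy().ravel()

# pd.isna handles object arrays element by element, unlike np.isnan.
flat = arr[~pd.isna(arr)].tolist()
print(flat)  # ['test', 3.0, 4.0, 45, 3]
```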