I am new to Python and am having trouble converting a row in a DataFrame into a flat list. To do this I use the following code:
Toy DataFrame:

    import pandas as pd

    d = {
        "a": [1, 2, 3, 4, 5],
        "b": [9, 8, 7, 6, 5],
        "n": ["a", "b", "c", "d", "e"]
    }
    df = pd.DataFrame(d)
My code:
    from functools import reduce  # reduce is not a builtin in Python 3

    df_note = df.loc[df.n == "d", ["a", "b"]].values  # convert to array
    df_note = df_note.tolist()  # convert to nested list
    df_note = reduce(lambda x, y: x + y, df_note)  # convert to flat list
To me this code appears to be both gross and inefficient. The fact that I convert to an array before a list is what is causing the problem, i.e. the list ends up nested. That notwithstanding, I cannot find a means of converting the row directly to a list. Any advice?
This question is not a dupe of this. In my case, I want the list to be flat.
To convert a whole DataFrame to a list, use df.values.tolist(). Note that values is an attribute, not a method, and the result is a nested list with one inner list per row:

    rows = df.values.tolist()  # nested list, one inner list per row
A general solution (less specific to the example) is df.loc[index, :].values.flatten().tolist(), where index is the index label of the DataFrame row you want to convert.
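The general solution can be wrapped in a small function. This is only a sketch: the name row_to_list and its parameters are hypothetical, chosen for illustration, and the toy DataFrame is the one from the question.

```python
import pandas as pd


def row_to_list(frame, label):
    """Return the row with the given index label as a flat Python list.

    Hypothetical helper, not part of pandas.
    """
    # .loc with a single label returns a Series; .values is a 1-D array,
    # so flatten() is a no-op safeguard here rather than a requirement.
    return frame.loc[label, :].values.flatten().tolist()


df = pd.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": [9, 8, 7, 6, 5],
    "n": ["a", "b", "c", "d", "e"],
})

row = row_to_list(df, 3)  # the row whose "n" value is "d"
print(row)
```

Because the row mixes integers and a string, the resulting list holds values of mixed types.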
You are almost there. Just use flatten instead of reduce to unnest the array (instead of unnesting the list), and chain the operations into a one-liner:
    df.loc[df.n == "d", ["a", "b"]].values.flatten().tolist()
    # [4, 6]
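For completeness, here is the one-liner as a self-contained script on the toy DataFrame from the question. The to_numpy().ravel() variant and the two-row selection are additions for illustration, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": [9, 8, 7, 6, 5],
    "n": ["a", "b", "c", "d", "e"],
})

# The one-liner from the answer: 2-D array -> 1-D array -> list.
flat = df.loc[df.n == "d", ["a", "b"]].values.flatten().tolist()

# An alternative spelling (assumes pandas 0.24 or later for to_numpy).
flat_alt = df.loc[df.n == "d", ["a", "b"]].to_numpy().ravel().tolist()

# If the mask matches several rows, flatten concatenates them row by row.
flat_two = df.loc[df.n.isin(["d", "e"]), ["a", "b"]].values.flatten().tolist()

print(flat)      # [4, 6]
print(flat_alt)  # [4, 6]
print(flat_two)  # [4, 6, 5, 5]
```

The last case is worth keeping in mind: the result is only "one row as a list" when the mask matches exactly one row.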
You get a nested list because you select a sub-DataFrame, which is two-dimensional.
Selecting a single row by its index label instead returns a one-dimensional Series, which can be converted to a list without flattening:
    df.loc[0, :].values.tolist()
    # [1, 9, 'a']
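The difference between the two selections can be checked directly. The sketch below uses the toy DataFrame from the question; the variable names sub_frame and single_row are chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": [9, 8, 7, 6, 5],
    "n": ["a", "b", "c", "d", "e"],
})

# A boolean mask keeps the result two-dimensional, even for one matching row.
sub_frame = df.loc[df.n == "d", ["a", "b"]]

# A single index label returns a one-dimensional Series.
single_row = df.loc[3, ["a", "b"]]

print(type(sub_frame).__name__, sub_frame.ndim)    # DataFrame 2
print(type(single_row).__name__, single_row.ndim)  # Series 1

nested = sub_frame.values.tolist()  # one inner list per selected row
flat = single_row.tolist()          # already flat
```

This only helps when the row's index label is known. When the row is found by a condition, as in the question, flattening or indexing the nested result is still needed.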
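How about taking the first element of the nested list (here df_note is the selected sub-DataFrame, before .values is applied):

    df_note.values.tolist()[0]
    # [4, 6]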
The values are stored in a NumPy array, so you do not convert them. Pandas uses a lot of NumPy under the hood. The attribute access df_note.values is just a different name for part of the data frame.
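A quick way to see this on the toy DataFrame from the question is sketched below. It assumes df_note is the sub-DataFrame selected with the boolean mask; the iloc variant is an addition for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": [9, 8, 7, 6, 5],
    "n": ["a", "b", "c", "d", "e"],
})

df_note = df.loc[df.n == "d", ["a", "b"]]

# .values exposes the selected data as a NumPy array of shape (rows, columns).
arr = df_note.values
print(isinstance(arr, np.ndarray), arr.shape)  # True (1, 2)

# Index into the nested list, as in the answer above.
first_by_index = arr.tolist()[0]

# Or pick the first row positionally before converting.
first_by_iloc = df_note.iloc[0].tolist()

print(first_by_index)  # [4, 6]
print(first_by_iloc)   # [4, 6]
```

Both variants raise an IndexError if no row matches the condition, so a length check may be worthwhile in real code.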