I have a python pandas dataframe df like this:
a b
1 3
3 6
5 7
6 4
7 8
I want to convert it to a list of tuples:
[(1,3),(3,6),(5,7),(6,4),(7,8)]
Thanks.
Converting each row into a tuple and appending the rows to a list gives the data as a list of tuples. A related example, converting an RDD rather than a pandas DataFrame, uses the map() and tuple() functions to turn each row of the RDD into a tuple.
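A minimal sketch of the row-by-row approach for pandas, assuming the DataFrame from the question. Using itertuples(index=False, name=None) to get each row as a plain tuple is a choice made here for illustration; the snippet above does not say how the rows are produced.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 3, 5, 6, 7], "b": [3, 6, 7, 4, 8]})

rows = []
# index=False drops the index; name=None yields plain tuples instead of namedtuples
for row in df.itertuples(index=False, name=None):
    rows.append(row)

print(rows)
```

The loop is only there to mirror the "append each row" description; list(df.itertuples(index=False, name=None)) should give the same list in one step.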
Using to_numpy().tolist(), we can easily convert a pandas DataFrame into a list of lists, built from either each row or each column. The example first creates a list of tuples and then builds a DataFrame object 'new_val' from it.
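A short sketch of what that could look like. The name new_val comes from the text above; the data is borrowed from the question, and transposing with .T to get per-column lists is an assumption about what "each column" means here.

```python
import pandas as pd

# Build the DataFrame from a list of tuples (data taken from the question)
data = [(1, 3), (3, 6), (5, 7), (6, 4), (7, 8)]
new_val = pd.DataFrame(data, columns=["a", "b"])

# One inner list per row
by_row = new_val.to_numpy().tolist()

# One inner list per column (transpose the array first)
by_column = new_val.to_numpy().T.tolist()

print(by_row)
print(by_column)
```

Note that this yields lists of lists, not tuples; wrapping each inner list in tuple() gives the format asked for in the question.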
At this point the original DataFrame has been converted into a list. Now let's say that you'd like to convert only the 'Product' column into a list.
A DataFrame is a 2D data structure that holds data in a tabular format of rows and columns. We can perform operations on rows and columns such as selecting, deleting, adding, and renaming, which makes data analysis with a DataFrame easy.
If performance is important, use a list comprehension:
[tuple(r) for r in df.to_numpy()]
# [(1, 3), (3, 6), (5, 7), (6, 4), (7, 8)]
Note: For pandas < 0.24, please use df.values instead.
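If the same code has to run on both older and newer pandas versions, one possible sketch is to check for to_numpy at runtime. The hasattr check is an illustration added here, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 3, 5, 6, 7], "b": [3, 6, 7, 4, 8]})

# to_numpy() exists from pandas 0.24 on; fall back to .values otherwise
arr = df.to_numpy() if hasattr(df, "to_numpy") else df.values

result = [tuple(r) for r in arr]
```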
You may find even better performance if you iterate over lists instead of the numpy array:
[tuple(r) for r in df.to_numpy().tolist()]
# [(1, 3), (3, 6), (5, 7), (6, 4), (7, 8)]
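Besides speed, the two variants differ in the element types they produce. The sketch below, using the question's DataFrame, shows that iterating over the array keeps NumPy scalars while tolist() converts them to plain Python ints. The values compare equal either way, but NumPy scalars may print differently depending on the NumPy version.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 3, 5, 6, 7], "b": [3, 6, 7, 4, 8]})

from_array = [tuple(r) for r in df.to_numpy()]
from_lists = [tuple(r) for r in df.to_numpy().tolist()]

print(type(from_array[0][0]))  # a NumPy integer type
print(type(from_lists[0][0]))  # <class 'int'>
print(from_array == from_lists)  # True
```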
This method works for any number of columns. However, if you want to convert only a specific set of columns, you can select them beforehand.
[tuple(r) for r in df[['a', 'b']].to_numpy()]
# [(1, 3), (3, 6), (5, 7), (6, 4), (7, 8)]
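To make the "any number of columns" point concrete, here is a sketch with a third column. The column c and its values are hypothetical and only added for illustration.

```python
import pandas as pd

# 'c' is a made-up extra column, not part of the original question
df = pd.DataFrame({"a": [1, 3, 5], "b": [3, 6, 7], "c": [10, 20, 30]})

# All columns: each tuple has three elements
all_cols = [tuple(r) for r in df.to_numpy().tolist()]

# Only a chosen subset of columns, in the order given
subset = [tuple(r) for r in df[["a", "c"]].to_numpy().tolist()]

print(all_cols)
print(subset)
```

One caveat: to_numpy() produces a single array, so columns with different dtypes are converted to a common dtype first (for example, ints become floats next to a float column).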
Another alternative is using map.
list(map(tuple, df.to_numpy()))
# [(1, 3), (3, 6), (5, 7), (6, 4), (7, 8)]
This is roughly the same as the list comprehension, performance-wise. You can generalise it the same way, by selecting the columns you need before calling to_numpy().
Another option is to use apply and convert the result to a list:
df.apply(tuple, axis=1).tolist()
# [(1, 3), (3, 6), (5, 7), (6, 4), (7, 8)]
This is slower, so it is not recommended.
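The performance claims above are easy to check on your own data. The following is a rough timing sketch using timeit; the 10,000-row DataFrame and the repeat count are arbitrary assumptions, and the numbers will vary by machine and pandas version.

```python
import timeit

import pandas as pd

# Arbitrary test data; adjust the size to resemble your real DataFrame
df = pd.DataFrame({"a": range(10_000), "b": range(10_000)})

candidates = {
    "list comprehension": lambda: [tuple(r) for r in df.to_numpy()],
    "comprehension over tolist()": lambda: [tuple(r) for r in df.to_numpy().tolist()],
    "map": lambda: list(map(tuple, df.to_numpy())),
    "apply": lambda: df.apply(tuple, axis=1).tolist(),
}

timings = {}
for name, func in candidates.items():
    # Total time in seconds for 3 calls of each approach
    timings[name] = timeit.timeit(func, number=3)
    print(f"{name}: {timings[name]:.4f} s")
```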
Use zip() to create tuples:
import pandas as pd
df = pd.DataFrame({'a':[1,3,5,6,7], 'b':[3,6,7,4,8]})
print(list(zip(df['a'], df['b'])))
# [(1, 3), (3, 6), (5, 7), (6, 4), (7, 8)]
You can also get the desired list like this (in Python 3, zip returns an iterator, so wrap it in list()):
list(zip(list(df['a']), list(df['b'])))
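The zip approach names each column explicitly. A possible way to generalise it to every column of the DataFrame, sketched here as an illustration and not taken from the answers above, is to unpack the columns with *.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 3, 5, 6, 7], "b": [3, 6, 7, 4, 8]})

# Pass every column to zip(); each resulting tuple is one row
result = list(zip(*(df[col] for col in df.columns)))

print(result)
```

Because zip works column by column, each value keeps the type of its own column, which can matter when the columns have different dtypes.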