Convert a dataframe to list of tuples [duplicate]

I have a python pandas dataframe df like this:

a  b
1  3
3  6
5  7
6  4
7  8

I want to convert it to a list of tuples, one tuple per row:

[(1,3),(3,6),(5,7),(6,4),(7,8)]

Thanks.

kkjoe asked Jul 24 '17 16:07



3 Answers

If performance is important, use a list comprehension:

[tuple(r) for r in df.to_numpy()]
# [(1, 3), (3, 6), (5, 7), (6, 4), (7, 8)]

Note: For pandas < 0.24, please use df.values instead.

You may find even better performance if you iterate over lists instead of the numpy array:

[tuple(r) for r in df.to_numpy().tolist()]
# [(1, 3), (3, 6), (5, 7), (6, 4), (7, 8)]
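
One practical difference between the two forms above is the element type, which the following sketch illustrates. It assumes the example frame from the question, rebuilt here so the snippet is self-contained; the variable names from_array and from_lists are only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({"a": [1, 3, 5, 6, 7], "b": [3, 6, 7, 4, 8]})

# Iterating over the NumPy array keeps NumPy scalar types inside each tuple.
from_array = [tuple(r) for r in df.to_numpy()]

# .tolist() converts every element to a plain Python int first.
from_lists = [tuple(r) for r in df.to_numpy().tolist()]

print(from_array == from_lists)   # the values compare equal
print(type(from_array[0][0]))     # a NumPy integer type
print(type(from_lists[0][0]))     # the built-in int
```

If the tuples are later serialised (for example with json) or used where built-in types are expected, the .tolist() variant is likely the safer choice.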

The list comprehension works for any number of columns. If you only want to convert a specific set of columns, select them beforehand:

[tuple(r) for r in df[['a', 'b']].to_numpy()]
# [(1, 3), (3, 6), (5, 7), (6, 4), (7, 8)]

Another alternative is using map.

list(map(tuple, df.to_numpy()))
# [(1, 3), (3, 6), (5, 7), (6, 4), (7, 8)]

Performance-wise, this is roughly the same as the list comprehension, and it generalises to a subset of columns in the same way.


Another option is to use apply and convert the result to a list:

df.apply(tuple, axis=1).tolist()
# [(1, 3), (3, 6), (5, 7), (6, 4), (7, 8)]

This is slower than the approaches above, so it is not recommended.
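
To check the performance claims on your own data, a rough timing sketch along these lines may help. The frame size, the repeat count, and the function names are arbitrary choices for illustration, and absolute timings will vary with the machine and the pandas version.

```python
import timeit

import pandas as pd

# An illustrative frame; the size is an arbitrary assumption.
df = pd.DataFrame({"a": range(2_000), "b": range(2_000)})


def with_comprehension():
    # List comprehension over plain Python lists.
    return [tuple(r) for r in df.to_numpy().tolist()]


def with_map():
    # map over the rows of the NumPy array.
    return list(map(tuple, df.to_numpy()))


def with_apply():
    # Row-wise apply, then convert the resulting Series to a list.
    return df.apply(tuple, axis=1).tolist()


for fn in (with_comprehension, with_map, with_apply):
    seconds = timeit.timeit(fn, number=3)
    print(f"{fn.__name__}: {seconds:.4f} s for 3 runs")
```

All three functions return the same list of tuples, so only the printed timings differ.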

cs95 answered Oct 12 '22 16:10


Use zip() to pair up the columns and create the tuples:

import pandas as pd

df = pd.DataFrame({'a':[1,3,5,6,7], 'b':[3,6,7,4,8]})
print(list(zip(df['a'], df['b'])))
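
The zip() approach can be extended to every column without naming each one, roughly as sketched below. This is one possible generalisation rather than part of the original answer; the names columns and rows are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 3, 5, 6, 7], "b": [3, 6, 7, 4, 8]})

# Collect every column as a Series, in the frame's column order.
columns = [df[name] for name in df.columns]

# Unpack the columns into zip() so each output tuple is one row.
rows = list(zip(*columns))
print(rows)
```

For the example frame this yields the same five tuples as the two-column call.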

ksai answered Oct 12 '22 17:10


You can also get the desired list like this. In Python 3, zip() returns an iterator, so wrap it in list():

list(zip(list(df['a']), list(df['b'])))

dvitsios answered Oct 12 '22 18:10