
Python: how to combine two columns of a DataFrame into a single list?

I have a dataframe as given below

df = 

index    data1    data2
0         20       120
1         30       456
2         40       34

How can I combine the two columns of the above df into a single list, so that the first row's elements come first, then the second row's, and so on?

My expected output

my_list = [20,120,30,456,40,34]

My code:

list1 = df['data1'].tolist()
list2 = df['data2'].tolist()

my_list = list1+list2

This did not work: it puts all of the data1 values first, followed by all of the data2 values.

Mainland asked Jan 31 '20 03:01


2 Answers

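The NumPy array returned by to_numpy() is organized by row, array([[row1], [row2], ..., [rowN]]), so we can ravel it. ravel() flattens in row-major order by default, which gives the elements row by row, and it should be very fast.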

df[['data1', 'data2']].to_numpy().ravel().tolist()
# [20, 120, 30, 456, 40, 34]
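For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question (the DataFrame constructor call is my assumption about how df was created) and shows why the default ravel order gives the row-by-row result:

```python
import pandas as pd

# Sample frame from the question (construction is assumed, not from the original post)
df = pd.DataFrame({"data1": [20, 30, 40], "data2": [120, 456, 34]})

arr = df[["data1", "data2"]].to_numpy()  # shape (3, 2): one row per DataFrame row
my_list = arr.ravel().tolist()           # default order="C" walks the array row by row
print(my_list)  # [20, 120, 30, 456, 40, 34]

# For comparison, column-major order reproduces what list1 + list2 gives
by_column = arr.ravel(order="F").tolist()
print(by_column)  # [20, 30, 40, 120, 456, 34]
```

tolist() also converts the NumPy integers to plain Python ints, which is usually what you want in a list.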

Because I was interested: here are all the proposed methods, plus another using itertools.chain, with timings for building your output from 2 columns as a function of the length of the DataFrame.

import perfplot
import pandas as pd
import numpy as np
from itertools import chain

perfplot.show(
    # n rows, two random integer columns labeled 0 and 1
    setup=lambda n: pd.DataFrame(np.random.randint(1, 10, (n, 2))),
    kernels=[
        lambda df: df[[0, 1]].to_numpy().ravel().tolist(),
        lambda df: [x for i in zip(df[0], df[1]) for x in i],
        lambda df: [*chain.from_iterable(df[[0, 1]].to_numpy())],
        lambda df: df[[0, 1]].stack().tolist(),  # proposed by @anky_91
    ],
    labels=['ravel', 'zip', 'chain', 'stack'],
    n_range=[2 ** k for k in range(20)],
    equality_check=np.allclose,  # every kernel must return the same values
    xlabel="len(df)"
)

[Image: perfplot timing comparison of ravel, zip, chain, and stack vs. len(df)]
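The benchmark uses integer column labels 0 and 1. As a rough sketch, the chain and stack variants would look like this on the question's column names (the sample frame construction is my assumption):

```python
from itertools import chain

import pandas as pd

# Sample frame from the question (construction is assumed)
df = pd.DataFrame({"data1": [20, 30, 40], "data2": [120, 456, 34]})
cols = ["data1", "data2"]

# stack() moves the column labels into the index, leaving values in row-by-row order
stacked = df[cols].stack().tolist()

# chain.from_iterable() walks the 2-D array one row at a time;
# int() turns the NumPy scalars into plain Python ints
chained = [int(x) for x in chain.from_iterable(df[cols].to_numpy())]

print(stacked)  # [20, 120, 30, 456, 40, 34]
print(chained)  # [20, 120, 30, 456, 40, 34]
```

Both give the same list as ravel on this data; which one is fastest depends on the frame size, so it is worth timing on your own data.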

ALollz answered Sep 28 '22 02:09


Your code doesn't work because list1 + list2 appends all of data2 after all of data1 instead of pairing up the values that share the same index. Use the list comprehension below:

print([x for i in zip(df['data1'], df['data2']) for x in i])
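To make the difference concrete, here is a small sketch comparing the two approaches on the sample data (the frame construction and the variable names concatenated and interleaved are mine, for illustration only):

```python
import pandas as pd

# Sample frame from the question (construction is assumed)
df = pd.DataFrame({"data1": [20, 30, 40], "data2": [120, 456, 34]})

list1 = df["data1"].tolist()
list2 = df["data2"].tolist()

# Concatenation: every data1 value, then every data2 value
concatenated = list1 + list2
print(concatenated)  # [20, 30, 40, 120, 456, 34]

# zip() pairs values that share a row; the nested comprehension flattens the pairs
interleaved = [x for pair in zip(list1, list2) for x in pair]
print(interleaved)  # [20, 120, 30, 456, 40, 34]
```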
U12-Forward answered Sep 28 '22 04:09