
Tuple elements to dataframe column in python

I have 2D lists containing 0-3 sets of pairs (data will always be paired).

examples:

[[2.0, 0.1], [7.0, 0.6], [1.0, 0.3]] or
[[9.0, 0.7], [1.0, 0.2]]             or
[[]]

I want to be able to append each element of each pair to its own column in an existing dataframe.

Desired dataframe using above data:

other_data,    pair_0_0, pair_0_1, pair_1_0, pair_1_1, pair_2_0, pair_2_1
'blah',        2.0,      0.1,      7.0,      0.6,      1.0,      0.3    
'blah blah',   9.0,      0.7,      1.0,      0.2
'blaah'       

It needs to be able to handle nulls, and preserve the order of the list.

I've tried the following, but it raises an IndexError if I don't have 3 pairs.

df.loc[len(df)] = ['blah blah', list2D[0][0], list2D[0][1], list2D[1][0], list2D[1][1], list2D[2][0], list2D[2][1]]

I think it would involve a list comprehension, but I'm not sure how to do it.

Rory LM asked Dec 14 '25 04:12


1 Answer

How about numpy.ravel in a list comprehension:

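import numpy as np
import pandas as pd

l1 = [[2.0, 0.1], [7.0, 0.6], [1.0, 0.3]]
l2 = [[9.0, 0.7], [1.0, 0.2]]
l3 = [[]]

# Flatten each 2D list into one row; shorter rows are padded with NaN
df = pd.DataFrame([np.ravel(x) for x in [l1, l2, l3]])

# Fix column headers: position x maps to pair x // 2, element x % 2
df.columns = [f'pair_{x//2}_{x%2}' for x in range(df.shape[1])]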

[out]

   pair_0_0  pair_0_1  pair_1_0  pair_1_1  pair_2_0  pair_2_1
0       2.0       0.1       7.0       0.6       1.0       0.3
1       9.0       0.7       1.0       0.2       NaN       NaN
2       NaN       NaN       NaN       NaN       NaN       NaN
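To see why the rows line up, here is a small sketch of what numpy.ravel returns for a full list and for the empty case. The names flat_full and flat_empty are only illustrative and are not part of the answer above.

```python
import numpy as np

l1 = [[2.0, 0.1], [7.0, 0.6], [1.0, 0.3]]
l3 = [[]]

flat_full = np.ravel(l1)   # 1-D array of six values, pair order preserved
flat_empty = np.ravel(l3)  # 1-D array of length zero

print(flat_full.tolist())   # [2.0, 0.1, 7.0, 0.6, 1.0, 0.3]
print(flat_empty.tolist())  # []
```

Because every row is flattened to one dimension first, the DataFrame constructor only has to pad the shorter rows with NaN, which is what produces the empty cells in the output.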

Update

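To append an individual list to an existing DataFrame, use pandas.concat (DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0):

l4 = [[3.0, 0.2], [6.0, 0.8], [1.2, 0.6]]

new_row = pd.DataFrame([np.ravel(l4)]).rename(columns=lambda x: f'pair_{x//2}_{x%2}')
pd.concat([df, new_row])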

[out]

   pair_0_0  pair_0_1  pair_1_0  pair_1_1  pair_2_0  pair_2_1
0       2.0       0.1       7.0       0.6       1.0       0.3
1       9.0       0.7       1.0       0.2       NaN       NaN
2       NaN       NaN       NaN       NaN       NaN       NaN
0       3.0       0.2       6.0       0.8       1.2       0.6

Or use pandas.concat in a loop to build a DataFrame from scratch:

df = pd.DataFrame()

for pairs in [l1, l2, l3]:
    # One-row frame per list, with the pair_i_j column names applied
    row = pd.DataFrame([np.ravel(pairs)]).rename(columns=lambda x: f'pair_{x//2}_{x%2}')
    df = pd.concat([df, row], sort=True)
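The question also asks for the values to go into an existing DataFrame that has an other_data column. Below is a minimal sketch of one way to do that, assuming there are never more than three pairs per row. The names flatten_pairs and MAX_PAIRS are hypothetical, and itertools.chain is swapped in for numpy.ravel so that the padding can be done on a plain list.

```python
from itertools import chain

import pandas as pd

MAX_PAIRS = 3  # assumption: at most three pairs per row, as in the question


def flatten_pairs(pairs, max_pairs=MAX_PAIRS):
    """Flatten [[a, b], [c, d], ...] in order and pad with None to a fixed width."""
    flat = list(chain.from_iterable(pairs))
    return flat + [None] * (2 * max_pairs - len(flat))


# Existing (here: empty) DataFrame with the column layout from the question
pair_columns = [f"pair_{i // 2}_{i % 2}" for i in range(2 * MAX_PAIRS)]
df = pd.DataFrame(columns=["other_data"] + pair_columns)

rows = [
    ("blah", [[2.0, 0.1], [7.0, 0.6], [1.0, 0.3]]),
    ("blah blah", [[9.0, 0.7], [1.0, 0.2]]),
    ("blaah", [[]]),
]
for label, pairs in rows:
    # Every row now has exactly 1 + 2 * MAX_PAIRS values, so no IndexError
    df.loc[len(df)] = [label] + flatten_pairs(pairs)
```

Padding to a fixed width keeps the df.loc[len(df)] pattern from the question usable for any number of pairs up to MAX_PAIRS. For many rows, collecting the padded lists first and building the DataFrame once is likely to be faster than growing it row by row.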
Chris Adams answered Dec 15 '25 19:12