Extracting an element of a list in a pandas column

I have a DataFrame in which every cell holds a list, as shown in the two-column example below.

    Gamma   Beta
0   [1.4652917656926299, 0.9326935235505321, float] [91, 48.611034768515864, int]
1   [2.6008354611105995, 0.7608529935313189, float] [59, 42.38646954167245, int]
2   [2.6386970166722348, 0.9785848171888037, float] [89, 37.9011122659478, int]
3   [3.49336632573625, 1.0411524946972244, float]   [115, 36.211134224288344, int]
4   [2.193991200007534, 0.7955134305428825, float]  [128, 50.03563864975485, int]
5   [3.4574527664490997, 0.9399880977511021, float] [120, 41.841146628802875, int]
6   [3.1190582380554863, 1.0839109431114795, float] [148, 55.990072419824514, int]
7   [2.7757359940789916, 0.8889801332053203, float] [142, 51.08885697101243, int]
8   [3.23820908493237, 1.0587479742892683, float]   [183, 43.831293356668425, int]
9   [2.2509032790941985, 0.8896196407231622, float] [66, 35.9377662201882, int]

For every column, I'd like to extract the first element of the list in each row, to get a DataFrame that looks as follows.

    Gamma   Beta
0   1.4652917656926299  91
1   2.6008354611105995  59
2   2.6386970166722348  89
...

So far, my solution is something like [row[1][0] for row in df_params.itertuples()], which I would repeat for every column position in the row and then assemble the results into a new DataFrame.

An alternative is new_df = df_params['Gamma'].apply(lambda x: x[0]), which I would then repeat for each of the columns.

My question is, is there a less cumbersome way to perform this operation?

Ignacio Vergara Kausel asked Aug 31 '17 13:08



2 Answers

The str accessor also indexes into lists element-wise, not only into strings, e.g.:

df_params['Gamma'].str[0]

To do this for all columns at once:

df_params.apply(lambda col: col.str[0])
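
To see the accessor in action, here is a minimal sketch on a small stand-in DataFrame. The sample values are shortened and made up for illustration, and the infer_objects() call is an optional extra step that is not part of the answer above: depending on the pandas version the extracted columns may come back with object dtype, and this call asks pandas to pick numeric dtypes where it can.

```python
import pandas as pd

# Hypothetical sample data shaped like the question's DataFrame
# (values shortened for readability).
df_params = pd.DataFrame({
    "Gamma": [[1.47, 0.93, float], [2.60, 0.76, float], [2.64, 0.98, float]],
    "Beta": [[91, 48.6, int], [59, 42.4, int], [89, 37.9, int]],
})

# .str[0] takes element 0 of each cell's list; apply() runs it per column.
first_items = df_params.apply(lambda col: col.str[0])

# Optional: let pandas infer numeric dtypes for any object columns.
first_items = first_items.infer_objects()

print(first_items)
```

The original df_params is left untouched; first_items is a new DataFrame with the same index and column names.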
IanS answered Sep 22 '22 01:09


itertuples() would be pretty slow here. A list comprehension per column is faster; note that the loop below overwrites the columns of df_params in place:

for column_name in df_params.columns:
    df_params[column_name] = [i[0] for i in df_params[column_name]]
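
If the original lists should be kept, the same comprehension can build a new DataFrame instead. This is only a sketch of that variation, using made-up sample values and the name new_df from the question:

```python
import pandas as pd

# Hypothetical sample data shaped like the question's DataFrame.
df_params = pd.DataFrame({
    "Gamma": [[1.47, 0.93, float], [2.60, 0.76, float], [2.64, 0.98, float]],
    "Beta": [[91, 48.6, int], [59, 42.4, int], [89, 37.9, int]],
})

# Collect the first element of every list, column by column,
# into a new DataFrame that reuses the original index.
new_df = pd.DataFrame(
    {name: [cell[0] for cell in df_params[name]] for name in df_params.columns},
    index=df_params.index,
)

print(new_df)
```

Because the result is built from plain Python lists of numbers, pandas infers the column dtypes on construction, and df_params still holds the full lists afterwards.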
A.Kot answered Sep 20 '22 01:09