I have a DataFrame in which every cell holds a list, as in the example below (shown with only two columns).
Gamma Beta
0 [1.4652917656926299, 0.9326935235505321, float] [91, 48.611034768515864, int]
1 [2.6008354611105995, 0.7608529935313189, float] [59, 42.38646954167245, int]
2 [2.6386970166722348, 0.9785848171888037, float] [89, 37.9011122659478, int]
3 [3.49336632573625, 1.0411524946972244, float] [115, 36.211134224288344, int]
4 [2.193991200007534, 0.7955134305428825, float] [128, 50.03563864975485, int]
5 [3.4574527664490997, 0.9399880977511021, float] [120, 41.841146628802875, int]
6 [3.1190582380554863, 1.0839109431114795, float] [148, 55.990072419824514, int]
7 [2.7757359940789916, 0.8889801332053203, float] [142, 51.08885697101243, int]
8 [3.23820908493237, 1.0587479742892683, float] [183, 43.831293356668425, int]
9 [2.2509032790941985, 0.8896196407231622, float] [66, 35.9377662201882, int]
I'd like to extract, for every column, the first element of the list in each row, to get a DataFrame that looks like this:
Gamma Beta
0 1.4652917656926299 91
1 2.6008354611105995 59
2 2.6386970166722348 89
...
So far, my solution is [row[1][0] for row in df_params.itertuples()], which I would have to repeat for every column index of the row and then assemble the results into a new DataFrame.
An alternative is new_df = df_params['Gamma'].apply(lambda x: x[0]), which I would likewise have to repeat for each column.
My question is: is there a less cumbersome way to perform this operation?
With tolist() you can convert a pandas DataFrame column to a list. df['Courses'] returns the column as a Series; calling .values.tolist() (or .tolist() directly) on it converts the column values to a list.
The get_value() function was used to quickly retrieve a single value from a DataFrame, given the row label and the column label. It was deprecated in pandas 0.21 and removed in pandas 1.0; df.at[row_label, column_label] (or df.iat for integer positions) replaces it.
Use the index operator [ ] to access an element of a Series. The index can be a label from the Series index or, with .iloc, an integer position. To access multiple elements, use a slice, which is written with a colon (:).
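As a small illustration of the three accessors mentioned above, here is a sketch on a made-up frame. The column names and values are hypothetical (only 'Courses' appears in the text), and .at is used in place of the removed get_value().

```python
import pandas as pd

# Hypothetical data, invented purely for illustration.
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop"],
    "Fee": [20000, 25000, 23000],
})

# Column -> Series -> plain Python list.
courses = df["Courses"].tolist()

# Single value by row label and column label.
fee = df.at[1, "Fee"]

# Several elements of a Series by position, using a slice.
first_two_fees = df["Fee"].iloc[:2]

print(courses)
print(fee)
print(first_two_fees.tolist())
```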
You can use the str accessor for lists, e.g.:
df_params['Gamma'].str[0]
This should work for all columns:
df_params.apply(lambda col: col.str[0])
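To try this out, here is a minimal runnable sketch. It rebuilds only the first two rows of the question's DataFrame by hand, so the construction step is an assumption about how df_params was created.

```python
import pandas as pd

# Hand-built reconstruction of the first two rows from the question.
df_params = pd.DataFrame({
    "Gamma": [
        [1.4652917656926299, 0.9326935235505321, float],
        [2.6008354611105995, 0.7608529935313189, float],
    ],
    "Beta": [
        [91, 48.611034768515864, int],
        [59, 42.38646954167245, int],
    ],
})

# .str[0] indexes into the list held in each cell;
# apply() runs it once per column and reassembles a DataFrame.
first_items = df_params.apply(lambda col: col.str[0])

print(first_items)
```

The resulting columns may keep the generic object dtype; if numeric dtypes matter, calling first_items.infer_objects() afterwards is one thing to try.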
Itertuples would be pretty slow. You could speed this up with the following:
for column_name in df_params.columns:
    # keep only the first element of each cell's list (overwrites the column in place)
    df_params[column_name] = [i[0] for i in df_params[column_name]]
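Note that the loop above replaces the contents of df_params. If the original lists should be kept, the same list-comprehension idea can build a separate DataFrame instead. This is a sketch only: the two-row df_params is a hand-built stand-in for the question's data, and new_df is the name borrowed from the question.

```python
import pandas as pd

# Hand-built stand-in for the question's DataFrame (first two rows only).
df_params = pd.DataFrame({
    "Gamma": [
        [1.4652917656926299, 0.9326935235505321, float],
        [2.6008354611105995, 0.7608529935313189, float],
    ],
    "Beta": [
        [91, 48.611034768515864, int],
        [59, 42.38646954167245, int],
    ],
})

# One list comprehension per column, collected into a new frame;
# df_params itself is left untouched.
new_df = pd.DataFrame(
    {name: [cell[0] for cell in df_params[name]] for name in df_params.columns},
    index=df_params.index,
)

print(new_df)
```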