I am trying to select the first element of each cell, regardless of the number of columns or rows (they may change based on user-defined criteria), and build a new pandas DataFrame from the result. My actual data structure is similar to the one below.
        0       1       2
0  [1, 2]  [2, 3]  [3, 6]
1  [4, 2]  [1, 4]  [4, 6]
2  [1, 2]  [2, 3]  [3, 6]
3  [4, 2]  [1, 4]  [4, 6]
I want the new dataframe to look like:
   0  1  2
0  1  2  3
1  4  1  4
2  1  2  3
3  4  1  4
The code below generates a data set similar to mine. It attempts to do what I want without success (d), and it reproduces what I have seen work in a similar question (c), although that only handles one column. The similar, but different, question is: Python Pandas: selecting element in array column
import pandas as pd

zz = pd.DataFrame([[[1, 2], [2, 3], [3, 6]], [[4, 2], [1, 4], [4, 6]],
                   [[1, 2], [2, 3], [3, 6]], [[4, 2], [1, 4], [4, 6]]])
print(zz)
x = zz.dtypes
print(x)
a = pd.DataFrame(zz.columns.values)
b = pd.DataFrame.transpose(a)
c = zz[0].str[0]  # this gives the 1st value for each cell in column 0
d = zz[[b[0]].values].str[0]  # failed attempt to get the 1st value for each cell in all columns
You can use apply, and to select the first value of each list use indexing with str:
print (zz.apply(lambda x: x.str[0]))
   0  1  2
0  1  2  3
1  4  1  4
2  1  2  3
3  4  1  4
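The reason this works where the attempt in the question does not is that the .str accessor belongs to a Series, and apply hands each column to the lambda as a Series. Below is a minimal sketch of that idea, reusing the zz frame from the question; the names first and col are only illustrative.

```python
import pandas as pd

# Same sample data as in the question: every cell holds a two-item list.
zz = pd.DataFrame([[[1, 2], [2, 3], [3, 6]],
                   [[4, 2], [1, 4], [4, 6]],
                   [[1, 2], [2, 3], [3, 6]],
                   [[4, 2], [1, 4], [4, 6]]])

# apply calls the lambda once per column; each col is a Series,
# so col.str[0] picks the first item of every list in that column.
first = zz.apply(lambda col: col.str[0])
print(first)
```

Because the lambda runs per column, this does not depend on how many columns or rows the frame has.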
Another solution with stack and unstack:
print (zz.stack().str[0].unstack())
   0  1  2
0  1  2  3
1  4  1  4
2  1  2  3
3  4  1  4
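To see what each step of that one-liner does, it can be split up roughly as follows. This is only a sketch on the question's sample data; the intermediate names stacked, firsts and result are made up for illustration.

```python
import pandas as pd

zz = pd.DataFrame([[[1, 2], [2, 3], [3, 6]],
                   [[4, 2], [1, 4], [4, 6]],
                   [[1, 2], [2, 3], [3, 6]],
                   [[4, 2], [1, 4], [4, 6]]])

# stack() turns the frame into one long Series whose index is
# a (row label, column label) pair, so .str becomes available.
stacked = zz.stack()

# Take the first item of every list in that Series.
firsts = stacked.str[0]

# unstack() moves the column labels back out of the index,
# restoring the original rows-by-columns layout.
result = firsts.unstack()
print(result)
```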
I would use applymap, which applies the same function to each individual cell in your DataFrame:
zz.applymap(lambda x: x[0])
   0  1  2
0  1  2  3
1  4  1  4
2  1  2  3
3  4  1  4
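One caveat: applymap was deprecated in pandas 2.1 in favour of DataFrame.map, which does the same cell-by-cell work. If the code has to run on both older and newer pandas, something like the sketch below might do; the name elementwise is just a placeholder, and picking the method with getattr is one possible approach rather than anything pandas prescribes.

```python
import pandas as pd

zz = pd.DataFrame([[[1, 2], [2, 3], [3, 6]],
                   [[4, 2], [1, 4], [4, 6]],
                   [[1, 2], [2, 3], [3, 6]],
                   [[4, 2], [1, 4], [4, 6]]])

# Prefer DataFrame.map where it exists (pandas 2.1 and later),
# otherwise fall back to the older applymap.
elementwise = getattr(zz, "map", None) or zz.applymap

# The function receives one cell at a time, here a plain Python list.
result = elementwise(lambda cell: cell[0])
print(result)
```

Unlike the .str approaches, this uses ordinary indexing on each cell, so an empty list in any cell would raise an IndexError.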