I have a dataframe like the following:
name foo_list
'foo' [('bleh'), ('blah')]
'bar' [(), 'boo']
'foobar' [(), (), ()]
I want to remove all the empty tuples, and if every value in a list is an empty tuple, drop the row entirely. I also want to convert each list of tuples into a plain list. So the output would be:
name foo_list
'foo' ['bleh', 'blah']
'bar' [ 'boo']
How do I do this in pandas?
Try this:
Data Input:
import pandas as pd
df = pd.DataFrame({'name': ['A', 'B', 'C'], 'foo_list': [[('bleh'), ('blah')], [(), 'boo'], [(), (), ()]]})
Solution:
# Drop the empty tuples from each list
df['foo_list'] = df['foo_list'].apply(lambda x: [t for t in x if t != ()])
# Show only the rows whose list is still non-empty
df.loc[df['foo_list'].apply(len) > 0, :]
Out[20]:
foo_list name
0 [bleh, blah] A
1 [boo] B
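One thing to watch: in Python, ('bleh') is just the string 'bleh', not a tuple, so the sample data above has nothing to unpack. If your real data holds genuine one-element tuples such as ('bleh',), a rough sketch along these lines flattens them and drops the empty rows in one go. The helper name flatten_items and the sample data are made up for illustration, and it assumes every cell is a list:

```python
import pandas as pd

# Sample data with real one-element tuples (an assumption; the data in
# the answer above contains plain strings, since ('bleh') == 'bleh').
df = pd.DataFrame({
    "name": ["foo", "bar", "foobar"],
    "foo_list": [[("bleh",), ("blah",)], [(), "boo"], [(), (), ()]],
})

def flatten_items(items):
    """Hypothetical helper: unpack tuples and keep other values as they are."""
    out = []
    for item in items:
        if isinstance(item, tuple):
            out.extend(item)  # an empty tuple adds nothing
        else:
            out.append(item)
    return out

df["foo_list"] = df["foo_list"].apply(flatten_items)

# Keep only the rows whose list is non-empty, then renumber the index.
result = df[df["foo_list"].map(len) > 0].reset_index(drop=True)
print(result)
```

Unlike the .loc line above, this assigns the filtered frame to result, so the all-empty row is actually gone from it rather than only hidden in the displayed output.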
Timing (small size):
%timeit df['foo_list'].apply(lambda x : [t for t in x if t != ()])#Wen
10000 loops, best of 3: 117 µs per loop
%timeit df.foo_list.apply(lambda x: filter(None, x)) # John
10000 loops, best of 3: 121 µs per loop
For a large dataframe I would recommend John's solution:
df = pd.concat([df] * 10000, axis=0)
%timeit df.foo_list.apply(lambda x: filter(None, x))
100 loops, best of 3: 10.2 ms per loop
%timeit df['foo_list'].apply(lambda x : [t for t in x if t != ()])
100 loops, best of 3: 17.1 ms per loop
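A caveat on the filter version: on Python 3, filter returns a lazy iterator rather than a list, so you would need to wrap it in list() to get actual lists back (and the timings would have to be re-measured with that wrapper). Also, filter(None, x) drops every falsy value, not only empty tuples, so empty strings, 0 and None go too. A minimal sketch, assuming Python 3 and the same sample data as above:

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["A", "B", "C"],
    "foo_list": [["bleh", "blah"], [(), "boo"], [(), (), ()]],
})

# list(...) materialises the lazy filter object that Python 3 returns.
# filter(None, x) removes every falsy item, which includes ().
out = df.assign(foo_list=df["foo_list"].apply(lambda x: list(filter(None, x))))

# An empty list is falsy, so map(bool) marks the rows worth keeping.
out = out[out["foo_list"].map(bool)]
print(out)
```

If your lists can legitimately contain 0 or empty strings, the explicit t != () comparison is the safer choice.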