
Python pandas: why does a select condition need the DataFrame name twice, as in frame[frame['col1'].notna()]?

I have more experience with SQL than with Python and am now starting to use Python more. I've read the pandas comparison with SQL.

Groupby is easy for me to understand: groupby('colname').

However, why do we need to write the name of the frame twice for a select, as in the example frame[frame['col1'].notna()]? I could not find a reason via web search.

Alexei Martianov asked Mar 25 '26 23:03



1 Answer

Just summarizing helpful comments:

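This is called boolean masking (or boolean indexing), and it is a way to select subsets of your data. The inner expression frame['col1'].notna() produces a boolean Series with one True/False value per row, and the outer frame[...] keeps only the rows where that value is True, which is why the frame name appears twice. It is a convention shared by NumPy and pandas (pandas is built on NumPy) rather than a general Python rule. The pandas mask() and where() methods also take a boolean condition, but they keep the original shape and replace values instead of dropping rows, so they are related tools rather than exact equivalents.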
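To make the repeated name less mysterious, here is a minimal sketch that splits the one-liner into its two steps. The sample data and the names `mask`, `selected`, and `same` are made up for illustration; only `frame` and `col1` come from the question. For someone coming from SQL, the result is roughly what `SELECT * FROM frame WHERE col1 IS NOT NULL` would return.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; "col2" and the values are illustrative only.
frame = pd.DataFrame({"col1": [1.0, np.nan, 3.0], "col2": ["a", "b", "c"]})

# Step 1: the inner expression builds a boolean Series, one value per row.
mask = frame["col1"].notna()
print(mask.tolist())  # [True, False, True]

# Step 2: the outer brackets keep only the rows where the mask is True.
selected = frame[mask]

# The one-line form does both steps at once, hence the repeated frame name.
same = frame[frame["col1"].notna()]

print(selected.equals(same))  # True
```

The first `frame` says which table to take rows from, and the second `frame` is only there to look up the column that the condition is computed on. The two do not have to be the same object, as long as the boolean Series lines up with the frame's index.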

Alexei Martianov answered Mar 28 '26 14:03



