I have a Python pandas DataFrame rpt
:
rpt <class 'pandas.core.frame.DataFrame'> MultiIndex: 47518 entries, ('000002', '20120331') to ('603366', '20091231') Data columns: STK_ID 47518 non-null values STK_Name 47518 non-null values RPT_Date 47518 non-null values sales 47518 non-null values
I can filter the rows whose stock id is '600809'
like this: rpt[rpt['STK_ID'] == '600809']
<class 'pandas.core.frame.DataFrame'> MultiIndex: 25 entries, ('600809', '20120331') to ('600809', '20060331') Data columns: STK_ID 25 non-null values STK_Name 25 non-null values RPT_Date 25 non-null values sales 25 non-null values
and I want to get all the rows of some stocks together, such as ['600809','600141','600329']
. That means I want a syntax like this:
stk_list = ['600809','600141','600329'] rst = rpt[rpt['STK_ID'] in stk_list] # this does not works in pandas
Since pandas not accept above command, how to achieve the target?
The pandas. DataFrame. duplicated() method is used to find duplicate rows in a DataFrame. It returns a boolean series which identifies whether a row is duplicate or unique.
Use DataFrame. drop_duplicates() to Drop Duplicate and Keep First Rows. You can use DataFrame. drop_duplicates() without any arguments to drop rows with the same values on all columns.
Use the isin
method:
rpt[rpt['STK_ID'].isin(stk_list)]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With