I have a dataframe with monthly records for different IDs, and I need to do some analysis only on the IDs that have multiple months of records.
ID    Month       Metric1    Metric2
1     2018-01-01  4          3      
1     2018-02-01  3          2
2     2018-02-01  1          5
3     2018-01-01  4          2
3     2018-02-01  6          3
4     2018-01-01  3          1
How would I filter out the rows of ID that only appear once and keep those with multiple rows and get a result like
ID    Month       Metric1    Metric2
1     2018-01-01  4          3      
1     2018-02-01  3          2
3     2018-01-01  4          2
3     2018-02-01  6          3
I've looked at some other pages that mention using something like:
df = df[df.groupby('ID').ID.transform(len) > 1]
But I don't want to lose the metrics from each month by grouping.
Change the len to count 
df[df.groupby('ID').ID.transform('count') > 1]
Out[589]: 
   ID       Month  Metric1  Metric2
0   1  2018-01-01        4        3
1   1  2018-02-01        3        2
3   3  2018-01-01        4        2
4   3  2018-02-01        6        3
                        Try with pd.series.duplicated():
df1=df[df.ID.duplicated(keep=False)]
print(df1)
   ID       Month  Metric1  Metric2
0   1  2018-01-01        4        3
1   1  2018-02-01        3        2
3   3  2018-01-01        4        2
4   3  2018-02-01        6        3
                        If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With