I have the following large dataframe (df) that looks like this:
          ID      date   PRICE
    1  10001  19920103  14.500
    2  10001  19920106  14.500
    3  10001  19920107  14.500
    4  10002  19920108  15.125
    5  10002  19920109  14.500
    6  10002  19920110  14.500
    7  10003  19920113  14.500
    8  10003  19920114  14.500
    9  10003  19920115  15.000

Question: What's the most efficient way to delete (or remove) the first row of each ID? I want this:

          ID      date   PRICE
    2  10001  19920106  14.500
    3  10001  19920107  14.500
    5  10002  19920109  14.500
    6  10002  19920110  14.500
    8  10003  19920114  14.500
    9  10003  19920115  15.000

I can loop over each unique ID and remove the first row, but I believe this is not very efficient.
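For reference, the loop described above might look like the following sketch. The data is copied from the question; the variable names `pieces` and `looped` are only illustrative, and the code assumes the index starts at 1 as shown in the post.

```python
import pandas as pd

# Sample data copied from the question; the index runs 1..9 as in the post.
df = pd.DataFrame(
    {
        "ID": [10001, 10001, 10001, 10002, 10002, 10002, 10003, 10003, 10003],
        "date": [19920103, 19920106, 19920107, 19920108, 19920109,
                 19920110, 19920113, 19920114, 19920115],
        "PRICE": [14.500, 14.500, 14.500, 15.125, 14.500, 14.500,
                  14.500, 14.500, 15.000],
    },
    index=range(1, 10),
)

# Loop baseline: for each unique ID, select its rows and slice off the first one,
# then concatenate the per-ID pieces back into a single DataFrame.
pieces = [df[df["ID"] == uid].iloc[1:] for uid in df["ID"].unique()]
looped = pd.concat(pieces)
print(looped)
```

Each pass of the loop scans the whole frame for one ID, which is the likely reason it feels slow on a large DataFrame.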
To drop the first row of a pandas DataFrame, use iloc, drop(), or tail(). Note that these act on the DataFrame as a whole, not on each ID separately.
To remove the first n rows of a pandas DataFrame using tail(), call df.tail(df.shape[0] - n). DataFrame.tail() is generally used to show the last n rows of a DataFrame, but you can also pass a negative value, as in df.tail(-n), to skip n rows from the beginning.
To delete the first three rows of a DataFrame in pandas, slice with the iloc indexer: df.iloc[3:].
Use the pandas.DataFrame.drop() method to delete rows that match a condition, by passing it the index labels of the matching rows.
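The whole-frame techniques listed above can be compared side by side in a short sketch. The small frame and the names `by_iloc`, `by_drop`, `by_tail`, `by_tail_len`, and `without_10002` are made up for illustration; only the pandas calls themselves come from the text.

```python
import pandas as pd

# A small illustrative frame (first four rows of the question's data).
df = pd.DataFrame(
    {"ID": [10001, 10001, 10001, 10002],
     "PRICE": [14.500, 14.500, 14.500, 15.125]},
    index=[1, 2, 3, 4],
)

n = 1  # number of leading rows to remove

by_iloc = df.iloc[n:]                    # positional slice: skip the first n rows
by_drop = df.drop(df.index[:n])          # drop by the labels of the first n rows
by_tail = df.tail(-n)                    # negative argument: all rows except the first n
by_tail_len = df.tail(df.shape[0] - n)   # keep the last (row count - n) rows

# Condition-based drop: pass the index labels of the rows matching the condition.
without_10002 = df.drop(df[df["ID"] == 10002].index)
```

All four of the first variants return the same rows; none of them is aware of the ID groups, so on their own they do not answer the per-ID question.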
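Another one-liner is df.groupby('ID').apply(lambda group: group.iloc[1:, 1:]). The row slice 1: drops the first row of each group, and the column slice 1: drops the ID column, which instead appears as the outer level of the result's index: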
Out[100]:
             date  PRICE
ID
10001 2  19920106   14.5
      3  19920107   14.5
10002 5  19920109   14.5
      6  19920110   14.5
10003 8  19920114   14.5
      9  19920115   15.0
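As an alternative to apply, a boolean mask avoids calling a Python function per group and keeps the original flat index. This is a sketch of a different technique, not the method from the answer above: it uses groupby(...).cumcount() and DataFrame.duplicated(), with the question's data and illustrative names `mask`, `result`, and `alt`.

```python
import pandas as pd

# Sample data copied from the question; the index runs 1..9 as in the post.
df = pd.DataFrame(
    {
        "ID": [10001, 10001, 10001, 10002, 10002, 10002, 10003, 10003, 10003],
        "date": [19920103, 19920106, 19920107, 19920108, 19920109,
                 19920110, 19920113, 19920114, 19920115],
        "PRICE": [14.500, 14.500, 14.500, 15.125, 14.500, 14.500,
                  14.500, 14.500, 15.000],
    },
    index=range(1, 10),
)

# cumcount() numbers the rows within each ID starting at 0,
# so "> 0" is True for every row except the first one of its group.
mask = df.groupby("ID").cumcount() > 0
result = df[mask]

# duplicated("ID") marks every occurrence of an ID after the first,
# which selects the same rows.
alt = df[df.duplicated("ID")]

print(result)
```

Both variants keep the ID column and the original row labels, which matches the desired output in the question.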