I'm having trouble keeping all but the last element of each group in a groupby object created from a pandas.DataFrame:
import pandas as pd
x = pd.DataFrame([['a', 1], ['b', 1], ['a', 2], ['b', 2], ['a', 3], ['b', 3]],
                 columns=['A', 'B'])
g = x.groupby('A')
As expected (according to documentation) g.head(1)
returns
   A  B
0  a  1
1  b  1
whereas g.head(-1)
returns an empty DataFrame.
From the behavior of x.head(-1)
I'd expect it to return
   A  B
0  a  1
1  b  1
2  a  2
3  b  2
i.e. dropping the last element of each group and then merging the remaining rows back into a single DataFrame. If that's just a bug in pandas, I'd be grateful to anyone who can suggest an alternative approach.
As noted in the comments, negative arguments to head and tail hadn't (yet) been implemented for groupby objects in pandas. However, you can use cumcount to implement them efficiently:
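def negative_head(g, n):
    # Drop the last n rows of each group (counting from the end of the group).
    # Note: _selected_obj is a private pandas attribute and may change.
    return g._selected_obj[g.cumcount(ascending=False) >= n]
def negative_tail(g, n):
    # Drop the first n rows of each group (counting from the start of the group).
    return g._selected_obj[g.cumcount() >= n]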
In [11]: negative_head(g, 1) # instead of g.head(-1)
Out[11]:
   B
0  1
1  1
2  2
3  2
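If you'd rather not rely on the private _selected_obj attribute, the same cumcount idea can be applied directly to the original DataFrame. The following is only a sketch under that assumption: the helper names drop_last_per_group and drop_first_per_group are hypothetical and not part of pandas.

```python
import pandas as pd

x = pd.DataFrame(
    [['a', 1], ['b', 1], ['a', 2], ['b', 2], ['a', 3], ['b', 3]],
    columns=['A', 'B'],
)

def drop_last_per_group(df, key, n=1):
    # Hypothetical helper: position of each row counted from the END of its group
    # (the last row of every group gets 0).
    from_end = df.groupby(key).cumcount(ascending=False)
    # Keep rows that are at least n positions away from the end of their group.
    return df[from_end >= n]

def drop_first_per_group(df, key, n=1):
    # Hypothetical helper: position of each row counted from the START of its group.
    from_start = df.groupby(key).cumcount()
    # Keep rows that are at least n positions away from the start of their group.
    return df[from_start >= n]

result = drop_last_per_group(x, 'A', 1)
print(result)
```

Because the boolean mask shares the index of the original DataFrame, the surviving rows keep their original order and index labels, and the grouping column stays in the result. For the example above, rows 0 to 3 are kept.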