I've seen a pandasql query like this:
import pandas as pd
from pandasql import sqldf

df = pd.DataFrame({'A': [1, 2, 2], 'B': [3, 4, 5]})
sqldf('select * from df group by A', locals())
This gives:
A B
0 1 3
1 2 6
I find it really weird to have a group by without an aggregate function, but can anyone tell me which function is used on the aggregated columns to reduce multiple values into one?
GROUP BY without aggregate functions: although GROUP BY is most often used together with aggregate functions, it can still be used without them, for example to find unique records.
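To illustrate the "unique records" use, here is a minimal sketch using Python's built-in sqlite3 module rather than pandasql itself (pandasql runs its queries through SQLite). The table name `df` and its rows simply mirror the question's DataFrame; the `ORDER BY` is added only to make the printed order predictable.

```python
import sqlite3

# In-memory SQLite database holding the same rows as the question's DataFrame.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE df (A INTEGER, B INTEGER)")
conn.executemany("INSERT INTO df VALUES (?, ?)", [(1, 3), (2, 4), (2, 5)])

# GROUP BY with only the grouping column selected: one row per distinct A.
grouped = conn.execute("SELECT A FROM df GROUP BY A ORDER BY A").fetchall()

# SELECT DISTINCT expresses the same intent more directly.
distinct = conn.execute("SELECT DISTINCT A FROM df ORDER BY A").fetchall()

print(grouped)   # [(1,), (2,)]
print(distinct)  # [(1,), (2,)]
conn.close()
```

Both queries return the same rows here; `SELECT DISTINCT` is usually the clearer way to say "give me the unique values".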
The GROUP BY clause is normally used along with five built-in, or "aggregate" functions. These functions perform special operations on an entire table or on a set, or group, of rows rather than on each row and then return one row of values for each group.
Instead of combining groupby with an aggregation in a single step, we can group without aggregating and then work on each group separately.
A query with a having clause should also have a group by clause. If you omit group by, all the rows not excluded by the where clause return as a single group. Because no grouping is performed between the where and having clauses, they cannot act independently of each other.
It looks like the groupby method you're looking for is last():
df = pd.DataFrame({'A': [1, 2, 2], 'B': [3, 4, 5]})
df.groupby('A', as_index=False).last()
Output:
A B
0 1 3
1 2 5
I'm saying this assuming the 5 in your input was a typo (see my comment above) and was meant to be 6, which would match the 6 in your output.
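One caveat on the SQL side: as far as I know, SQLite (which pandasql uses) does not apply any aggregate function to a "bare" column like B in a GROUP BY query. It evaluates the column on an arbitrarily chosen row of each group, so getting the last row is an observed behaviour rather than a guarantee. If you want a deterministic result, name the aggregate explicitly. A small sketch with the built-in sqlite3 module, where the table `df` and the aliases `B_max` and `B_sum` are just illustrative names:

```python
import sqlite3

# In-memory SQLite database with the same rows as the DataFrame above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE df (A INTEGER, B INTEGER)")
conn.executemany("INSERT INTO df VALUES (?, ?)", [(1, 3), (2, 4), (2, 5)])

# Bare column B: SQLite picks the value from some row of each group.
# For A = 2 this may be 4 or 5; do not rely on which one you get.
bare = conn.execute("SELECT A, B FROM df GROUP BY A ORDER BY A").fetchall()

# Explicit aggregates: the result is fully determined by the data.
explicit = conn.execute(
    "SELECT A, MAX(B) AS B_max, SUM(B) AS B_sum "
    "FROM df GROUP BY A ORDER BY A"
).fetchall()

print(explicit)  # [(1, 3, 3), (2, 5, 9)]
conn.close()
```

On the pandas side, last() (or first(), max(), sum()) plays the same role: you state which value of each group you want instead of leaving it to the engine.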