Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

In proc sql when using SELECT * and GROUP BY, the result is not collapsed

Tags:

sql

sas

When using the asterisk in combination with sum and group, the duplicates are not removed as I expect (and as it works in for example mysql):


col1 | country 
-----------------
5    | sweden  
20   | sweden  
30   | denmark 

select *, sum(col1) as s from table
group by country

the data returned is:

col1 | country | s
--------------------
5    | sweden  | 25
20   | sweden  | 25
30   | denmark | 30

instead of what I expected:

col1 | country | s
------------------------
5    | sweden  | 25
30   | denmark | 30

If I don't use asterisk (*), the data returned is as I expect it to be.

SELECT country, sum(col1) as s from table
like image 830
Hampus Brynolf Avatar asked Oct 24 '25 02:10

Hampus Brynolf


1 Answers

You are correct, SAS does not collapse WHEN you have variables in the statement that are not in the GROUP BY statement. There will be a note to that effect in the log, about your data being merged.

If you want just the variables, you'll have to list them unfortunately, but since you have to list them in GROUP BY it's not extra work per se.

Different SQL implementations handle things differently, this is one way that SAS is different. It's handy when you do want to merge a summary stat back with the main data set though.

If you don't want this behaviour add the NOREMERGE option to your PROC SQL - but it throws an error, it still doesn't work the way you want. See the documentation for the reference

like image 195
Reeza Avatar answered Oct 26 '25 16:10

Reeza



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!