Is it possible to do a case-insensitive DISTINCT with SAS (PROC SQL)?

Question

Is there a way to get the case-insensitive distinct rows from this SAS SQL query? ...

SELECT DISTINCT country FROM companies;

The ideal solution would consist of a single query.

Results now look like:

Australia
australia
AUSTRALIA
Hong Kong
HONG KONG

... whereas only 2 distinct rows are really required (any one spelling per country would do).

One could upper-case the data, but this unnecessarily changes values in a manner that doesn't suit the purpose of this query.

Roee Adler · Accepted Answer

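If you have an integer primary key (let's call it id), you could use:

SELECT country FROM companies
WHERE id IN
(
    SELECT Min(id) FROM companies
    GROUP BY Upper(country)
);

The subquery returns the lowest id in each case-insensitive group, and the outer query returns country exactly as it is spelled in those rows.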
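When no such key is available, one possible single-query sketch is to group on the upper-cased value and let an aggregate pick one of the spellings that already exists in the data. This assumes the companies table and country column from the question, and it relies on PROC SQL's MIN() accepting character arguments; it is an illustration rather than part of the original answer.

```sas
proc sql;
  /* One row per case-insensitive group. MIN() returns a spelling that
     actually occurs in the data, so no stored value is rewritten. */
  select min(country) as country
    from companies
    group by upper(country);
quit;
```

Which spelling survives depends on the session's collating sequence. Under ASCII, upper-case letters sort before lower-case ones, so MIN() would favour AUSTRALIA over Australia, while MAX() would favour the lower-case forms.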

Alex Martelli · Answer

Normalizing case does seem advisable -- if 'Australia', 'australia' and 'AUSTRALIA' all occur, which one of the three would you want as the "case-insensitively unique" answer to your query, after all? If you're keen on some specific heuristic (e.g. count how many times each spelling occurs and pick the most popular), this can surely be done but might be a huge amount of extra work -- so, how much is such persnicketiness worth to you?
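As a rough sketch of the "most popular spelling" heuristic mentioned above, the following counts each spelling and then keeps the most frequent one per case-insensitive group. The table name spelling_counts and the column names country_key and n are hypothetical, chosen only for illustration, and the second query relies on PROC SQL remerging summary statistics back onto the detail rows (SAS writes a note to the log when it does this).

```sas
proc sql;
  /* Step 1: count every spelling within its case-insensitive group.
     spelling_counts, country_key and n are illustrative names. */
  create table spelling_counts as
  select upper(country) as country_key,
         country,
         count(*) as n
    from companies
    group by upper(country), country;

  /* Step 2: keep the most frequent spelling per group. max(n) is
     remerged onto each row; tied spellings are all returned. */
  select country
    from spelling_counts
    group by country_key
    having n = max(n);
quit;
```

This takes two statements rather than the single query the question asks for, and ties would still need a further rule to break them.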

Simon Nickerson · Answer

A non-SQL method (really only a single step as the data step just creates a view) would be:


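/* View that adds an upper-cased copy of country to sort on;
   nothing is stored until the view is read. */
data companies_v / view=companies_v;
  set companies (keep=country);
  _upcase_country = upcase(country);
run;

/* NODUPKEY keeps one row per upper-cased value. NOEQUALS lets the sort
   skip preserving input order, so which spelling survives is arbitrary.
   The helper column is dropped from the output. */
proc sort data=companies_v out=companies_distinct_countries (drop=_upcase_country) nodupkey noequals;
  by _upcase_country;
run;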

Tags: sql, sas, proc-sql

Rog
