Some database engines like Microsoft Access supports FIRST()
as an aggregate function and I was using it in cases I know the column will only have one value in the group.
Potentially, the database engine can optimise this as if it reaches any value it can mark this value as already calculated. So it is a surprise why this is not supported in for example Oracle or SQL Server, and more importantly, not in the SQL standard.
In practice, people uses MIN()
or MAX()
instead, but they all require
The data type underneath have natural ordering semantic and the ordering does matter for the user;
The database engine have to compare the intermediate value with the values in each rows
So this is not optimal in many cases.
Are there any specific reasons people don't want to allow SELECT ANY(FIELD) ...
? (I could think of two variants: ANY()
gives any value in the result set that the column is not null; FIRST()
gives the column value for the first row in result set, or null if there is no rows)
Standard SQL has an aggregate function any but for a different use case. Again, what you (MS Access SQL) suggest for any gives you a non-deterministic result, which is not what standard SQL encourages.
Bookmark this question. Show activity on this post. Some database engines like Microsoft Access supports FIRST () as an aggregate function and I was using it in cases I know the column will only have one value in the group. Potentially, the database engine can optimise this as if it reaches any value it can mark this value as already calculated.
Standard SQL has an aggregate function any but for a different use case. Again, what you (MS Access SQL) suggest for any gives you a non-deterministic result, which is not what standard SQL encourages. The standard SQL function any returns a boolean that is true if any of the conditions is true. It is best used in having clauses:
Some database engines like Microsoft Access supports FIRST () as an aggregate function and I was using it in cases I know the column will only have one value in the group. Potentially, the database engine can optimise this as if it reaches any value it can mark this value as already calculated.
Regarding first
/last
The syntax supported by Microsoft Access SQL doesn't make sense in standard SQL:
SELECT
First(LastName) as First,
Last(LastName) as Last
FROM Employees
(source)
In Standard SQL, grouping takes places before sorting. Normally, groups are not sorted. That means, it is undefined which row comes first/last. Standard SQL generally aims to avoid constructs that have nondeterministic behaviour (exceptions exist).
Standard SQL offers to so-called ordered set functions that accept an within group (order by...)
clause to establish an order in a group prior to aggregation:
SELECT
PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY val)
FROM ...
The range for the argument of percentile_disc
is 0
to 1
whereas 0
is the first result and 1
the last. 0.5
is the median (this is the common use-case for percentile_disc
).
However, standard SQL does not offer first
/last
as ordered set function, but percentile_disc
with an argument of 0
is basically first, while the value 1
would basically give you the last result.
The more common SQL way to obtain the first/last value is to use a top-n query:
SELECT LastName
FROM Employees
ORDER BY ...
FETCH FIRST 1 ROW ONLY
Fetching first and last value in one go is a little bit awkward.
Other than that, standard SQL also offers the window functions first_value
and last_value
to pick those values out of a partition without grouping.
Regarding any
Standard SQL has an aggregate function any
but for a different use case. Again, what you (MS Access SQL) suggest for any gives you a non-deterministic result, which is not what standard SQL encourages.
The standard SQL function any
returns a boolean that is true if any of the conditions is true. It is best used in having
clauses:
SELECT
*
FROM ..
GROUP BY ...
HAVING ANY(<condition>)
This remove all groups where no <condition>
evaluates to true.
References:
WITHIN GROUP
: https://www.slideshare.net/MarkusWinand/modern-sql/105
every
function (which is similar to any
): https://blog.jooq.org/2014/12/18/a-true-sql-gem-you-didnt-know-yet-the-every-aggregate-function/
FIRST
(or better ANY_VALUE
as MySQL calls the function) is not in the SQL standard. This is probably because it is hardly needed in a standard-compliant DBMS.
You say you use FIRST
"in cases I know the column will only have one value in the Group". In a well-built database such a case should hardly ever occur. Maybe you are rather using it, because MS Access (and several other DBMS) violate the standard when it comes to aggregation with a GROUP BY
clause. An example:
select department_id, d.department_name, count(*) as num_employees
from employees e
join departments d using (department_id)
group by d.department_id;
You may want to use FIRST(d.department_name)
here, because you know that per department_id
there will be just one department_name
of course. But so does the DBMS (or better: it should!). In standard SQL the above query is valid, because the department_name
is functionally dependent on the department_id
. No need hence for a FIRST
or ANY_VALUE
function.
MySQL introduced ANY_VALUE
mainly in order to deal with cases where the DBMS fails to detect the functional dependency, but again these cases should be extremely rare. I like the function, because it gives you the opportunity to say "I don't care which", e.g. give me the departments and one leader per department (i.e. in case there are two department leaders: one of them arbitrarily chosen). But well, I guess in the standard comittee they decided that MIN
or MAX
would serve the purpose in such rare cases, too.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With