I have a case where I need to translate (lookup) several values from the same table. The first way I wrote it, was using subqueries:
SELECT
(SELECT id FROM user WHERE user_pk = created_by) AS creator,
(SELECT id FROM user WHERE user_pk = updated_by) AS updater,
(SELECT id FROM user WHERE user_pk = owned_by) AS owner,
[name]
FROM asset
As I'm using this subquery a lot (that is, I have about 50 tables with these fields), and I might need to add some more code to the subquery (for example, "AND active = 1" ) I thought I'd put these into a user-defined function UDF and use that. But the performance using that UDF was abysmal.
CREATE FUNCTION dbo.get_user ( @user_pk INT )
RETURNS INT
AS BEGIN
RETURN ( SELECT id
FROM ice.dbo.[user]
WHERE user_pk = @user_pk )
END
SELECT dbo.get_user(created_by) as creator, [name]
FROM asset
The performance of #1 is less than 1 second. Performance of #2 is about 30 seconds...
Why, or more importantly, is there any way I can code in SQL server 2008, so that I don't have to use so many subqueries?
Just a litte more explanation of when this is useful. This simple query (that is, get userid) gets a lot more complex when I want to have a text for a user, since I have to join with profile to get the language, with a company to see if the language should be fetch'ed from there instead, and with the translation table to get the translated text. And for most of these queries, performance is a secondary issue to readability and maintainability.
using function (included that subquery) has better performance, when you define a function, the function will not run while calling the function.
CTE can be more readable: Another advantage of CTE is CTE are more readable than Subqueries. Since CTE can be reusable, you can write less code using CTE than using subquery. Also, people tend to follow the logic and ideas easier in sequence than in a nested fashion.
A Sub-Query Does Not Hurt Performance.
It is necessary to execute the original SELECT here, without any pushed-down equalities of the kind mentioned previously. Without this conversion, subqueries are slow.
As other posters have suggested, using joins will definitely give you the best overall performance.
However, since you've stated that that you don't want the headache of maintaining 50-ish similar joins or subqueries, try using an inline table-valued function as follows:
CREATE FUNCTION dbo.get_user_inline (@user_pk INT)
RETURNS TABLE AS
RETURN
(
SELECT TOP 1 id
FROM ice.dbo.[user]
WHERE user_pk = @user_pk
-- AND active = 1
)
Your original query would then become something like:
SELECT
(SELECT TOP 1 id FROM dbo.get_user_inline(created_by)) AS creator,
(SELECT TOP 1 id FROM dbo.get_user_inline(updated_by)) AS updater,
(SELECT TOP 1 id FROM dbo.get_user_inline(owned_by)) AS owner,
[name]
FROM asset
An inline table-valued function should have better performance than either a scalar function or a multistatement table-valued function.
The performance should be roughly equivalent to your original query, but any future changes can be made in the UDF, making it much more maintainable.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With