If a statement return rows doing a simple select over the data base, is there performance difference between implement it using Function and Procedures? I know it is preferable to do it using function, but it is really faster?
There is no difference in speed between a query run inside a function and one run inside a procedure. Stored procedures have problems aggregating results, they cannot be composed with other stored procedures.
The biggest advantage of stored procedures is that they're compiled into the database, thus allowing high-speed processing.
The user defined function only allows select statements to be read while DML statements are not allowed. On the other hand, stored procedure allows use of both select statements, as well as the DML statements, which can also be updated and manipulated.
SPs can change database objects. Inline User-Defined Functions can be treated like views with parameters and can be used in row set operations and JOINs. Cannot JOIN the output of a Stored procedure. UDF can be used in the SQL statements anywhere in the WHERE / HAVING / SELECT sections.
There is no difference in speed between a query run inside a function and one run inside a procedure.
Stored procedures have problems aggregating results, they cannot be composed with other stored procedures. The onyl solution is really cumbersome as it involves catching the procedure output into a table with INSERT ... EXEC ...
and then using the resulted table.
Functions have the advantage of being highly composable as a table value function can be placed anywhere a table expression is expected (FROM, JOIN, APPLY, IN etc). But functions have some very severe limitations in terms of what is permitted in a function and what is not, exactly because they can appear anywhere in a query.
So is really apple to oranges. The decision is not driven be performance, but by requirements. As a general rule anything that returns a dataset should be a view or a table valued function. Anything that manipulates data must be a procedure.
There is a popular misconception that UDFs have adverse effect on performance. As a blanket statement, this is simply not true. In fact, inline table-valued UDFs are actually macros – the optimizer is very well capable rewriting queries involving them as well as optimizing them. However, scalar UDFs are usually very slow. I will provide a short example.
Here is the script to create and populate the tables:
CREATE TABLE States(Code CHAR(2), [Name] VARCHAR(40), CONSTRAINT PK_States PRIMARY KEY(Code))
GO
INSERT States(Code, [Name]) VALUES('IL', 'Illinois')
INSERT States(Code, [Name]) VALUES('WI', 'Wisconsin')
INSERT States(Code, [Name]) VALUES('IA', 'Iowa')
INSERT States(Code, [Name]) VALUES('IN', 'Indiana')
INSERT States(Code, [Name]) VALUES('MI', 'Michigan')
GO
CREATE TABLE Observations(ID INT NOT NULL, StateCode CHAR(2), CONSTRAINT PK_Observations PRIMARY KEY(ID))
GO
SET NOCOUNT ON
DECLARE @i INT
SET @i=0
WHILE @i<100000 BEGIN
SET @i = @i + 1
INSERT Observations(ID, StateCode)
SELECT @i, CASE WHEN @i % 5 = 0 THEN 'IL'
WHEN @i % 5 = 1 THEN 'IA'
WHEN @i % 5 = 2 THEN 'WI'
WHEN @i % 5 = 3 THEN 'IA'
WHEN @i % 5 = 4 THEN 'MI'
END
END
GO
Consider the following query:
SELECT o.ID, s.[name] AS StateName
INTO dbo.ObservationsWithStateNames_Join
FROM dbo.Observations o LEFT OUTER JOIN dbo.States s ON o.StateCode = s.Code
/*
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 1 ms.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Observations'. Scan count 1, logical reads 188, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'States'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 187 ms, elapsed time = 188 ms.
*/
And compare it to a query involving an inline table valued UDF:
CREATE FUNCTION dbo.GetStateName_Inline(@StateCode CHAR(2))
RETURNS TABLE
AS
RETURN(SELECT [Name] FROM dbo.States WHERE Code = @StateCode);
GO
SELECT ID, (SELECT [name] FROM dbo.GetStateName_Inline(StateCode)) AS StateName
INTO dbo.ObservationsWithStateNames_Inline
FROM dbo.Observations
Both its execution plan and its execution costs are the same – the optimizer has rewritten it as an outer join. Don’t underestimate the power of the optimizer!
Here is a scalar UDF:
CREATE FUNCTION dbo.GetStateName(@StateCode CHAR(2))
RETURNS VARCHAR(40)
AS
BEGIN
DECLARE @ret VARCHAR(40)
SET @ret = (SELECT [Name] FROM dbo.States WHERE Code = @StateCode)
RETURN @ret
END
GO
Clearly the query using this UDF provides the same results but it has a different execution plan and it is dramatically slower:
/*
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 3 ms.
Table 'Worktable'. Scan count 1, logical reads 202930, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Observations'. Scan count 1, logical reads 188, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 11890 ms, elapsed time = 38585 ms.
*/
As you have seen, the optimizer can rewrite and optimize queries involving inline table valued UDFs. On the other hand, queries involving scalar UDFs are not rewritten by the optimizer – the execution of the last query includes one function call per row, which is very slow.
Not all UDFs are bad for performance.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With