Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How is a CLR table valued function 'streaming''?

The MSDN Docs on table-valued Sql Clr functions states:

Transact-SQL table-valued functions materialize the results of calling the function into an intermediate table. ... In contrast, CLR table-valued functions represent a streaming alternative. There is no requirement that the entire set of results be materialized in a single table. The IEnumerable object returned by the managed function is directly called by the execution plan of the query that calls the table-valued function, and the results are consumed in an incremental manner. ... It is also a better alternative if you have very large numbers of rows returned, because they do not have to be materialized in memory as a whole.

Then I find out that no data access is allowed in the 'Fill row' method. This means that you still have to do all of your data access in the init method and keep it in memory, waiting for 'Fill row' to be called. Have I misunderstood something? If I don't force my results into an array or list, I get an error: 'ExecuteReader requires an open and available Connection. The connection's current state is closed.'

Code sample:

[<SqlFunction(DataAccess = DataAccessKind.Read, FillRowMethodName = "Example8Row")>]
static member InitExample8() : System.Collections.IEnumerable = 
   let c = cn() // opens a context connection
   // I'd like to avoid forcing enumeration here:
   let data = getData c |> Array.ofSeq
   data :> System.Collections.IEnumerable

static member Example8Row ((obj : Object),(ssn: SqlChars byref)) = 
   do ssn <- new SqlChars(new SqlString(obj :?> string))
   ()

I'm dealing with several million rows here. Is there any way to do this lazily?

like image 834
Robert Jeppesen Avatar asked Apr 13 '11 18:04

Robert Jeppesen


People also ask

What is the function of CLR?

CLR functions can be used to access external resources such as files, network resources, Web Services, other databases (including remote instances of SQL Server). This can be achieved by using various classes in the . NET Framework, such as System.IO , System.

What is CLR stored procedure in SQL Server?

Stored procedures are routines that cannot be used in scalar expressions. They can return tabular results and messages to the client, invoke data definition language (DDL) and data manipulation language (DML) statements, and return output parameters.

What is CLR in SSIS?

The common language runtime (CLR) is the heart of the Microsoft . NET Framework and provides the execution environment for all . NET Framework code. Code that runs within the CLR is referred to as managed code.

What is CLR time in SQL Server?

SQL CLR or SQLCLR (SQL Common Language Runtime) is technology for hosting of the Microsoft . NET common language runtime engine within SQL Server. The SQLCLR allows managed code to be hosted by, and run in, the Microsoft SQL Server environment.


3 Answers

I'm assuming you're using SQL Server 2008. As mentioned by a Microsoft employee on this page, 2008 requires methods to be marked with DataAccessKind.Read much more frequently than 2005. One of those times is when the TVF participates in a transaction (which seemed to always be the case, when I tested). The solution is to specify enlist=false in the connection string, which, alas, cannot be combined with context connection=true. That means your connection string needs to be in typical client format: Data Source=.;Initial Catalog=MyDb;Integrated Security=sspi;Enlist=false and your assembly must be created with permission_set=external_access, at minimum. The following works:

using System;
using System.Collections;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;

namespace SqlClrTest {
    public static class Test {
        [SqlFunction(
            DataAccess = DataAccessKind.Read,
            SystemDataAccess = SystemDataAccessKind.Read,
            TableDefinition = "RowNumber int",
            FillRowMethodName = "FillRow"
            )]
        public static IEnumerable MyTest(SqlInt32 databaseID) {
            using (var con = new SqlConnection("data source=.;initial catalog=TEST;integrated security=sspi;enlist=false")) {
                con.Open();
                using (var cmd = new SqlCommand("select top (100) RowNumber from SSP1 where DatabaseID = @DatabaseID", con)) {
                    cmd.Parameters.AddWithValue("@DatabaseID", databaseID.IsNull ? (object)DBNull.Value : databaseID.Value);
                    using (var reader = cmd.ExecuteReader()) {
                        while (reader.Read())
                            yield return reader.GetInt32(0);
                    }
                }
            }
        }
        public static void FillRow(object obj, out SqlInt32 rowNumber) {
            rowNumber = (int)obj;
        }
    }
}

Here's the same thing in F#:

namespace SqlClrTest

module Test =

    open System
    open System.Data
    open System.Data.SqlClient
    open System.Data.SqlTypes
    open Microsoft.SqlServer.Server

    [<SqlFunction(
        DataAccess = DataAccessKind.Read,
        SystemDataAccess = SystemDataAccessKind.Read,
        TableDefinition = "RowNumber int",
        FillRowMethodName = "FillRow"
        )>]
    let MyTest (databaseID:SqlInt32) =
        seq {
            use con = new SqlConnection("data source=.;initial catalog=TEST;integrated security=sspi;enlist=false")
            con.Open()
            use cmd = new SqlCommand("select top (100) RowNumber from SSP1 where DatabaseID = @DatabaseID", con)
            cmd.Parameters.AddWithValue("@DatabaseID", if databaseID.IsNull then box DBNull.Value else box databaseID.Value) |> ignore
            use reader = cmd.ExecuteReader()
            while reader.Read() do
                yield reader.GetInt32(0)
        } :> System.Collections.IEnumerable

    let FillRow (obj:obj) (rowNumber:SqlInt32 byref) =
        rowNumber <- SqlInt32(unbox obj)

The good news is: Microsoft considers this a bug.

like image 172
Daniel Avatar answered Nov 17 '22 19:11

Daniel


Yes, you would need to pull the results into memory and then return from there. Although the intention would be that you avoid the need to do such operations.

You can see an example of the approach in one of the sections of the MSDN doc you linked to ("Sample: Returning the results of a SQL Query")

The examples are a bit contrived though as a real-world implementation of email validation would use a scalar rather than table function - returning a bool for each input email value rather than a list of those that are invalid.

Are you able to explain a bit more about what you're trying to achieve? There might be a better way of structuring the function.

like image 35
rob Avatar answered Nov 17 '22 18:11

rob


What you can do is to wrap an SqlDataReader class with an IEnumerable which uses an enumerator which, when its "Next" method is called, does MoveNext on the SqlDataReader and returns the SqlDataReader. Then, your FillRow method expects SqlDataReader as a class. If you have your enumerator close the database connection and the SqlDataReader when it can't "next" any more, then you've effectively streamed your output to the FillRows function. You can do this with a ContextConnection=true as well...

...the trouble here is that you have to be able to return the results of an actual query: if you're doing more complex things to create your result set, then you're out of luck.

like image 1
Dan Sutton Avatar answered Nov 17 '22 18:11

Dan Sutton