"cursor like" reading inside a CLR procedure/function

I have to implement an algorithm on data which is (for good reasons) stored inside SQL Server. The algorithm does not fit SQL very well, so I would like to implement it as a CLR function or procedure. Here's what I want to do:

1) Execute several queries (usually 20-50, but up to 100-200) which all have the form select a,b,... from some_table order by xyz. There's an index which fits that query, so the result should be available more or less without any calculation.
2) Consume the results step by step. The exact stepping depends on the results, so it's not exactly predictable.
3) Aggregate some result by stepping over the results. I will only consume the first part of each result, but cannot predict how much I will need. The stop criterion depends on a threshold inside the algorithm.

My idea was to open several SqlDataReaders, but I have two problems with that solution:

1) You can have only one SqlDataReader per connection, and inside a CLR method I have only one connection, as far as I understand.
2) I don't know how to tell SqlDataReader to read data in chunks. I could not find documentation on how SqlDataReader is supposed to behave. As far as I understand, it prepares the whole result set and loads the whole result into memory, even if I consume only a small part of it.

Any hints on how to solve this as a CLR method? Or is there a lower-level interface to SQL Server which is more suitable for my problem?

Update: I should have made two points more explicit:

1) I'm talking about big data sets, so a query might result in 1 million records, but my algorithm would consume only the first 100-200 of them. But as I said before: I don't know the exact number beforehand.
2) I'm aware that SQL might not be the best choice for that kind of algorithm. But due to other constraints it has to be SQL Server. So I'm looking for the best possible solution.

asked Aug 22 '11 20:08

Achim

2 Answers

SqlDataReader does not read the whole result set; you are confusing it with the DataSet class. It reads row by row, as the .Read() method is called. If the client does not consume the result set, the server suspends query execution because it has no room to write the output (the selected rows) into. Execution resumes as the client consumes more rows (that is, as SqlDataReader.Read is called). There is even a special command behavior flag, SequentialAccess, that instructs ADO.NET not to pre-load the entire row in memory, which is useful for accessing large BLOB columns in a streaming fashion (see Download and Upload images from SQL Server via ASP.Net MVC for a practical example).
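
To illustrate the row-by-row behaviour described above, here is a minimal sketch of consuming a reader only until a threshold is reached. The names EarlyStopSketch, FakeResult and ConsumeUntil are hypothetical, and the in-memory DataTable is only a stand-in so the sketch runs without a server; the loop itself depends only on IDataReader, which SqlDataReader also implements.

```csharp
using System;
using System.Data;

public static class EarlyStopSketch
{
    // Hypothetical stand-in for a large ordered query result: an in-memory
    // table with one int column holding 1..count.
    public static IDataReader FakeResult(int count)
    {
        var table = new DataTable();
        table.Columns.Add("a", typeof(int));
        for (int i = 1; i <= count; i++)
        {
            table.Rows.Add(i);
        }
        return table.CreateDataReader();
    }

    // Adds up column 0 row by row and stops as soon as the running total
    // reaches the threshold. Returns the number of rows actually read.
    public static int ConsumeUntil(IDataReader reader, int threshold, out int total)
    {
        int rows = 0;
        total = 0;

        // The threshold is tested first, so no further row is fetched
        // once the stop criterion is met.
        while (total < threshold && reader.Read())
        {
            total += reader.GetInt32(0);
            rows++;
        }
        return rows;
    }
}
```

With FakeResult(1000) and a threshold of 10, the loop reads four rows (1 + 2 + 3 + 4) and leaves the rest untouched. One caveat for a real SqlDataReader on a regular connection: as far as I know, the documentation for SqlDataReader.Close recommends calling Cancel on the associated SqlCommand before closing when a large result has not been fully read, since closing may otherwise take longer.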

You can have multiple result sets (SqlDataReader instances) active on a single connection when MARS (Multiple Active Result Sets) is enabled. However, MARS is incompatible with SQLCLR context connections.

So you can create a CLR streaming TVF to do some of what you need in CLR, but only if you have one single SQL query as the source. Multiple queries would require you to abandon the context connection and instead use a fully fledged connection, i.e. connect back to the same instance in a loopback. This would allow MARS and thus let you consume multiple result sets. But a loopback has its own issues, as it breaks the transaction boundaries you have with the context connection. Specifically, with a loopback connection your TVF won't be able to read the changes made by the same transaction that called the TVF, because it runs in a different transaction on a different connection.
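
As a rough sketch of the loopback approach, assuming a regular SqlConnection with MARS enabled: the class and method names below are hypothetical, the round-robin stepping is only a placeholder for the real algorithm, and the server and database names are supplied by the caller. Inside SQLCLR, a regular (non-context) connection also requires the assembly to be registered with at least the EXTERNAL_ACCESS permission set.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public static class MultiReaderSketch
{
    // Builds a connection string for a regular connection with MARS enabled.
    // Integrated security is an assumption made for this sketch.
    public static string LoopbackConnectionString(string server, string database)
    {
        var builder = new SqlConnectionStringBuilder
        {
            DataSource = server,
            InitialCatalog = database,
            IntegratedSecurity = true,
            MultipleActiveResultSets = true
        };
        return builder.ConnectionString;
    }

    // Opens one reader per query on the same, already open MARS connection.
    // The caller is responsible for disposing the readers.
    public static List<IDataReader> OpenReaders(SqlConnection connection, IEnumerable<string> queries)
    {
        var readers = new List<IDataReader>();
        foreach (string sql in queries)
        {
            var command = new SqlCommand(sql, connection);
            readers.Add(command.ExecuteReader());
        }
        return readers;
    }

    // Placeholder stepping logic: takes one row from each reader in turn and
    // adds up column 0 until the threshold is reached or all readers are empty.
    public static int StepUntil(IList<IDataReader> readers, int threshold)
    {
        int total = 0;
        bool anyRow = true;
        while (total < threshold && anyRow)
        {
            anyRow = false;
            foreach (IDataReader reader in readers)
            {
                if (total >= threshold)
                {
                    break;
                }
                if (reader.Read())
                {
                    total += reader.GetInt32(0);
                    anyRow = true;
                }
            }
        }
        return total;
    }
}
```

StepUntil is written against IDataReader so it can be tried out with any reader implementation, for example readers created from in-memory DataTable objects, before it is pointed at a real server.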

answered Oct 07 '22 23:10

Remus Rusanu

SQL is designed to work against huge data sets, and is extremely powerful. With set based logic it's often unnecessary to iterate over the data to perform operations, and there are a number of built-in ways to do this within SQL itself.

1) write set based logic to update the data without cursors

2) use deterministic User Defined Functions with set based logic (you can do this with the SqlFunction attribute in CLR code). A non-deterministic function, meaning one whose output is not always the same for the same input, will have the effect of turning the query into a cursor internally.

[SqlFunction(IsDeterministic = true, IsPrecise = true)]
public static int algorithm(int value1, int value2)
{
    int value3 = ... ;
    return value3;
}

3) use cursors as a last resort. This is a powerful way to execute logic per row on the database but has a performance impact. It appears from this article that CLR can outperform SQL cursors (thanks Martin).

I saw your comment that the complexity of using set based logic was too much. Can you provide an example? There are many SQL ways to solve complex problems - CTE, Views, partitioning etc.

Of course you may well be right in your approach, and I don't know what you are trying to do, but my gut says leverage the tools of SQL. Spawning multiple readers isn't the right way to approach the database implementation. It may well be that you need multiple threads calling into a SP to run concurrent processing, but don't do this inside the CLR.

To answer your question, with CLR implementations (and IDataReader) you don't really need to page results in chunks because you are not loading data into memory or transporting data over the network. IDataReader gives you access to the data stream row by row. By the sound of it, your algorithm determines the number of records that need updating, so when that point is reached simply stop calling Read() and end there.

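SqlMetaData[] columns = new SqlMetaData[3];
columns[0] = new SqlMetaData("Value1", SqlDbType.Int);
columns[1] = new SqlMetaData("Value2", SqlDbType.Int);
columns[2] = new SqlMetaData("Value3", SqlDbType.Int);

SqlDataRecord record = new SqlDataRecord(columns);
SqlContext.Pipe.SendResultsStart(record);

// comm is a SqlCommand prepared earlier (not shown)
using (SqlDataReader reader = comm.ExecuteReader())
{
    bool flag = true;

    // test the flag first so no extra row is fetched after we decide to stop
    while (flag && reader.Read())
    {
        int value1 = Convert.ToInt32(reader[0]);
        int value2 = Convert.ToInt32(reader[1]);

        // some algorithm
        int newValue = ...;

        // fill the output record; column ordinals are zero-based
        record.SetInt32(0, value1);
        record.SetInt32(1, value2);
        record.SetInt32(2, newValue);

        SqlContext.Pipe.SendResultsRow(record);

        // keep going?
        flag = newValue < 100;
    }
}

// every SendResultsStart must be matched by SendResultsEnd
SqlContext.Pipe.SendResultsEnd();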

answered Oct 07 '22 23:10

TheCodeKing

Tags:

c#

sql-server-2008

sqlclr
