
SqlDataReader and SQL Server 2016 FOR JSON splits json in chunks of 2k bytes

Recently I played around with the new FOR JSON AUTO feature of Azure SQL Database.

When I select a lot of records, for example with this query:

Select
    Wiki.WikiId
    , Wiki.WikiText
    , Wiki.Title
    , Wiki.CreatedOn
    , Tags.TagId
    , Tags.TagText
    , Tags.CreatedOn
From
    Wiki
Left Join
    (WikiTag
Inner Join 
    Tag as Tags on WikiTag.TagId = Tags.TagId) on Wiki.WikiId = WikiTag.WikiId
For Json Auto

and then read the result with the C# SqlDataReader:

using System.Collections.Generic;
using System.Data.SqlClient;

var connectionString = ""; // connection string
var sql = "";  // query from above
var chunks = new List<string>();

using (var connection = new SqlConnection(connectionString)) 
using (var command = connection.CreateCommand()) {
    command.CommandText = sql;
    connection.Open();

    // Dispose the reader as soon as all rows have been consumed
    using (var reader = command.ExecuteReader()) {
        while (reader.Read()) {
            chunks.Add(reader.GetString(0)); // Reads in chunks of ~2K Bytes
        }
    }
}

var json = string.Concat(chunks);

I get a lot of chunks of data.

Why do we have this limitation? Why don't we get everything in one big chunk?

When I read an nvarchar(max) column, I get everything in one chunk.

Thanks for an explanation

VSDekar asked Aug 09 '17 16:08


1 Answer

From Format Query Results as JSON with FOR JSON:

Output of the FOR JSON clause

The result set contains a single column.

A small result set may contain a single row.

A large result set splits the long JSON string across multiple rows. By default, SQL Server Management Studio (SSMS) concatenates the results into a single row when the output setting is Results to Grid. The SSMS status bar displays the actual row count.

Other client applications may require code to recombine lengthy results into a single, valid JSON string by concatenating the contents of multiple rows. For an example of this code in a C# application, see Use FOR JSON output in a C# client app.
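The recombination step described in the quote can be sketched roughly as follows. This is only an illustration, not the code from the linked Microsoft article: the names ForJsonChunks, ReadAll and FakeReader are made up for this example, and an in-memory DataTable stands in for the SqlDataReader so the snippet runs without a database.

```csharp
using System.Data;
using System.Text;

public static class ForJsonChunks
{
    // Hypothetical helper: appends column 0 of every row, in row order,
    // and returns the combined text. Zero rows yields an empty string.
    public static string ReadAll(IDataReader reader)
    {
        var builder = new StringBuilder();
        while (reader.Read())
        {
            // Each row carries one fragment of the JSON string.
            builder.Append(reader.GetString(0));
        }
        return builder.ToString();
    }

    // Hypothetical stand-in for a FOR JSON result set: one string column,
    // one row per fragment. Used only to try ReadAll without SQL Server.
    public static IDataReader FakeReader(params string[] fragments)
    {
        var table = new DataTable();
        table.Columns.Add("json", typeof(string));
        foreach (var fragment in fragments)
        {
            table.Rows.Add(new object[] { fragment });
        }
        return table.CreateDataReader();
    }
}
```

Since SqlDataReader implements IDataReader, a call such as ForJsonChunks.ReadAll(command.ExecuteReader()) should be able to replace the List<string> plus string.Concat from the question. A StringBuilder is used here simply to avoid holding every fragment in a separate list; the result is the same.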

I would say it is strictly for performance reasons, similar to XML. For more, see "SELECT FOR XML AUTO and return datatypes" and "What does server side FOR XML return?":

In SQL Server 2000 the server side XML publishing - FOR XML (see http://msdn2.microsoft.com/en-us/library/ms178107(SQL.90).aspx) - was implemented in the layer of code between the query processor and the data transport layer. Without FOR XML a SELECT query is executed by the query processor and the resulting rowset is sent to the client side by the server side TDS code. When a SELECT statement contains FOR XML the query processor produces the result the same way as without FOR XML and then FOR XML code formats the rowset as XML. For maximum XML publishing performance FOR XML does streaming XML formatting of the resulting rowset and directly sends its output to the server side TDS code in small chunks without buffering whole XML in the server space. The chunk size is 2033 UCS-2 characters. Thus, XML larger than 2033 UCS-2 characters is sent to the client side in multiple rows each containing a chunk of the XML. SQL Server uses a predefined column name for this rowset with one column of type NTEXT - “XML_F52E2B61-18A1-11d1-B105-00805F49916B” – to indicate chunked XML rowset in UTF-16 encoding. This requires special handling of the XML chunk rowset by the APIs to expose it as a single XML instance on the client side. In ADO.NET, one needs to use ExecuteXmlReader, and in ADO/OLEDB one should use the ICommandStream interface.
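To make the chunk arithmetic in that passage concrete, here is a small simulation. It only imitates the splitting on the client side; it is not how the server is implemented, and applying the 2033-character figure to FOR JSON output is an assumption based on the roughly 2K chunks seen in the question. ChunkMath and Split are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkMath
{
    // Chunk size quoted in the passage above, counted in UTF-16 code units.
    public const int ChunkSize = 2033;

    // Cuts text into consecutive pieces of at most chunkSize characters,
    // one piece per simulated row. Empty input produces no rows.
    public static List<string> Split(string text, int chunkSize = ChunkSize)
    {
        var rows = new List<string>();
        for (int offset = 0; offset < text.Length; offset += chunkSize)
        {
            int length = Math.Min(chunkSize, text.Length - offset);
            rows.Add(text.Substring(offset, length));
        }
        return rows;
    }
}
```

Under this model a 5000-character document would arrive as three rows (2033 + 2033 + 934 characters), and concatenating the rows in order gives back the original text, which is why the client has to read every row rather than only the first.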

Lukasz Szozda answered Nov 06 '22 06:11