Queries that include a column with large NVARCHAR values from SQL Azure are slow

I have a table in an Azure database that has begun responding slowly to queries. The query looks like:

SELECT [Id] --nvarchar(128), PrimaryKey
      ,[Name] --nvarchar(max)
      ,[Description] --nvarchar(max)
      ,[Modified]  --datetime2(7)
      ,[LastModifiedBy] --nvarchar(max)
      ,[Opened]  --datetime2(7)
      ,[Editor] --nvarchar(max)
      ,[Json] --nvarchar(max)   <--THIS IS GIVING ME PROBLEMS
      ,[IsActive] --bit
  FROM [dbo].[TableName]

Specifically, when I include the [Json] column in the query, the SQL query performance goes from less than a second to minutes. Even requesting only a single record can take minutes when the [Json] column is included. This column contains long JSON-formatted strings (~500,000 characters). The performance only breaks down when this column is included -- the other NVARCHAR(max) columns that contain smaller strings are not a problem.

I discovered this issue through performance problems in an MVC5 application using an Entity Framework LINQ-to-Entities query:

var model=await db.TableName.FirstOrDefaultAsync(s => s.Id == id);

which produced a SQL query like the one above. The Edit method for a single record, which had run with no problems on a local development machine, was taking minutes to load on the server. I then ran direct database queries to see what the issue was and found the long query times.

This performance issue is not consistent across different methods of querying.

I have a turnaround time of 3 minutes with the following query:

SELECT Json FROM [dbo].[TableName] WHERE [Id]=<id>

The turnaround time appears to grow exponentially with the length of the returned string. For example, this query takes about 10 seconds:

SELECT SUBSTRING(Json,1,50000) FROM [dbo].[TableName] WHERE [Id]=<id>

Queries like the following, which run entirely on the server, take less than a second:

DECLARE @variable nvarchar(max);
SELECT @variable=Json FROM [dbo].[TableName] where Id='<id>';
SELECT LEN(@variable);

but actually retrieving the data as in the following takes me back up to several minutes:

DECLARE @variable nvarchar(max);
SELECT @variable=Json FROM [dbo].[TableName] where Id='<id>';
SELECT @variable;

My ultimate goal is to get Entity Framework's LINQ-to-Entities query to perform at a reasonable speed so I can use the data in C#, and I do not think I can force EF to produce a query like the one above dynamically.

I have never encountered this difficulty before with other tables storing large strings. Is there a setting I have mistakenly configured, or is there a best practice for building EF LINQ-to-Entities queries in this situation?
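
For reference, a rough sketch of the kind of projection that would avoid pulling [Json] with the rest of the row (the entity and property names here just mirror the table definition above; the result is an anonymous type, not the tracked entity):

// Sketch only: project away the large [Json] column so EF does not
// materialize it with the rest of the row, then fetch it on demand.
// Requires: using System.Linq; using System.Data.Entity; (EF6 async extensions)
var model = await db.TableName
    .Where(s => s.Id == id)
    .Select(s => new
    {
        s.Id,
        s.Name,
        s.Description,
        s.Modified,
        s.LastModifiedBy,
        s.Opened,
        s.Editor,
        s.IsActive
        // [Json] intentionally omitted
    })
    .FirstOrDefaultAsync();

// Pull the large payload only when it is actually required:
var json = await db.TableName
    .Where(s => s.Id == id)
    .Select(s => s.Json)
    .FirstOrDefaultAsync();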

For comparison, there is no performance issue when running the queries on a local instance of SQL Server with a copy of the same database; all queries return in less than a second.

-UPDATE-

I have been monitoring the database, and this issue has disappeared without any code change. All query response times are back to under a second. However, there was also no notification of a service outage from Azure. In fact, throughout the duration of the issue the database was completely accessible, and the only issue was the slow queries involving fields returning large string values.

The downside is that I can no longer reproduce the issue.

For others with this issue on Azure (which appears to be intermittent), the diagnostic symptoms of this behavior are:

  1. no Azure-SQL Server outage
  2. healthy response times for queries that do not return large string values
  3. normal resource usage of the Azure-SQL Server for any queries, even if they return large string values
  4. In dependent applications, two types of errors are thrown by the connecting application: a) connection timeout errors, and b) closed connection errors. There is no contextual information that distinguishes when either type of error gets thrown.
  5. The response time for queries involving long strings grows roughly exponentially with the length of the string returned. For example, 10 characters is instantaneous, 50,000 takes 10 seconds, 500,000 takes minutes, etc. However, the response times are not consistent.
  6. String processing (even on very long strings) performed entirely on the server that does NOT require returning a long string value takes a normal amount of time (see the sketch after this list).
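
A minimal way to check symptoms 5 and 6 from the application side (a sketch only; the connection string is assumed, and the table and column names are taken from the question):

// Sketch: compare server-side-only processing (LEN) against actually
// returning the string. If the transfer of data is the problem, the first
// query stays fast while the second slows down with the length of [Json].
// Requires: using System; using System.Data.SqlClient; using System.Diagnostics;
static void TimeQuery(string connectionString, string sql)
{
    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand(sql, conn))
    {
        cmd.CommandTimeout = 600;      // the slow case can take minutes
        conn.Open();
        var sw = Stopwatch.StartNew();
        cmd.ExecuteScalar();           // forces the value to be read back to the client
        Console.WriteLine($"{sw.ElapsedMilliseconds} ms -- {sql}");
    }
}

// TimeQuery(connectionString, "SELECT LEN([Json]) FROM [dbo].[TableName] WHERE [Id] = '<id>'");
// TimeQuery(connectionString, "SELECT [Json] FROM [dbo].[TableName] WHERE [Id] = '<id>'");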

I will leave this question open in case someone has an actual answer, but it appears that the solution was simply to wait for Azure to resolve whatever they were modifying about their request handling. It appears that the issue was related to the transfer of data from the Azure database rather than to processing on the server. My top recommendation is not to tear apart code that is working flawlessly on the development box if the issue is characterized by the symptoms noted above.

asked Aug 23 '16 by longestwayround

1 Answer

It really is too bad you can't reproduce this error, because I have a thought on what your issue is, and it all comes from your statement "when I include the [Json] column in the query, the SQL query performance goes from less than a second to minutes." You also gave me the clue:

Queries like the following, which run entirely on the server, take less than a second:

DECLARE @variable nvarchar(max);
SELECT @variable=Json FROM [dbo].[TableName] where Id='<id>';
SELECT LEN(@variable);

When retrieving the data is the issue, you've likely got a problem with the wait type ASYNC_NETWORK_IO. Basically, sending the data out of SQL Server to the waiting application is the bottleneck.
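
If the problem comes back, one quick way to confirm that wait type from the client while the slow query runs on another connection is a sketch like the following (sys.dm_exec_requests needs VIEW DATABASE STATE permission in Azure SQL Database; the connection string is an assumption):

// Sketch: look for sessions currently waiting on ASYNC_NETWORK_IO.
// Requires: using System; using System.Data.SqlClient;
using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand(
    @"SELECT session_id, wait_type, wait_time
        FROM sys.dm_exec_requests
       WHERE wait_type = 'ASYNC_NETWORK_IO';", conn))
{
    conn.Open();
    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read())
            Console.WriteLine($"session {reader["session_id"]}: waiting {reader["wait_time"]} ms");
    }
}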

My first question is where the application is running. Is it in Azure, in the same data center as the database, or is it running from outside that data center? The closer you can get your application to the data, the less you will see this wait type.

The other question is about the hardware running the application: does it have sufficient memory to receive all that data and then process it in a useful way? Occasionally the network wait is actually a problem with underpowered hardware on the application side.

I have a couple of additional thoughts to share. If you're going to deal with very large JSON objects, have you considered using DocumentDB to store them instead of Azure SQL Database? It's optimized for that kind of workload, plus you can write SQL queries against the JSON documents stored in DocumentDB.
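
A minimal sketch of that idea with the DocumentDB .NET SDK (the account URI, key, database, and collection names are placeholders, not anything from the question):

// Sketch: store the large JSON payload as a document instead of an NVARCHAR(MAX) column.
// Requires the Microsoft.Azure.DocumentDB package:
// using System; using System.Linq; using Microsoft.Azure.Documents.Client;
var client = new DocumentClient(new Uri("https://<account>.documents.azure.com:443/"), "<key>");
var collection = UriFactory.CreateDocumentCollectionUri("<database>", "<collection>");

// Write the payload once...
await client.CreateDocumentAsync(collection, new { id = "<id>", payload = jsonObject });

// ...and query it back with DocumentDB's SQL syntax only when it is needed.
var doc = client.CreateDocumentQuery<dynamic>(collection,
        "SELECT * FROM c WHERE c.id = '<id>'")
    .AsEnumerable()
    .FirstOrDefault();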

Regarding:

In dependent applications, two types of errors are thrown by the connecting application: a) connection timeout errors, and b) closed connection errors. There is no contextual information that distinguishes when either type of error gets thrown.

You're going to need some kind of retry policy in any application hitting a database. It's critical when dealing with Azure SQL Database, since you have three copies of your database at any given time, and you may need to suffer a failover in the middle of the day. If you have that retry policy, end users will never know there was a problem, since the secondary copies are available in under a second.
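
For EF6, a minimal sketch of such a retry policy looks roughly like this (the class name, retry count, and maximum delay are arbitrary placeholder values):

// Sketch: register SqlAzureExecutionStrategy so EF6 retries transient
// Azure SQL failures automatically. Picked up by convention when this
// class lives in the same assembly as your DbContext.
// Requires: using System; using System.Data.Entity; using System.Data.Entity.SqlServer;
public class AppDbConfiguration : DbConfiguration
{
    public AppDbConfiguration()
    {
        // Retry up to 5 times, waiting at most 10 seconds between attempts.
        SetExecutionStrategy("System.Data.Entity.SqlServer",
            () => new SqlAzureExecutionStrategy(5, TimeSpan.FromSeconds(10)));
    }
}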

I hope this helps!

answered Oct 25 '22 by Shannon Lowder