Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Azure SQL Server Read Scale-Out Performance

As a test, we recently started sending some queries to our read scale-out SQL Server in Azure. Our primary DB is a 40 core BC. We have noticed some pretty poor performance with queries taking up to 10x or longer time to perform than in our primary DB.

I'm assuming there isn't much I can do about this? There is no query store and it doesn't look like there is any way to tune the DB?

like image 463
Shawn Clark Avatar asked Mar 06 '26 12:03

Shawn Clark


1 Answers

We had similar issue with read scale-out replica. We used Azure SQL vCore 40 Business Critical for OLTP. After countless hours of debugging and monitoring we determined that read scale-out replica is heavely affected by the number of writes in primary.

In our case we observed high number of SOS_SCHEDULER_YIELD wait type for read queries in read scale-out replica and CPU usage up to 100%. If we switched the same read queries to the primary database CPU never went above 30% and SOS_SCHEDULER_YIELD was not prevalent wait type.

Following query were showing that SOS_SCHEDULER_YIELD was dominant signal wait time in read scale-out, but not in primary.

select * from sys.dm_os_wait_stats 
order by signal_wait_time_ms desc

We reduced our problem to small test case with copy of our production database, only 3 read queries, 1 heavy write query and .NET app that runs read queries in parallel. We used 12 vCore Business Critical database.

  • Test 1: Execute read queries on primary. Result: ~ 1500 read queries/second, 60% CPU usage in sys.dm_db_resource_stats in primary
  • Test 2: Execute read queries on read-scale out. Result: ~ 1500 read queries/second, 60% CPU usage in sys.dm_db_resource_stats in read scale-out
  • Test 3: Execite write query on primary and read queries on primary. Result: ~ 1500 read queries/second, 70% CPU usage in sys.dm_db_resource_stats in primary
  • Test 4: Execite write query on primary and read queries on read scale-out. Result: ~ 800 read queries/second, 100% CPU usage in sys.dm_db_resource_stats in read scale out and 10% in primary

Microsoft support and product team confirmed that there is an issue with spin locks in current replication model in Azure SQL and made temporary fix for our database - switched replication from parallel to some other model. This solved the issue for us. We are not sure if fix went to all of the databases.

like image 58
samfromlv Avatar answered Mar 08 '26 21:03

samfromlv



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!