Cosmos DB graph vs Azure SQL Server - Performance and cost

Imagine a social network app. Users follow other users, and users take photos. Photos can tag other users.

I'm trying to build an efficient Cosmos DB graph implementation for that app. For comparison, I also provide a SQL Server version and a benchmark.

Here is the graph: [image]

Here is a table version of it: [image]

Here is the Gremlin query:

g.V('c39f435b-350e-4d08-a7b6-dfcadbe4e9c5')
.out('follows').as('name')
.out('took').order(local).by('postedAt', decr).as('id', 'postedAt')
.select('id', 'name', 'postedAt').by(id).by('name').by('postedAt')
.limit(10)

Here is the equivalent SQL query (LINQ, actually):

Follows
.Where(f => f.FollowerId == "c39f435b-350e-4d08-a7b6-dfcadbe4e9c5")
.Select(f => f.Followees)
.SelectMany(f => f.Photos)
.OrderByDescending(f => f.PostedAt)
.Select(f => new { f.User.Name, f.Id, f.PostedAt})
.Take(10)

That user follows 136 users who collectively took 257 photos.

Both SQL Server and Cosmos DB are in the Azure West Europe region. I'm in France. I did a bit of testing in LINQPad.

  • The Gremlin query takes over 1.20 s and consumes about 330 RU. FYI, 400 RU/s costs $20/month.
  • The SQL query runs in 70 ms. The database is 10 DTU (1 instance of S0), so it costs €12.65/month.
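To put the RU figure in perspective, here is a rough back-of-the-envelope calculation using only the numbers above. The target of 10 feed loads per second is a hypothetical value chosen for illustration, and the function names are mine.

```python
# Figures taken from the measurements above.
RU_PER_FEED_QUERY = 330        # RU charged for one feed query
PROVISIONED_RU_PER_SEC = 400   # provisioned throughput


def max_queries_per_second(provisioned_ru: float, ru_per_query: float) -> float:
    """Upper bound on sustained queries per second within the RU budget."""
    return provisioned_ru / ru_per_query


def provisioned_ru_needed(queries_per_second: float, ru_per_query: float) -> float:
    """RU/s that would have to be provisioned to sustain a given query rate."""
    return queries_per_second * ru_per_query


feeds_per_second = max_queries_per_second(PROVISIONED_RU_PER_SEC, RU_PER_FEED_QUERY)
print(f"{feeds_per_second:.2f} feed queries per second")  # 1.21

# Hypothetical target: 10 feed loads per second.
print(provisioned_ru_needed(10, RU_PER_FEED_QUERY))  # 3300
```

In other words, at 330 RU per query the 400 RU/s budget sustains barely more than one feed load per second.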

How can I get the feed faster and cheaper with Cosmos DB?

Note: To get the RU charge, I'm using Microsoft.Azure.Graphs. I can also use Gremlin.Net and get similar results.

François asked Mar 02 '18 09:03


1 Answer

I know this question is old, but here is my tip for using Cosmos DB efficiently and reducing RU consumption as much as possible.

330 RU is a lot for such a query. The likely cause of the high RU consumption is partitioning: when you define a partition key, you tell Cosmos DB to group the data logically by that key. In your case, the best partition key is the user.

In general, to find the best partition key, start from your queries. Write down all your queries and check which attribute or field you filter on most often to get your data back. That attribute is the one to choose as the partition key.
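As a minimal sketch of that exercise, the snippet below counts how often each attribute appears as a filter across a list of queries. The query inventory and the attribute names (userId, postedAt, tag) are hypothetical and only illustrate the counting; your own list will differ.

```python
from collections import Counter

# Hypothetical query inventory: (description, attributes the query filters on).
query_filters = [
    ("feed for a user", ["userId"]),
    ("photos taken by a user", ["userId"]),
    ("users followed by a user", ["userId"]),
    ("photos posted after a date", ["postedAt"]),
    ("photos a user is tagged in", ["userId", "tag"]),
]


def rank_partition_key_candidates(queries):
    """Rank filter attributes by how many queries use them."""
    counts = Counter(attr for _, attrs in queries for attr in attrs)
    return counts.most_common()


ranking = rank_partition_key_candidates(query_filters)
best_candidate = ranking[0][0]
print(ranking)
print(best_candidate)
```

With this made-up inventory, userId comes out on top, which matches the suggestion to partition by user.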

If a query does not target a partition key, Cosmos DB has to search for the user across partitions. Once the data is scaled out over many partitions (servers), that search becomes expensive. For example, with 6 servers, Cosmos DB may have to run the query on all 6 to find your user: it might be on the first or second server, but it might also be on the last one, so the query takes longer and its cost is not predictable.
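The difference can be illustrated with a toy model. This is a sketch under loud assumptions: the hash routing below is NOT Cosmos DB's real hashing scheme, and partition_for and partitions_scanned are hypothetical helpers. It only shows that a query carrying the partition key touches one partition, while a query without it touches all of them.

```python
import hashlib

NUM_PARTITIONS = 6  # mirrors the "6 servers" example above


def partition_for(partition_key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Toy hash routing; not Cosmos DB's actual algorithm."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def partitions_scanned(partition_key=None, num_partitions: int = NUM_PARTITIONS):
    """Return the list of partitions a query has to visit."""
    if partition_key is None:
        # No partition key in the query: fan out to every partition.
        return list(range(num_partitions))
    # Partition key known: route to exactly one partition.
    return [partition_for(partition_key, num_partitions)]


user_id = "c39f435b-350e-4d08-a7b6-dfcadbe4e9c5"
targeted = partitions_scanned(user_id)
fan_out = partitions_scanned()
print(len(targeted), len(fan_out))  # 1 6
```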

The second thing is containers. A container is the unit of scaling in Cosmos DB: when Cosmos DB scales, it scales the container and all the data in it. A good practice is therefore to put entities that are queried a lot in their own container, so that Cosmos DB can scale them easily using the partition key assigned to each container.

Hopefully this helps you reduce the RU/s from a different angle, and helps anyone else facing the same problem.

ahmed nader answered Oct 05 '22 23:10