The background:
We have an application where the main entity is a customer. All information in this applications starts from the customer. We thought it would be really nice if we can use this for some kind of partitioning. We designed the service with Azure SQL Database as a backend.
Our tables look like this (only the relevant part is left for brevity):
TABLE dbo.Orders
(
CustomerId INT NOT NULL DEFAULT( FEDERATION_FILTERING_VALUE( 'FEDERATION_BY_CUSTOMER' ) ),
OrderId INT NOT NULL,
....,
CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED ( CustomerId, OrderId )
) FEDERATED ON ( FEDERATION_BY_CUSTOMER = CustomerId );
Now this allowed us to do some crazy things. Our entry points to all SQL related stuff always contains the following command first:
USE FEDERATION GroupFederation( FEDERATION_BY_CUSTOMER = 1 ) WITH RESET, FILTERING = ON
In this case this statement:
SELECT * FROM Orders
or
INSERT INTO Orders ( OrderId ) VALUES ( 10 );
will work without a problem, working only on the data of the given customer. The CustomerId COLUMN will always be inferred from the system function FEDERATION_FILTERING_VALUE;
Now we could have all our customer in a single database without a problem, and they would be isolated from each other. If sometime in the future, one of them got too big, we could SPLIT the federation at that particular customer ID and we don't have to change anything in our code to support it.
Heck, we could have each customer in separated federation database and the service using it wouldn't know a single thing about it.
We were very happy with our solution and I thought I was very clever coming up with it. Not until recently when Microsoft announced that they are deprecating the azure federations feature with the new azure database editions that are coming up. Read more about it here and here.
I hope you see my problem. What do you think my alternatives are? Do you use Azure Federations and how are you going to transition?
Thank you.
SQL Azure federation provides tools that allow developers to scale out (by sharding) in SQL Azure. Here are some of the benefits of a sharded database: Taking advantage of greater resources within the cloud on demand. Allowing customers to have their own database, to share databases or to access many databases.
Sign in to the Azure portal. Go to your SQL Server virtual machine resource. Under Settings, select Configure. Then select your desired edition of SQL Server from the drop-down list under Edition.
OPENROWSET with non-Azure blob storage data source isn't supported in Azure SQL Database.
In general, we can't stop Azure SQL Databases.
We have seen that custom sharding solutions typically lead to better results regarding scalability, flexibility, and performance as compared to Federations. You can find more information about Federations and custom sharding here: http://msdn.microsoft.com/en-us/library/dn495641.aspx. This is part of the reason for the announcement to retire Federations together with the Web and Business Editions in Windows Azure SQL Database.
I would encourage you to look into self-sharding as an alternative. There was good guidance published last year by the CAT team around self-sharding patterns at http://social.technet.microsoft.com/wiki/contents/articles/17987.cloud-service-fundamentals.aspx , and more material like this will be coming shortly.
Feel free to reach out to me to discuss your alternatives for migrating your existing Federations application. I am part of the Azure DB product team and you can reach me at torsteng(at)microsoft.com
Thanks,
Torsten
Your only option is to rewrite your software. Microsoft is lying when they are saying that a custom sharding solution is better. Actually they told for years, why you SHOULD prefer federations to custom sharding and they provided very solid points for this: connection pooling, no downtime when scaling, no custom code to manage your shards etc. However, now Microsoft want to get much more money from Azure, so they decided to provide only very expensive plans to customers and there is no place for federations anymore, because they are very cheap and very scalable.
If you don't have a super high transaction rate you could try Amazon RDS. There is no sharding, but they provide you up to 30000 transactions per second, which is a lot even for very large websites. We rewrote our software for PostgreSQL RDS and we are very happy with it. It has much more features than SQL Server like native JSON support, arrays, multiple CASCADE paths etc.
If 30000 transactions per second is not enough for you, just increase the number of servers and split your data based on username or any other property, eg:
Username starts with letter
a-d | connect to Server 1
e-g | connect to Server 2
h-s | connect to Server 3
t-z | connect to Server 4
In the example above, you will be able to execute 120000 transactions per second and you shouldn't have problems with connection pool because there are only 4 servers.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With