The Problem
My application works as follows:
- Multiple (< 20) device clients (Android) are running at a single location.
- Thousands of locations exist (therefore tens or hundreds of thousands of device clients exist).
- A web portal client also exists that works in sync with each location's data and its device clients.
- New data generated on a device is posted to the server (cloud) via a REST API (ASP.net WebAPI).
So far this application is a pretty standard application with a mobile device client and web portal client.
However, due to requirements on each device client that are out of my control (device clients need to function in offline mode, reduce network latency, etc.), each device client does not use the server database as its immediate source of record. Instead, each device client has its own local database (SQLite) that stays in sync with all data for its location. For example: when I make a data change on device client A, that change needs to be propagated to device client B and to web portal client C.
- The web portal client reads directly from the server database since it does not need offline functionality.
As you can see, the problem here is that we now need a way to keep all device client databases in sync with each other in real time. Brief delays before two device clients are in sync are expected and considered acceptable.
Proposed Solution
My proposed solution is as follows:
- When a client device first comes online, it receives from the server, via the REST API, a data dump of everything it has missed since the last time it was online.
- Each new data item posted/updated/deleted from client devices via REST API is propagated through to the server database. The server database houses all data for all locations and should be considered as the permanent source of record.
- The web portal works directly with the server database since it has no offline requirements.
- A connection from each client device is established to a data sync stream service via SignalR.
- A worker service is "tailing" the server database for new Create/Update/Delete operations. When a CUD operation is detected, a message is dispatched to an Azure Service Bus queue/subscription (via fan-out topic) for each data sync service instance. This allows for horizontal scaling of the SignalR data sync service (with an Azure Service Bus backplane) since thousands of device client connections will exist.
- The data sync service reads from its message queue/subscription and pushes a sync message (containing all needed data for the sync) to all connected client devices (for the location related to the data) via SignalR.
The following diagram illustrates this solution:
- Large blocks depict servers (gray squares are HTTP web servers that can be horizontally scaled)
- Arrows depict the direction of data flowing through the application.
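The proposed flow (a delta dump when a device reconnects, then live pushes to every connected device at the same location) can be sketched in a few lines. This is only an in-memory illustration, not SignalR or Service Bus code: `SyncServer`, `Device`, `Change`, and the per-location sequence number `seq` are all hypothetical names introduced here, and a dict stands in for the SQLite database.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Change:
    seq: int        # per-location, monotonically increasing (assumption)
    op: str         # "create", "update" or "delete"
    item_id: str
    payload: dict


@dataclass
class Device:
    name: str
    location: str
    last_seq: int = 0                              # highest change applied locally
    local_db: dict = field(default_factory=dict)   # stand-in for SQLite

    def apply(self, change: Change) -> None:
        if change.seq <= self.last_seq:
            return  # already applied, so redelivery is harmless
        if change.op == "delete":
            self.local_db.pop(change.item_id, None)
        else:
            self.local_db[change.item_id] = change.payload
        self.last_seq = change.seq


class SyncServer:
    def __init__(self) -> None:
        self.log = defaultdict(list)      # location -> ordered list of Change
        self.online = defaultdict(list)   # location -> currently connected devices

    def write(self, location: str, op: str, item_id: str, payload=None) -> Change:
        """Record a CUD operation, then push it to connected devices at that location."""
        change = Change(len(self.log[location]) + 1, op, item_id, payload or {})
        self.log[location].append(change)
        for device in self.online[location]:
            device.apply(change)
        return change

    def connect(self, device: Device) -> None:
        """Send the delta the device missed, then register it for live pushes."""
        for change in self.log[device.location][device.last_seq:]:
            device.apply(change)
        self.online[device.location].append(device)

    def disconnect(self, device: Device) -> None:
        self.online[device.location].remove(device)


server = SyncServer()
a = Device("A", "store-1")
b = Device("B", "store-1")
server.connect(a)
server.write("store-1", "create", "order-1", {"total": 10})  # only A is online
server.connect(b)                                            # B catches up via the delta
server.write("store-1", "update", "order-1", {"total": 12})  # live push to A and B
```

Nothing is queued for a device while it is offline. The server only keeps the ordered log, and each device remembers the last sequence number it applied, which is what avoids the wasted per-device messages.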
Questions
- Is SignalR the right technology for this problem/solution? Originally my solution involved each client device establishing its own Azure Service Bus queue/subscription that collected messages from the database-tailing worker (sync river). The problem with that solution is that I would be pushing lots of wasted messages to offline device clients that may not come back online for a very long time, if ever. By sending the delta data when a device client first comes online and streaming data via SignalR thereafter, I can solve this.
- I have not used SignalR extensively in a production environment before, so I am fairly new to it. What problems/challenges can I expect to experience with it for this solution?
- The following article states that "There are some scenarios where a backplane can become a bottleneck. Here are some typical SignalR scenarios: High-frequency realtime (e.g., real-time games): A backplane is not recommended for this scenario.". Would this solution fall into this category? What problems could the backplane of Azure Service Bus messaging introduce? How else would I scale this solution if not in this way?
Your general opinions and recommendations for this solution are also welcome and appreciated.
You have a requirement for real-time communication with devices while they are online. One of the most promising ways to do this is with WebSockets.
- Using raw WebSockets directly is not practical, so there are popular libraries built on top of them, such as SignalR and socket.io. These libraries absorb many of the difficulties faced in development and in production, and they also support scaling.
- Since your stack is .NET-based, SignalR is the natural choice here.
- SignalR will work well in most cases. Here you don't have to worry about the backplane becoming a bottleneck the way it can in real-time games.
But maintaining a self-hosted real-time solution such as SignalR comes at a cost. Message delivery is not highly reliable in stock SignalR, so you will have to implement your own monitoring and failover mechanisms. Geo-distribution is also not supported. The alternative for a highly reliable real-time system that addresses all of these issues is a hosted service such as PubNub.
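If you stay self-hosted, one way to cope with dropped pushes is to number the sync messages and let the client detect gaps, falling back to the delta call you already planned for reconnects. The following is a minimal sketch under that assumption, not SignalR code: `GapDetectingClient`, the `(seq, item_id, payload)` message shape, and `fetch_since` (a stand-in for the REST delta endpoint) are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# (seq, item_id, payload); a payload of None means "delete" (assumed message shape)
Message = Tuple[int, str, object]


class GapDetectingClient:
    def __init__(self, fetch_since: Callable[[int], List[Message]]) -> None:
        self.fetch_since = fetch_since   # hypothetical stand-in for the REST delta call
        self.last_seq = 0                # highest sequence number applied so far
        self.data: Dict[str, object] = {}
        self.resyncs = 0                 # how many times a gap forced a catch-up

    def _apply(self, msg: Message) -> None:
        seq, item_id, payload = msg
        if seq <= self.last_seq:
            return  # duplicate or stale message, ignore it
        if payload is None:
            self.data.pop(item_id, None)
        else:
            self.data[item_id] = payload
        self.last_seq = seq

    def on_push(self, msg: Message) -> None:
        """Handle one pushed message, pulling missed ones first if there is a gap."""
        if msg[0] > self.last_seq + 1:
            self.resyncs += 1
            for missed in self.fetch_since(self.last_seq):
                self._apply(missed)
        self._apply(msg)  # no-op if the catch-up already covered it


server_log = [(1, "a", "v1"), (2, "b", "v1"), (3, "a", "v2")]
client = GapDetectingClient(lambda since: [m for m in server_log if m[0] > since])
client.on_push(server_log[0])
client.on_push(server_log[2])  # message 2 was dropped in transit
```

With this in place a lost push only delays a device until the next message arrives. A periodic delta poll would also cover the case where no further message comes for a long time.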