
Most efficient way to communicate between multiple .NET apps

Currently I have a setup where my clients (web apps, iOS app, etc.) talk to my backend .NET web API (Nancy) via REST calls. Nothing special.

I now have a requirement to split this API up into microservices, where each service can be individually upgraded/deployed.

My main API (public) will just perform authentication, then call into one of my microservices, which will be in my private network.

What are the different ways I could communicate between my main API and the other microservice APIs? What are the pros/cons of each approach?

The communication needs to be real-time: e.g. a request comes in from a browser/device, the main API performs auth, calls into a microservice API, and then returns the response. So I can't use things like queues or pub/sub. It doesn't necessarily need to use HTTP, but it needs to be real-time request/response communication. I also have other services (WebJobs, cloud services, etc.) that need to talk to these microservices (they are also in the private network).

The only approach that comes to mind is simple REST-based calls. That's totally fine, but latency is the main concern here.

Can anyone recommend any other solutions to this problem? Is there anything in Azure suited to this?

Many thanks

asked Feb 24 '16 by RPM1984




1 Answer

First - clarify your distinctions between "real-time", "synchronous/asynchronous", and "one-way/two-way". The things you rule out (queues and pub/sub) can certainly be used for two-way request/response, but they are asynchronous.

Second - clarify "efficiency" - efficiency on what metric? Bandwidth? Latency? Development time? Client support?

Third - realize that (one of) the costs of microservices is latency. If that's an issue for you at your first integration, you're likely in for a long road.

What are the different ways I could communicate between my main API and the other microservice APIs? Pros/cons of each approach?

Off the top of my head:

  • Host on single node(s) and use IPC: pros: performance; cons: tight coupling to deployment topology; reachability from other nodes; client support; failure modes
  • REST/SOAP/etc. endpoints: pros: wide client and server support; debugging; web caching; cons: performance; failure modes
  • Binary protocol: pros: performance; cons: versioning; client and server support; debugging; failure modes
  • Message queues: pros: asynchronicity; cons: complexity
  • ESB: pros: asynchronicity; client support; cons: complexity
  • Flat files: pros: bandwidth; simplicity; client support; cons: latency; general PITA

You'll note that this is the same list we'd have for tying any multiple applications together... because that's what you're doing. Just because you made the applications smaller doesn't change much, except to make your system even more distributed. Expect to solve all the same problems "normal" distributed systems have, and then a few extra ones related to deployment and versioning.

Consider an idempotent GET request from a user like "Get me question 1". That client expects a JSON response of question 1. Simple. In my expected architecture, the client would hit api.myapp.com, which would then proxy a call via REST to question-api.myapp.com (microservice) to get the data, then return to user. How could we use pub/sub here? Who is the publisher, who is the subscriber? There's no event here to raise. My understanding of queues: one publisher, one consumer. Pub/sub topic: one publisher, many consumers. Who is who here?
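For concreteness, that hop is a single REST proxy call. A minimal sketch, using the hypothetical hostnames above (auth and error handling elided):

```csharp
// Minimal sketch of the proxy hop: api.myapp.com authenticates, then
// forwards the request to the internal question microservice over REST.
// Hostnames are the hypothetical ones from the comment above.
using System;
using System.Net.Http;
using System.Threading.Tasks;

public class QuestionProxy
{
    private static readonly HttpClient Internal = new HttpClient
    {
        BaseAddress = new Uri("http://question-api.myapp.com")
    };

    // Called by the public API after authentication succeeds.
    public Task<string> GetQuestionJsonAsync(int questionId) =>
        Internal.GetStringAsync($"/api/questions/{questionId}");
}
```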

Ok - so first, if we're talking about microservices and latency - we're going to need a more representative example. Let's say our client is the Netflix mobile app, and to display the opening screen it needs the following information:

  1. List of trending movie ids
  2. List of recently watched movie ids
  3. Account status
  4. For each movie id referenced: the name, # of stars, summary text
  5. List of movie ids not available in your region (to be filtered out of trending/recently watched)

Each one of those is provided by a different microservice (we'll call them M1-M5). Each call from client -> datacenter has 100ms expected latency; calls between services have 20ms latency.

Let's compare some approaches:

1: Monolithic service

  1. T0 + 100ms: Client sends request for /API/StartScreen; receives response

As expected, that's the lowest latency option - but requires everything in a monolithic service, which we've decided we don't want because of operational concerns.

2: Microservices

  1. T0 + 100ms: Client sends request to M1 /API/Trending
  2. T1 + 100ms: Client sends request to M2 /API/Recent
  3. T2 + 100ms: Client sends request to M3 /API/Account
  4. T3 + 100ms: Client sends request to M4 /API/Summary?movieids=[]
  5. T4 + 100ms: Client sends request to M5 /API/Unavailable

That's 500ms. Using a proxy with this isn't going to help - it'll just add 20ms of latency to each request (making it 600ms). We have dependencies - request 4 depends on the responses to 1 and 2, and request 5 depends on 3 - but we can do some of this asynchronously. Let's see how that helps.
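Spelled out in client code, #2 is just five sequential awaits. A sketch, with hypothetical typed clients standing in for M1-M5 (all of these names are invented for illustration):

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical typed clients for M1-M5 (names invented for illustration).
public interface ITrendingClient { Task<int[]> GetTrendingIdsAsync(); }
public interface IRecentClient   { Task<int[]> GetRecentIdsAsync(); }
public interface IAccountClient  { Task<string> GetAccountStatusAsync(); }
public interface ISummaryClient  { Task<Summary[]> GetSummariesAsync(IEnumerable<int> ids); }
public interface IRegionClient   { Task<int[]> GetUnavailableIdsAsync(string account); }
public record Summary(int MovieId, string Name, int Stars, string Text);

public static class StartScreenV2
{
    // Approach #2: each await completes before the next request starts,
    // so five ~100ms round trips cost ~500ms in total.
    public static async Task LoadAsync(
        ITrendingClient m1, IRecentClient m2, IAccountClient m3,
        ISummaryClient m4, IRegionClient m5)
    {
        var trending    = await m1.GetTrendingIdsAsync();                      // 100ms
        var recent      = await m2.GetRecentIdsAsync();                        // 100ms
        var account     = await m3.GetAccountStatusAsync();                    // 100ms
        var summaries   = await m4.GetSummariesAsync(trending.Concat(recent)); // 100ms
        var unavailable = await m5.GetUnavailableIdsAsync(account);            // 100ms
    }
}
```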

3: Microservices - Async

  1. T0: Client sends request to M1 /API/Trending
  2. T0: Client sends request to M2 /API/Recent
  3. T0: Client sends request to M3 /API/Account
  4. T0 + 200ms: (responses from 1 + 2) Client sends request to M4 /API/Summary?movieids=[]
  5. T0 + 200ms: (response from 3) Client sends request to M5 /API/Unavailable
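In client code, the same calls become a dependency-aware fan-out; a sketch reusing the hypothetical clients from the #2 sketch:

```csharp
using System.Linq;
using System.Threading.Tasks;

public static class StartScreenV3
{
    // Approach #3: fire M1-M3 at T0; M4 starts as soon as M1 + M2 have
    // answered, M5 as soon as M3 has. Total ~200ms instead of ~500ms.
    public static async Task LoadAsync(
        ITrendingClient m1, IRecentClient m2, IAccountClient m3,
        ISummaryClient m4, IRegionClient m5)
    {
        var trendingTask = m1.GetTrendingIdsAsync();   // all three in flight at T0
        var recentTask   = m2.GetRecentIdsAsync();
        var accountTask  = m3.GetAccountStatusAsync();

        async Task<Summary[]> SummariesAsync() =>      // depends on M1 + M2
            await m4.GetSummariesAsync((await trendingTask).Concat(await recentTask));

        async Task<int[]> UnavailableAsync() =>        // depends on M3
            await m5.GetUnavailableIdsAsync(await accountTask);

        await Task.WhenAll(SummariesAsync(), UnavailableAsync());
    }
}
```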

We're down to 200ms; not bad - but our client needs to know about our microservice architecture. If we abstract that with our proxy, then we have:

4: Microservices - Async with Gateway

  1. T0 + 100ms: Client sends request to G1 /API/StartScreen
  2. T1: G1 sends request to M1 /API/Trending
  3. T1: G1 sends request to M2 /API/Recent
  4. T1: G1 sends request to M3 /API/Account
  5. T1 + 40ms: (responses from 1 + 2) G1 sends request to M4 /API/Summary?movieids=[]
  6. T1 + 40ms: (response from 3) G1 sends request to M5 /API/Unavailable

Down to 140ms, since we're leveraging the decreased intra-service latency.
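Internally, G1 is just the fan-out from #3 running over the cheap intra-service links. A sketch as an ASP.NET Core minimal-API endpoint, still using the hypothetical clients from above (assumed to be registered in DI):

```csharp
// Approach #4: one coarse gateway endpoint; the dependency-aware fan-out
// from #3 now runs server-side over ~20ms links. Program.cs sketch for
// the ASP.NET Core web SDK (implicit usings assumed).
using System.Linq;

var builder = WebApplication.CreateBuilder(args);
// builder.Services.AddSingleton<ITrendingClient, ...>(); etc. (elided)
var app = builder.Build();

app.MapGet("/API/StartScreen", async (
    ITrendingClient m1, IRecentClient m2, IAccountClient m3,
    ISummaryClient m4, IRegionClient m5) =>
{
    var trendingTask = m1.GetTrendingIdsAsync();
    var recentTask   = m2.GetRecentIdsAsync();
    var accountTask  = m3.GetAccountStatusAsync();

    async Task<Summary[]> SummariesAsync() =>
        await m4.GetSummariesAsync((await trendingTask).Concat(await recentTask));
    async Task<int[]> UnavailableAsync() =>
        await m5.GetUnavailableIdsAsync(await accountTask);

    var summariesTask   = SummariesAsync();     // dependent calls run
    var unavailableTask = UnavailableAsync();   // concurrently

    return Results.Ok(new
    {
        Account     = await accountTask,
        Trending    = await trendingTask,
        Recent      = await recentTask,
        Summaries   = await summariesTask,
        Unavailable = await unavailableTask
    });
});

app.Run();
```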

Great - when things are working smoothly, we've only increased latency by 40% compared to monolithic (#1).

But, as with any distributed system, we also have to worry about when things aren't going smoothly.

What happens when M4's latency increases to 200ms? In the client -> async microservice route (#3), we have partial page results in 100ms (the first batch of requests), unavailable in 200ms, and summaries in 400ms. In the proxy case (#4), we have nothing until 340ms. Similar considerations apply if a microservice is completely unavailable.

Queues are a way of abstracting producer/consumer in space and time. Let's see what happens if we introduce one:

5: Microservices with async pub/sub

  1. T0 + 100ms: Client publishes request Q0 to P1 StartScreen, with a reply channel of P2
  2. T1 + 20ms: M1 sees Q0, puts response R1 to P2 Trending/Q0
  3. T1 + 20ms: M2 sees Q0, puts response R2 to P2 Recent/Q0
  4. T1 + 20ms: M3 sees Q0, puts response R3 to P2 Account/Q0
  5. T1 + 40ms: M4 sees R1, puts response R4a to P2 Summary/Q0
  6. T1 + 40ms: M4 sees R2, puts response R4b to P2 Summary/Q0
  7. T1 + 40ms: M5 sees R3, puts response R5 to P2 Unavailable/Q0

Our client, which is subscribed to P2, receives partial results with a single request and is abstracted away from the workflow between M1 + M2 and M4, and between M3 and M5. Our best-case latency is 140ms, the same as #4, and the worst case is similar to the direct client route (#3), with partial results.

We have a much more complicated internal routing system involved, but we've gained flexibility with microservices while minimizing the inevitable latency. Our client code is also more complex - since it has to deal with partial results - but is similar to the async microservice route. Our microservices are generally independent of each other - they can be scaled independently, and there is no central coordinating authority (as in the proxy case). We can add new services as needed by simply subscribing to the appropriate channels and having the client know what to do with the response we generate (if we generate one for client consumption, of course).
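Here's a sketch of the client side of #5 against a deliberately abstract bus. The IMessageBus interface is hypothetical; something like an Azure Service Bus topic/subscription would play this role in practice (which also speaks to the "anything in Azure?" question):

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical broker abstraction - an Azure Service Bus topic/subscription
// or similar would back this in practice.
public interface IMessageBus
{
    Task PublishAsync(string channel, object body, string correlationId);
    IDisposable Subscribe(string channel, Func<string, object, Task> handler);
}

public class StartScreenRequester
{
    private readonly IMessageBus _bus;
    public StartScreenRequester(IMessageBus bus) => _bus = bus;

    // Publishes one request (Q0) to P1 and surfaces partial results
    // (R1..R5) to the caller as they arrive on the reply channel P2.
    public async Task<IDisposable> RequestStartScreenAsync(Action<string, object> onPartial)
    {
        var q0 = Guid.NewGuid().ToString("N");           // correlation id

        // Subscribe to replies before publishing so nothing is missed.
        var subscription = _bus.Subscribe($"P2/{q0}", (source, body) =>
        {
            onPartial(source, body);                     // "Trending", "Summary", ...
            return Task.CompletedTask;
        });

        await _bus.PublishAsync("P1/StartScreen",
            new { ReplyChannel = $"P2/{q0}" }, q0);

        return subscription;                             // caller disposes when done
    }
}
```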

6: Microservices - Gateway with Queues

You could do a variation that uses a gateway to aggregate responses while still using queues internally. It would look a lot like #4 externally, but #5 internally. The addition of a queue (and yes, I've been using queue, pub/sub, topic, etc. interchangeably) still decouples the gateway from the individual microservices, but hides the partial result problem from the client (along with its benefits, though).

The addition of a gateway, though, does allow you to handle the partial result problem centrally - useful if it's complex, ever-changing, and/or reimplemented across multiple client platforms.

For instance, let's say that when M4 (the summary service) is unavailable, we have an M4b that operates on cached data (so, e.g., the star rating is out of date). M4b can answer R4a and R4b immediately, and our gateway can then determine, based on a timeout, whether it should wait for M4 to answer or just go with M4b.
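That decision is straightforward to sketch as a timeout race in the gateway; M4b's cache-backed client is hypothetical, like the rest:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical cache-backed twin of the summary service.
public interface ICachedSummaryClient { Task<Summary[]> GetSummariesAsync(IEnumerable<int> ids); }

public static class SummaryFallback
{
    // Give M4 a latency budget; if it misses it (or faults), serve M4b's
    // cached - possibly stale - summaries instead.
    public static async Task<Summary[]> GetWithFallbackAsync(
        ISummaryClient m4, ICachedSummaryClient m4b,
        IEnumerable<int> movieIds, TimeSpan budget)
    {
        var live   = m4.GetSummariesAsync(movieIds);    // authoritative, maybe slow
        var cached = m4b.GetSummariesAsync(movieIds);   // fast, stars may be stale

        var first = await Task.WhenAny(live, Task.Delay(budget));
        if (first == live && live.IsCompletedSuccessfully)
            return await live;

        return await cached;                            // timed out or faulted
    }
}
```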

For further info on how Netflix actually solved this problem, take a look at the following resources:

  1. Their API Gateway
  2. The messaging patterns underlying their fault-tolerance layer
  3. More about IPC in microservices
answered by Mark Brackett