
Most efficient way to communicate between multiple .NET apps

Currently I have a setup where my clients (web apps, iOS app, etc.) talk to my backend .NET web API (Nancy) via REST calls. Nothing special.

I now have a requirement to split this API up into microservices, where each service can be individually upgraded/deployed.

My main API (public) will just perform authentication, then call into one of my microservices, which will be in my private network.

What are the different ways I could communicate between my main API and the other microservice APIs? What are the pros/cons of each approach?

The communication needs to be real-time: e.g. a request comes in from a browser/device, the main API performs auth, calls into a microservice API, and then returns the response. So I can't use things like queues or pub/sub. It doesn't necessarily need to use HTTP, but it needs to be real-time request/response communication. I also have other services (WebJobs, cloud services, etc.) that need to talk to these microservices (they are also in the private network).

The only approach that comes to mind is simple REST-based calls. That's totally fine, but latency is the main concern here.

Can anyone recommend any other solutions to this problem? Is there anything in Azure suited to this?

Many thanks

asked Feb 24 '16 by RPM1984




1 Answer

First - clarify your distinctions between "real-time", "synchronous/asynchronous", and "one-way/two-way". The things you rule out (queues and pub/sub) can certainly be used for two-way request/response, but they are asynchronous.

Second - clarify "efficiency" - efficiency on what metric? Bandwidth? Latency? Development time? Client support?

Third - realize that (one of) the costs of microservices is latency. If that's an issue for you at your first integration, you're likely in for a long road.

What are the different ways I could communicate between my main API and the other microservice APIs? Pros/cons of each approach?

Off the top of my head:

  • Host on single node(s) and use IPC: pros: performance; cons: tight coupling to deployment topology; reachability from other nodes; client support; failure modes
  • REST/SOAP/etc. endpoints: pros: wide client and server support; debugging; web caching; cons: performance; failure modes
  • Binary protocol: pros: performance; cons: versioning; client and server support; debugging; failure modes
  • Message queues: pros: asynchronicity; cons: complexity
  • ESB: pros: asynchronicity; client support; cons: complexity
  • Flat files: pros: bandwidth; simplicity; client support; cons: latency; general PITA

You'll note that this is the same list we'd have for tying any multiple applications together... because that's what you're doing. Just because you made the applications smaller doesn't change much, except to make your system even more distributed. Expect to solve all the same problems "normal" distributed systems have, and then a few extra ones related to deployment and versioning.

Consider an idempotent GET request from a user like "Get me question 1". That client expects a JSON response of question 1. Simple. In my expected architecture, the client would hit api.myapp.com, which would then proxy a call via REST to question-api.myapp.com (microservice) to get the data, then return to user. How could we use pub/sub here? Who is the publisher, who is the subscriber? There's no event here to raise. My understanding of queues: one publisher, one consumer. Pub/sub topic: one publisher, many consumers. Who is who here?
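For concreteness, that hop is a single REST proxy call. A minimal sketch, using the hypothetical hostnames above (auth and error handling elided):

```csharp
// Minimal sketch of the proxy hop: api.myapp.com authenticates, then
// forwards the request to the internal question microservice over REST.
// Hostnames are the hypothetical ones from the comment above.
using System;
using System.Net.Http;
using System.Threading.Tasks;

public class QuestionProxy
{
    private static readonly HttpClient Internal = new HttpClient
    {
        BaseAddress = new Uri("http://question-api.myapp.com")
    };

    // Called by the public API after authentication succeeds.
    public Task<string> GetQuestionJsonAsync(int questionId) =>
        Internal.GetStringAsync($"/api/questions/{questionId}");
}
```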

Ok - so first, if we're talking about microservices and latency - we're going to need a more representative example. Let's say our client is the Netflix mobile app, and to display the opening screen it needs the following information:

  1. List of trending movie ids
  2. List of recently watched movie ids
  3. Account status
  4. For each movie id referenced: the name, # of stars, summary text
  5. List of movie ids not available in your region (to be filtered out of trending/recently watched)

Each one of those is provided by a different microservice (we'll call them M1-M5). Each call from client -> datacenter has 100ms expected latency; calls between services have 20ms latency.

Let's compare some approaches:

1: Monolithic service

  1. T0 + 100ms: Client sends request for /API/StartScreen; receives response

As expected, that's the lowest latency option - but requires everything in a monolithic service, which we've decided we don't want because of operational concerns.

2: Microservices

  1. T0 + 100ms: Client sends request to M1 /API/Trending
  2. T1 + 100ms: Client sends request to M2 /API/Recent
  3. T2 + 100ms: Client sends request to M3 /API/Account
  4. T3 + 100ms: Client sends request to M4 /API/Summary?movieids=[]
  5. T4 + 100ms: Client sends request to M5 /API/Unavailable

That's 500ms. Using a proxy with this isn't going to help - it'll just add 20ms of latency to each request (making it 600ms). We have dependencies - request 4 depends on the responses to 1 and 2, and request 5 depends on 3 - but we can do some of this asynchronously. Let's see how that helps.
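Spelled out in client code, #2 is just five sequential awaits. A sketch, with hypothetical typed clients standing in for M1-M5 (all of these names are invented for illustration):

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical typed clients for M1-M5 (names invented for illustration).
public interface ITrendingClient { Task<int[]> GetTrendingIdsAsync(); }
public interface IRecentClient   { Task<int[]> GetRecentIdsAsync(); }
public interface IAccountClient  { Task<string> GetAccountStatusAsync(); }
public interface ISummaryClient  { Task<Summary[]> GetSummariesAsync(IEnumerable<int> ids); }
public interface IRegionClient   { Task<int[]> GetUnavailableIdsAsync(string account); }
public record Summary(int MovieId, string Name, int Stars, string Text);

public static class StartScreenV2
{
    // Approach #2: each await completes before the next request starts,
    // so five ~100ms round trips cost ~500ms in total.
    public static async Task LoadAsync(
        ITrendingClient m1, IRecentClient m2, IAccountClient m3,
        ISummaryClient m4, IRegionClient m5)
    {
        var trending    = await m1.GetTrendingIdsAsync();                      // 100ms
        var recent      = await m2.GetRecentIdsAsync();                        // 100ms
        var account     = await m3.GetAccountStatusAsync();                    // 100ms
        var summaries   = await m4.GetSummariesAsync(trending.Concat(recent)); // 100ms
        var unavailable = await m5.GetUnavailableIdsAsync(account);            // 100ms
    }
}
```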

3: Microservices - Async

  1. T0: Client sends request to M1 /API/Trending
  2. T0: Client sends request to M2 /API/Recent
  3. T0: Client sends request to M3 /API/Account
  4. T0 + 200ms: (responses from 1 + 2) Client sends request to M4 /API/Summary?movieids=[]
  5. T0 + 200ms: (response from 3) Client sends request to M5 /API/Unavailable
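In client code, the same calls become a dependency-aware fan-out; a sketch reusing the hypothetical clients from the #2 sketch:

```csharp
using System.Linq;
using System.Threading.Tasks;

public static class StartScreenV3
{
    // Approach #3: fire M1-M3 at T0; M4 starts as soon as M1 + M2 have
    // answered, M5 as soon as M3 has. Total ~200ms instead of ~500ms.
    public static async Task LoadAsync(
        ITrendingClient m1, IRecentClient m2, IAccountClient m3,
        ISummaryClient m4, IRegionClient m5)
    {
        var trendingTask = m1.GetTrendingIdsAsync();   // all three in flight at T0
        var recentTask   = m2.GetRecentIdsAsync();
        var accountTask  = m3.GetAccountStatusAsync();

        async Task<Summary[]> SummariesAsync() =>      // depends on M1 + M2
            await m4.GetSummariesAsync((await trendingTask).Concat(await recentTask));

        async Task<int[]> UnavailableAsync() =>        // depends on M3
            await m5.GetUnavailableIdsAsync(await accountTask);

        await Task.WhenAll(SummariesAsync(), UnavailableAsync());
    }
}
```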

We're down to 200ms; not bad - but our client needs to know about our microservice architecture. If we abstract that with our proxy, then we have:

4: Microservices - Async with Gateway

  1. T0 + 100ms: Client sends request to G1 /API/StartScreen
  2. T1: G1 sends request to M1 /API/Trending
  3. T1: G1 sends request to M2 /API/Recent
  4. T1: G1 sends request to M3 /API/Account
  5. T1 + 40ms: (responses from 1 + 2) G1 sends request to M4 /API/Summary?movieids=[]
  6. T1 + 40ms: (response from 3) G1 sends request to M5 /API/Unavailable

Down to 140ms, since we're leveraging the decreased intra-service latency.
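Internally, G1 is just the fan-out from #3 running over the cheap intra-service links. A sketch as an ASP.NET Core minimal-API endpoint, still using the hypothetical clients from above (assumed to be registered in DI):

```csharp
// Approach #4: one coarse gateway endpoint; the dependency-aware fan-out
// from #3 now runs server-side over ~20ms links. Program.cs sketch for
// the ASP.NET Core web SDK (implicit usings assumed).
using System.Linq;

var builder = WebApplication.CreateBuilder(args);
// builder.Services.AddSingleton<ITrendingClient, ...>(); etc. (elided)
var app = builder.Build();

app.MapGet("/API/StartScreen", async (
    ITrendingClient m1, IRecentClient m2, IAccountClient m3,
    ISummaryClient m4, IRegionClient m5) =>
{
    var trendingTask = m1.GetTrendingIdsAsync();
    var recentTask   = m2.GetRecentIdsAsync();
    var accountTask  = m3.GetAccountStatusAsync();

    async Task<Summary[]> SummariesAsync() =>
        await m4.GetSummariesAsync((await trendingTask).Concat(await recentTask));
    async Task<int[]> UnavailableAsync() =>
        await m5.GetUnavailableIdsAsync(await accountTask);

    var summariesTask   = SummariesAsync();     // dependent calls run
    var unavailableTask = UnavailableAsync();   // concurrently

    return Results.Ok(new
    {
        Account     = await accountTask,
        Trending    = await trendingTask,
        Recent      = await recentTask,
        Summaries   = await summariesTask,
        Unavailable = await unavailableTask
    });
});

app.Run();
```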

Great - when things are working smoothly, we've only increased latency by 40% compared to monolithic (#1).

But, as with any distributed system, we also have to worry about when things aren't going smoothly.

What happens when M4's latency increases to 200ms? In the client -> async microservice route (#3), we have partial page results in 100ms (the first batch of requests), unavailable in 200ms, and summaries in 400ms. In the proxy case (#4), we have nothing until 340ms. Similar considerations apply if a microservice is completely unavailable.

Queues are a way of abstracting producer/consumer in space and time. Let's see what happens if we introduce one:

5: Microservices with async pub/sub

  1. T0 + 100ms: Client publishes request Q0 to P1 StartScreen, with a reply channel of P2
  2. T1 + 20ms: M1 sees Q0, puts response R1 to P2 Trending/Q0
  3. T1 + 20ms: M2 sees Q0, puts response R2 to P2 Recent/Q0
  4. T1 + 20ms: M3 sees Q0, puts response R3 to P2 Account/Q0
  5. T1 + 40ms: M4 sees R1, puts response R4a to P2 Summary/Q0
  6. T1 + 40ms: M4 sees R2, puts response R4b to P2 Summary/Q0
  7. T1 + 40ms: M5 sees R3, puts response R5 to P2 Unavailable/Q0

Our client, which is subscribed to P2, receives partial results with a single request and is abstracted away from the workflow between M1 + M2 and M4, and between M3 and M5. Our best-case latency is 140ms, the same as #4, and the worst case is similar to the direct client route (#3), with partial results.

We have a much more complicated internal routing system involved, but we've gained flexibility with microservices while minimizing the inevitable latency. Our client code is also more complex - since it has to deal with partial results - but is similar to the async microservice route. Our microservices are generally independent of each other - they can be scaled independently, and there is no central coordinating authority (as in the proxy case). We can add new services as needed by simply subscribing to the appropriate channels and having the client know what to do with the response we generate (if we generate one for client consumption, of course).
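Here's a sketch of the client side of #5 against a deliberately abstract bus. The IMessageBus interface is hypothetical; something like an Azure Service Bus topic/subscription would play this role in practice (which also speaks to the "anything in Azure?" question):

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical broker abstraction - an Azure Service Bus topic/subscription
// or similar would back this in practice.
public interface IMessageBus
{
    Task PublishAsync(string channel, object body, string correlationId);
    IDisposable Subscribe(string channel, Func<string, object, Task> handler);
}

public class StartScreenRequester
{
    private readonly IMessageBus _bus;
    public StartScreenRequester(IMessageBus bus) => _bus = bus;

    // Publishes one request (Q0) to P1 and surfaces partial results
    // (R1..R5) to the caller as they arrive on the reply channel P2.
    public async Task<IDisposable> RequestStartScreenAsync(Action<string, object> onPartial)
    {
        var q0 = Guid.NewGuid().ToString("N");           // correlation id

        // Subscribe to replies before publishing so nothing is missed.
        var subscription = _bus.Subscribe($"P2/{q0}", (source, body) =>
        {
            onPartial(source, body);                     // "Trending", "Summary", ...
            return Task.CompletedTask;
        });

        await _bus.PublishAsync("P1/StartScreen",
            new { ReplyChannel = $"P2/{q0}" }, q0);

        return subscription;                             // caller disposes when done
    }
}
```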

6: Microservices - Gateway with Queues

You could do a variation that uses a gateway to aggregate responses while still using queues internally. It would look a lot like #4 externally, but #5 internally. The addition of a queue (and yes, I've been using queue, pub/sub, topic, etc. interchangeably) still decouples the gateway from the individual microservices, but hides the partial result problem from the client (along with its benefits, though).

The addition of a gateway, though, does allow you to handle the partial result problem centrally - useful if it's complex, ever-changing, and/or reimplemented across multiple client platforms.

For instance, let's say that when M4 (the summary service) is unavailable, we have an M4b that operates on cached data (so, e.g., the star rating is out of date). M4b can answer R4a and R4b immediately, and our gateway can then determine, based on a timeout, whether it should wait for M4 to answer or just go with M4b.
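That decision is straightforward to sketch as a timeout race in the gateway; M4b's cache-backed client is hypothetical, like the rest:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical cache-backed twin of the summary service.
public interface ICachedSummaryClient { Task<Summary[]> GetSummariesAsync(IEnumerable<int> ids); }

public static class SummaryFallback
{
    // Give M4 a latency budget; if it misses it (or faults), serve M4b's
    // cached - possibly stale - summaries instead.
    public static async Task<Summary[]> GetWithFallbackAsync(
        ISummaryClient m4, ICachedSummaryClient m4b,
        IEnumerable<int> movieIds, TimeSpan budget)
    {
        var live   = m4.GetSummariesAsync(movieIds);    // authoritative, maybe slow
        var cached = m4b.GetSummariesAsync(movieIds);   // fast, stars may be stale

        var first = await Task.WhenAny(live, Task.Delay(budget));
        if (first == live && live.IsCompletedSuccessfully)
            return await live;

        return await cached;                            // timed out or faulted
    }
}
```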

For further info on how Netflix actually solved this problem, take a look at the following resources:

  1. Their API Gateway
  2. The messaging patterns underlying their fault-tolerance layer
  3. More about IPC in microservices
answered by Mark Brackett