Different use case for akka cluster aware router & akka cluster sharding?

Tags:

Cluster aware router:

val router = system.actorOf(ClusterRouterPool(
  RoundRobinPool(0),
  ClusterRouterPoolSettings(
    totalInstances = 20,
    maxInstancesPerNode = 1,
    allowLocalRoutees = false,
    useRole = None
  )
).props(Props[Worker]), name = "router")

Here, we can send message to router, the message will send to a series of remote routee actors.

Cluster sharding (Not consider persistence)

class NewShoppers extends Actor {
  ClusterSharding(context.system).start(
    "shardshoppers",
    Props(new Shopper),
    ClusterShardingSettings(context.system),
    Shopper.extractEntityId,
    Shopper.extractShardId
  )

  def proxy = {
    ClusterSharding(context.system).shardRegion("shardshoppers")
  }

  override def receive: Receive = {
    case msg => proxy forward msg
  }
}

Here, we can send message to proxy, the message will send to a series of sharded actors (a.k.a. entities).

So, my question is: it seems both 2 methods can make the tasks distribute to a lot of actors. What's the design choice of above two? Which situation need which choice?

853

asked Mar 06 '17 06:03

atline

Video Answer

2 Answers

The pool router would be when you just want to send some work to whatever node and have some processing happen, two messages sent in sequence will likely not end up in the same actor for processing.

Cluster sharding is for when you have a unique id on each actor of some kind, and you have too many of them to fit in one node, but you want every message with that id to always end up in the actor for that id. For example modelling a User as an entity, you want all commands about that user to end up with the user but you want the actor to be moved if the cluster topology changes (remove or add nodes) and you want them reasonably balanced across the existing nodes.

168

answered Oct 21 '22 23:10

johanandren

Credit to johanandren and the linked article as basis for the following answer:

Both a router and sharding distribute work. Sharding is required if, additionally to load balancing, the recipient actors have to reliably manage state that is directly associated with the entity identifier.

To recap, the entity identifier is a key, derived from the message being sent, determining the message's receipient actor in the cluster.

First of all, can you manage state associated with an identifier across different nodes using a consistently hashing router? A Consistent Hash router will always send messages with an equal identifier to the same target actor. The answer is: No, as explained below.

The hash-based method stops working when nodes in the cluster go Down or come Up, because this changes the associated actor for some identifiers. If a node goes down, messages that were associated with it are now sent to a different actor in the network, but that actor is not informed about the former state of the actor which it is now replacing. Likewise, if a new node comes up, it will take care of messages (identifiers) that were previously associated with a different actor, and neither the new node or the old node are informed about this.

With sharding, on the other hand, the actors that are created are aware of the entity identifier that they manage. Sharding will make sure that there is exactly one actor managing the entity in the cluster. And it will re-create sharded actors on a different node if their parent node goes down. So using persistence they will retain their (persisted) state across nodes when the number of nodes changes. You also don't have to worry about concurrency issues if an actor is re-created on a different node thanks to Sharding. Furthermore, if a message with a new entity identifier is encountered, for which an actor does not exist yet, a new actor is created.

A consistently hashing router may still be of use for caching, because messages with the same key generally do go to the same actor. To manage a stateful entity that exists only once in the cluster, Sharding is required.

Use routers for load balancing, use Sharding for managing stateful entities in a distributed manner.

answered Oct 21 '22 21:10

ig-dev

Related questions
                            
                                Scala Future - for comprehension, mix sync and async
                            
                                Consume TCP stream and redirect it to another Sink (with Akka Streams)
                            
                                Unable to create dataframe from RDD of Row using case class
                            
                                mapping over zipped HLists
                            
                                How does node know which nodes have seen the cluster current state?
                            
                                how to configure a Akka Pub/Sub to run on same machine?
                            
                                Spark 2.0 Dataset Encoder with trait
                            
                                How to add a condition count > 8 in Gatling script?
                            
                                Is this a bug of scala's specialized?
                            
                                cast schema of a data frame in Spark and Scala
                            
                                Scala & Spark: Dataframe.write._ on Windows
                            
                                ERROR ContextCleaner: Error in cleaning thread
                            
                                Adding Spark "Library" to a Scala project
                            
                                How to get values from query result in slick
                            
                                Kafka Streams 0.10.1 "Failed to flush state store"
                            
                                Akka flow for multiple http requests
                            
                                Monads as Monoids in practice
                            
                                Play framework 2.5 logs `?` question marks instead of line numbers
                            
                                How to use Futures with Kafka Streams
                            
                                How to use Apply for Function Application

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Different use case for akka cluster aware router & akka cluster sharding?

Tags:

scala

akka

akka-cluster

akka-persistence