
Akka cluster-sharding: Can Entry actors have dynamic props

Akka Cluster-Sharding looks like it matches well with a use case I have to create single instances of stateful persistent actors across Akka nodes.

I'm not clear if it is possible though to have an Entry actor type that requires arguments to construct it. Or maybe I need to reconsider how the Entry actor gets this information.

object Account {
  // Factory for the Props of an Account actor with the given constructor arguments.
  def apply(region: String, accountId: String): Props = Props(new Account(region, accountId))
}

class Account(val region: String, val accountId: String) extends Actor with PersistentActor { ... }

However, ClusterSharding.start takes a single Props instance that is used to create all Entry actors.

From the Akka cluster-sharding documentation:

val counterRegion: ActorRef = ClusterSharding(system).start(
  typeName = "Counter",
  entryProps = Some(Props[Counter]),
  idExtractor = idExtractor,
  shardResolver = shardResolver)

And then it resolves the Entry actor that receives the message based on how you define the idExtractor. From the source code for shard it can be seen it uses the id as the name for a given Entry actor instance:

def getEntry(id: EntryId): ActorRef = {
  val name = URLEncoder.encode(id, "utf-8")
  context.child(name).getOrElse {
    log.debug("Starting entry [{}] in shard [{}]", id, shardId)

    val a = context.watch(context.actorOf(entryProps, name))
    idByRef = idByRef.updated(a, id)
    refById = refById.updated(id, a)
    state = state.copy(state.entries + id)
    a
  }
}

It seems I should instead have my Entry actor work out its region and accountId from the name it is given, although this feels a bit hacky, since I would be parsing the values out of a string instead of receiving them directly. Is this my best option?
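A minimal sketch of what that parsing could look like, assuming the entry id is built as "region|accountId". The AccountId helper, the layout and the '|' separator are hypothetical choices for illustration, not part of Akka. Because the shard URL-encodes the id before using it as the actor name, the name is decoded before it is split.

```scala
import java.net.URLDecoder

// Hypothetical helper: the "region|accountId" layout and the '|' separator
// are assumptions made for this example, not something Akka defines.
object AccountId {
  private val Separator = '|'

  // Builds the entry id that an idExtractor could return for a message.
  def format(region: String, accountId: String): String =
    s"$region$Separator$accountId"

  // Recovers (region, accountId) from the actor name. The shard URL-encodes
  // the id to build the name, so it is decoded before being split.
  def parse(actorName: String): Option[(String, String)] =
    URLDecoder.decode(actorName, "utf-8").split(Separator) match {
      case Array(region, accountId) => Some((region, accountId))
      case _                        => None // unexpected layout
    }
}
```

Inside the Entry actor this would be called as AccountId.parse(self.path.name). Returning an Option keeps a malformed name from throwing during actor construction.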

Rich asked Oct 20 '14 21:10




1 Answer

I am in a very similar situation to yours. I don't have an exact answer, but I can share with you and other readers what I did, tried and thought.

Option 1) As you mentioned, you can extract the id, shard and region information from the names you assign by parsing the actor path.

Upside: a) it's fairly easy to do.

Downsides: a) The shard URL-encodes the entry id as UTF-8 to build the actor name, so if your separator is not a URL-safe character (such as ||) you need to URL-decode the name first. Note that "utf-8" is hard-coded inside Akka and is not exposed through any function, so if Akka changes it tomorrow you'll have to adapt your code too. b) The mapping between your values and the name is no longer structure-preserving (which is what you mean by "it feels a bit hacky"). You add the risk that your data may one day contain the separator string as meaningful data, and your system will then mis-parse it.
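One way to reduce the separator risk in downside b), sketched under the assumption that you control how the id is built: URL-encode each component on its own and join the results with a character that URLEncoder never emits, such as ':'. The SafeEntryId object and its method names are hypothetical.

```scala
import java.net.{URLDecoder, URLEncoder}

// Hypothetical helper, not an Akka API. URLEncoder output never contains ':',
// so after encoding each part the ':' can only be the separator.
object SafeEntryId {
  // Encodes every component separately, then joins them with ':'.
  def encode(parts: String*): String =
    parts.map(URLEncoder.encode(_, "utf-8")).mkString(":")

  // Splits on ':' (keeping empty components) and decodes each part.
  def decode(id: String): List[String] =
    id.split(":", -1).toList.map(URLDecoder.decode(_, "utf-8"))
}
```

The shard will still URL-encode the whole id once more when it names the actor, so an entry reading its own name has to URL-decode that name once before passing it to decode.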

Option 2) Sharding will spawn your actor if it doesn't exist. So you can make your code always send an init message, containing your constructor parameters, to actors that are not yet initialized. Your sharded actors will contain something like:

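// Must be a var: it is assigned when the Init message arrives.
var par1: Option[param1Type] = None

// Init and Query are the message types you define for this protocol.
def receive = {
    case Init(par1value) => par1 = Some(par1value) // store the constructor argument
    case Query => sender() ! par1                  // replies None until initialized
}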

And from your region access actor you can always send the query message first, and then the init message if the reply is None. This assumes that your region access actor does not maintain a list of the initialized actors; if it does, you can just spawn with init and then use them normally. The upsides are that a) it's elegant and b) it "feels" right.

Downside: a) it takes 2x messages (if you don't maintain a list of initialized actors)
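The query-then-init exchange can be sketched without a cluster by modelling the entry's receive logic as a pure function. All names here (Init, Query, Params, Reply, LazyEntry) are hypothetical, and ignoring a second Init is an assumption of this sketch rather than something the approach requires.

```scala
// Hypothetical protocol for an entry that gets its "constructor arguments"
// from a message instead of from Props.
sealed trait Command
final case class Init(region: String, accountId: String) extends Command
case object Query extends Command

final case class Params(region: String, accountId: String)
final case class Reply(params: Option[Params])

object LazyEntry {
  // Takes the current state and one message; returns the next state and the
  // reply to send back, if any.
  def handle(state: Option[Params], cmd: Command): (Option[Params], Option[Reply]) =
    cmd match {
      case Init(region, accountId) if state.isEmpty =>
        (Some(Params(region, accountId)), None) // first Init wins
      case Init(_, _) =>
        (state, None)                           // already initialized: ignore
      case Query =>
        (state, Some(Reply(state)))             // Reply(None) means "send Init"
    }
}
```

In a real actor the state would live in a var and the reply would go to sender(); the caller sends Query, and follows up with Init only when it gets Reply(None).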

Option 3) THIS OPTION HAS BEEN TESTED AND DOESN'T WORK. I'll leave it here so people don't waste time trying the same thing. When I first wrote it I had not tested it myself, because I'm using this scenario in production with special constraints and fancy stuff is not allowed ^_^ Basically, you start your region with

val counterRegion: ActorRef = ClusterSharding(system).start(
  typeName = "Counter",
  entryProps = Some(Props[Counter]),
  idExtractor = idExtractor,
  shardResolver = shardResolver)

What if you, in your region creation actor, do something like:

var providedPar1 = v1
def providePar1 = providedPar1

val counterRegion: ActorRef = ClusterSharding(system).start(
  typeName = "Counter",
  entryProps = Some(Props(classOf[Counter], providePar1)),
  idExtractor = idExtractor,
  shardResolver = shardResolver)

And then you change the value of providedPar1 before each creation? The downside is that, even if it worked, you would need to avoid changing providedPar1 until you were 100% sure that the actor had been created, or you would risk it reading the new, wrong value (yay, race conditions!)
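One plausible reason this cannot work, offered as an assumption rather than a confirmed diagnosis: Props(classOf[Counter], providePar1) receives its arguments by value, so providePar1 is evaluated once, when the Props is built, and later changes to providedPar1 are never seen. The sketch below reproduces that behaviour in plain Scala; Recipe and recipe are hypothetical stand-ins for Props, not Akka APIs.

```scala
object EagerArgs {
  // Stand-in for Props: stores the argument values it was given.
  final case class Recipe(args: List[Any])

  // By-value varargs, like the argument list of Props(classOf[...], args).
  def recipe(args: Any*): Recipe = Recipe(args.toList)

  var providedPar1: String = "v1"
  def providePar1: String = providedPar1
}
```

Building a Recipe with providePar1 and then reassigning providedPar1 leaves the stored argument at its old value, which is the same situation every entry created from the single shared Props would be in.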

In general you are better off with option 2 imho, but in most scenarios the risks introduced by option 1 are small, and you can mitigate them properly in exchange for its simplicity (and performance) advantages.

Hope this rant helps, and let me know how it goes if you try option 3!

Diego Martinoia answered Sep 24 '22 03:09