Consider the classical "Word Count" program. It counts the number of words in all files in a directory. The master receives a directory path and splits the job among worker actors (each worker handles one file). Here is the pseudo-code:
class WordCountWorker extends Actor {
  def receive = {
    case FileToCount(fileName: String) =>
      val count = countWords(fileName)
      sender ! WordCount(fileName, count)
  }
}

class WordCountMaster extends Actor {
  def receive = {
    case StartCounting(docRoot) => // sending each file to a worker
      val workers = createWorkers()
      val fileNames = scanFiles(docRoot)
      sendToWorkers(fileNames, workers)
    case WordCount(fileName, count) => // aggregating results
      ...
  }
}
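For the pseudo-code above to compile, a message protocol would need to be defined along these lines (a sketch; the names are taken directly from the snippet):

```scala
// Message protocol assumed by the pseudo-code above (names from the snippet).
final case class FileToCount(fileName: String)           // master -> worker
final case class WordCount(fileName: String, count: Int) // worker -> master
final case class StartCounting(docRoot: String)          // scheduler -> master
```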
But I want to run this Word Count program on a schedule (for example, every minute), providing different directories to scan. Akka provides a nice way to schedule message passing:

system.scheduler.schedule(0.seconds, 1.minute, wordCountMaster, StartCounting(directoryName))
The problem with this scheduler arises when a new tick fires while the previous message is still being processed (for example, I send a message to scan some big directory, and the next message arrives to scan another directory before the first scan has completed). As a result, my WordCountMaster receives WordCount messages from workers that are processing different directories.
As a workaround, instead of scheduling a message send, I can schedule execution of a code block that creates a new WordCountMaster every time, i.e. one directory = one WordCountMaster. But I think this is inefficient, and I also need to take care of providing unique names for each WordCountMaster to avoid InvalidActorNameException.
So my question is: should I create a new WordCountMaster for each tick, as described above? Or are there better ideas/patterns for redesigning this program to support scheduling?
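If a fresh master per tick is acceptable, the naming problem can be solved by deriving a unique, readable name per run instead of relying on auto-generated names. A minimal sketch (the helper name is hypothetical; the actor-creation call is shown only as a comment, since it depends on the surrounding ActorSystem):

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical helper: derive a unique, readable actor name per scheduled run,
// avoiding InvalidActorNameException when a fresh master is created each tick.
object MasterNames {
  private val counter = new AtomicLong(0)
  def next(): String = s"wordCountMaster-${counter.incrementAndGet()}"
}

// Inside the scheduled code block one would then do something like:
// system.scheduler.schedule(0.seconds, 1.minute) {
//   system.actorOf(Props[WordCountMaster], MasterNames.next()) ! StartCounting(dir)
// }
```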
Update: when creating one master actor per directory, I run into some problems:

InvalidActorNameException: actor name [WordCountMaster] is not unique!

and

InvalidActorNameException: actor name [WordCountWorker] is not unique!

I can overcome this by simply not providing actor names, but then my actors receive auto-generated names like $a, $b, etc. That's not good for me.
I want to extract the configuration of my routers into application.conf, i.e. I want to provide the same configuration to each WordCountWorker router. But since I'm not controlling the actor names, I can't use the configuration below, because I don't know the actor names:
/wordCountWorker {
  router = smallest-mailbox-pool
  nr-of-instances = 5
  dispatcher = word-counter-dispatcher
}
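One possible way around the unknown-names problem: Akka's deployment configuration accepts wildcards in actor paths, so if every master (whatever its generated name) creates its worker router under the same child name, a single entry can cover them all. A sketch, assuming the router child is always named wordCountWorker:

```
/*/wordCountWorker {
  router = smallest-mailbox-pool
  nr-of-instances = 5
  dispatcher = word-counter-dispatcher
}
```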
I am not an Akka expert, but I think the approach of having an actor per aggregation is not inefficient. You need to keep the concurrent aggregations separated somehow. You can either give every aggregation an id and keep them separated by that id in the one and only master actor, or you can use Akka's actor naming and lifecycle logic and delegate each counting round to an actor that lives just for that one aggregation.
For me, the usage of one actor per aggregation seems more elegant.
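The id-based variant can be sketched as plain bookkeeping that the single master actor would hold (actor plumbing omitted; all names here are illustrative, not part of any Akka API):

```scala
// Hypothetical bookkeeping for the single-master variant: each scheduled scan
// gets its own job id, so results from different directories never mix.
final case class JobId(value: Long)

final class Aggregations {
  private var results  = Map.empty[JobId, Map[String, Int]]
  private var expected = Map.empty[JobId, Int]

  // Called when a StartCounting tick arrives: remember how many files were sent out.
  def start(id: JobId, fileCount: Int): Unit = {
    expected += id -> fileCount
    results  += id -> Map.empty
  }

  // Called for each worker reply carrying the job id.
  // Returns the finished aggregation once all files for that job have reported.
  def record(id: JobId, fileName: String, count: Int): Option[Map[String, Int]] = {
    val updated = results(id) + (fileName -> count)
    results += id -> updated
    if (updated.size == expected(id)) {
      results  -= id
      expected -= id
      Some(updated)
    } else None
  }
}
```

The worker messages would then carry the JobId back, so a late WordCount from an earlier directory lands in its own aggregation rather than polluting the current one.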
Also please note that Akka has an implementation of the aggregator pattern, as described here.