
How to "publish" a large number of actors in CAF?

I've just learned about CAF, the C++ Actor Framework.

The one thing that surprised me is that the way to make an actor available over the network is to "publish" it to a specific TCP port.

This basically means that the number of actors you can publish is limited by the number of available ports (64k). Since you need one port to publish an actor and one port to access a remote actor, I assume that two processes could each share at best about 32k actors, even though each could probably hold a million actors on a commodity server. It would be even worse if the cluster had, say, 10 nodes.

To make publishing scalable, each process should only need to open one port for all the actors in its system, and one connection to each actor system it wants to access.

Is there a way to publish one actor as a proxy for all actors in an actor system ( preferably without any significant performance loss )?

Sebastien Diot asked Oct 18 '22 05:10



2 Answers

Let me add some background. The middleman::publish/middleman::remote_actor function pair does two things: it connects two CAF instances and gives you a handle for communicating with a remote actor. The actor you "publish" on a given port is meant to act as an entry point. It is a convenient rendezvous point, nothing more.

All you need to communicate between two actors is a handle. Of course you need to somehow learn new handles if you want to talk to more actors. The remote_actor function is simply a convenient way to implement a rendezvous between two actors. However, after you learn the handle you can freely pass it around in your distributed system. Actor handles are network transparent.

Also, CAF will always maintain a single TCP connection between two actor systems. If you publish 10 actors on host A and "connect" to all 10 actors from host B via remote_actor, you'll see that CAF initially opens 10 connections (because the target node could run multiple actor systems), but all but one of them will get closed.

If you don't care about the rendezvous for actors offered by publish/remote_actor, you can also use middleman::open and middleman::connect instead. This only connects two CAF instances without exchanging actor handles. Instead, connect returns a node_id on success. This is all you need for some features, for example remote spawning of actors.

Is there a way to publish one actor as a proxy for all actors in an actor system ( preferably without any significant performance loss )?

You can publish one actor at a port whose sole purpose is to model a rendezvous point. If that actor sends 1000 more actor handles to a remote actor, this will not cause any additional network connections.

Writing a custom actor that explicitly models the rendezvous between multiple systems by offering some sort of dictionary is the recommended way.
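To make the "dictionary" idea a bit more concrete, here is a minimal sketch of the state such a rendezvous actor could keep. It is plain standard C++ (C++17) and deliberately uses no CAF calls: ActorHandle and Directory are hypothetical names invented for this illustration, and ActorHandle is only a stand-in for CAF's own network-transparent handle types. A real implementation would wrap this logic in an actor whose behavior answers "put" and "get" messages.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a network-transparent actor handle.
// In CAF this role is played by the library's own handle types.
struct ActorHandle {
    std::uint64_t node; // identifies the actor system hosting the actor
    std::uint64_t id;   // identifies the actor within that system
};

// State of the single published rendezvous actor: a name -> handle map.
// Only this one actor needs a port; everything else is found through it.
class Directory {
public:
    // Registers the handle under `name`, replacing any previous entry.
    void put(const std::string& name, ActorHandle handle) {
        assert(!name.empty() && "directory keys must be non-empty");
        entries_.insert_or_assign(name, handle);
    }

    // Returns the handle stored under `name`, or std::nullopt if unknown.
    std::optional<ActorHandle> get(const std::string& name) const {
        auto it = entries_.find(name);
        if (it == entries_.end())
            return std::nullopt;
        return it->second;
    }

    // Number of registered handles.
    std::size_t size() const { return entries_.size(); }

private:
    std::unordered_map<std::string, ActorHandle> entries_;
};
```

Unlike the built-in registry, a dictionary like this can use keys of any length and could store typed handles, at the cost of writing the lookup protocol yourself.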

Just for the sake of completeness: CAF also has a registry mechanism. However, keys are limited to atom values, i.e., 10 characters or fewer. Since the registry is generic, it also only stores strong_actor_ptr and leaves type safety to you. However, if that's all you need: you put handles into the registry (see actor_system::registry) and then access this registry remotely via middleman::remote_lookup (you only need a node_id to do this).

neverlord answered Nov 01 '22 13:11



Smooth scaling with ( almost ) no limits is alpha & omega

One way, used in agent-based systems ( not sure whether CAF has implemented tools for going this way ), is to use multiple transport-classes { inproc:// | ipc:// | tcp:// | .. | vmci:// } and pick among them on an as-needed basis.

While building a proxy may sound attractive, welding together two different actor-models, one "atop" the other, is not as simple to achieve as it sounds ( event loops are fragile to tune, to keep free of blocking and to have handle events in a fair manner - they do not like any other master to try to take their own Hat ... ).

In case CAF at the moment provides no transport-means other than TCP, one may still resort to O/S-level measures and use the features of the ISO-OSI-model up to their limits, as necessary. For example, add a further IP-address to an interface, since each address brings its own range of ports:

sudo ip address add 172.16.100.17/24 dev eth0

or better, make the additional IP-addresses permanent - i.e. edit the file /etc/network/interfaces ( on Ubuntu ) and add as many stanzas as needed, so that it looks like:

iface eth0 inet static
    address 172.16.100.17/24

iface eth0 inet static
    address 172.16.24.11/24

This way the configuration-space can be extended for cases where CAF does not provide any other means for such actors than the said TCP (address:port#)-transport-class.

user3666197 answered Nov 01 '22 14:11
