I have an Akka.NET cluster containing a Lighthouse seed node and two other nodes running actor systems. When I attempt a graceful shutdown of one of my cluster nodes, I want at least one of the other nodes to receive a message about the node leaving, and all cluster nodes to eventually remove the leaving node from their list of nodes.
Once that has been taken care of, I expect to be able to shut down the node without the two other nodes going nuts about not being able to connect to the node that shut down.
What I have right now is a Console Application wrapped in a TopShelf Application:
    class ActorService : ServiceControl
    {
        private ActorSystem _actorSystem;

        public bool Start(HostControl hostControl)
        {
            _actorSystem = ActorSystem.Create("myActorSystem");
            var cluster = Cluster.Get(_actorSystem);
            cluster.RegisterOnMemberRemoved(_Terminate);
            return true;
        }

        public bool Stop(HostControl hostControl)
        {
            var cluster = Cluster.Get(_actorSystem);
            cluster.Leave(cluster.SelfAddress);
            return true;
        }

        private void _Terminate()
        {
            _actorSystem.Terminate();
        }
    }
Here is my main:
    class Program
    {
        static int Main(string[] args)
        {
            return (int) HostFactory.Run(x =>
            {
                x.UseAssemblyInfoForServiceInfo();
                x.RunAsLocalSystem();
                x.StartAutomatically();
                x.Service<ActorService>();
                x.EnableServiceRecovery(r => r.RestartService(1));
            });
        }
    }
When stepping through the Stop function, I can't see any message about the node leaving being received on the other nodes. When the function returns, however, the other nodes start spouting exceptions.
A user in the Akka.NET Gitter channel said:
I have observed the same thing even without TopShelf I must say, with a pure ASP.NET Core project after the webhost terminated.
What can I add to have the other nodes receive a message about the node leaving?
I think the problem is that the Stop() method completes before the node has finished leaving the cluster. You should wait for the MemberRemoved event.
The following Stop() method blocks until the MemberRemoved callback has been called and has signaled that it has also terminated the actor system.
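    using System.Threading;
    using Akka.Actor;

    class Worker
    {
        // Signaled once the actor system has fully terminated.
        private static readonly ManualResetEvent asTerminatedEvent = new ManualResetEvent(false);
        private ActorSystem actorSystem;

        public void Start()
        {
            this.actorSystem = ActorSystem.Create("sample");
        }

        public void Stop()
        {
            var cluster = Akka.Cluster.Cluster.Get(actorSystem);
            // Register the callback before leaving so the removal cannot be missed.
            cluster.RegisterOnMemberRemoved(() => MemberRemoved(actorSystem));
            cluster.Leave(cluster.SelfAddress);
            // Block until MemberRemoved has terminated the actor system.
            asTerminatedEvent.WaitOne();
            //log.Info("Actor system terminated, exiting");
        }

        // async void because RegisterOnMemberRemoved expects a plain callback.
        private async void MemberRemoved(ActorSystem actorSystem)
        {
            await actorSystem.Terminate();
            asTerminatedEvent.Set();
        }
    }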
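One caveat with waiting this way is that WaitOne() without a timeout blocks forever if the removal is never confirmed, for example when no other node is reachable. A minimal sketch of a bounded variant follows. The class name BoundedLeaveWorker and the 30-second LeaveTimeout are assumptions for illustration and are not part of the original answer; the sketch uses only the cluster calls already shown plus TaskCompletionSource from the .NET base class library.

```csharp
using System;
using System.Threading.Tasks;
using Akka.Actor;

// Hypothetical variant of Worker that gives up waiting after a timeout.
class BoundedLeaveWorker
{
    // Assumed value; tune it to the host's own stop timeout.
    private static readonly TimeSpan LeaveTimeout = TimeSpan.FromSeconds(30);
    private ActorSystem actorSystem;

    public void Start()
    {
        actorSystem = ActorSystem.Create("sample");
    }

    // Returns true if the node was removed from the cluster within the timeout.
    public bool Stop()
    {
        var cluster = Akka.Cluster.Cluster.Get(actorSystem);
        var removed = new TaskCompletionSource<bool>();

        // Register before leaving so the removal cannot be missed.
        cluster.RegisterOnMemberRemoved(() => removed.TrySetResult(true));
        cluster.Leave(cluster.SelfAddress);

        // Wait for the removal, but no longer than LeaveTimeout.
        bool leftGracefully = removed.Task.Wait(LeaveTimeout);

        // Terminate the actor system whether or not the leave was confirmed.
        actorSystem.Terminate().Wait(LeaveTimeout);
        return leftGracefully;
    }
}
```

If Stop() returns false, the node shut down without the cluster confirming its removal, so the remaining nodes may still report it as unreachable for a while.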
Note: I worked out how to leave the cluster cleanly for three types of apps and published the samples on GitHub. There are still some exceptions and a few dead letters when leaving, but the other nodes no longer continuously try to reconnect to the exited node.
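To check on the remaining nodes that the departure is actually announced, it may help to run a small actor there that subscribes to cluster membership events and logs them. The sketch below follows the subscription pattern from the Akka.NET cluster documentation. The actor name ClusterListener and the log messages are illustrative assumptions, not something taken from the GitHub samples.

```csharp
using Akka.Actor;
using Akka.Cluster;
using Akka.Event;

// Hypothetical listener to run on the nodes that stay in the cluster.
class ClusterListener : ReceiveActor
{
    private readonly ILoggingAdapter log = Context.GetLogger();
    private readonly Akka.Cluster.Cluster cluster = Akka.Cluster.Cluster.Get(Context.System);

    public ClusterListener()
    {
        // Sent when a member has asked to leave the cluster.
        Receive<ClusterEvent.MemberLeft>(e =>
            log.Info("Member is leaving: {0}", e.Member.Address));

        // Sent when the leaving member has reached the Exiting state.
        Receive<ClusterEvent.MemberExited>(e =>
            log.Info("Member is exiting: {0}", e.Member.Address));

        // Sent when the member has been removed from the cluster.
        Receive<ClusterEvent.MemberRemoved>(e =>
            log.Info("Member removed: {0} (previous status {1})",
                e.Member.Address, e.PreviousStatus));

        // Ignore all other membership events.
        Receive<ClusterEvent.IMemberEvent>(_ => { });
    }

    protected override void PreStart()
    {
        // Replay the current members as events, then receive live updates.
        cluster.Subscribe(Self, ClusterEvent.InitialStateAsEvents,
            typeof(ClusterEvent.IMemberEvent));
    }

    protected override void PostStop()
    {
        cluster.Unsubscribe(Self);
    }
}
```

The listener could be started on each remaining node with something like actorSystem.ActorOf(Props.Create(() => new ClusterListener()), "cluster-listener"). If the leaving node waits as described above, these nodes should log the leaving, exiting and removed events instead of only reporting connection errors.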