My use case is the following: an application on one machine connects to remote machines, executes scripts on them, and brings back the results. I am using the Akka framework for remoting and the Play Framework for the client application. The code of the server running on each remote machine is as follows:
public static void main(String[] args) {
    OnCallServer app = new OnCallServer();
    app.executeServer();
}

private void executeServer() {
    ActorSystem system = ActorSystem.create("OnCallServer");
}
(This just starts an instance of the actor system on the remote machine.)
Now, when the client application wants to run a script on the remote machine, it deploys an actor on this remote system, and that actor executes the script.
The code of the deployed actor is as follows:
public static class RemoteActor extends UntypedActor implements Serializable {
    private static final long serialVersionUID = 1L;

    @Override
    public void onReceive(Object message) throws Exception {
        Config config = context().system().settings().config();
        String host = config.getConfig("akka.remote.netty.ssl").getString("machineName");
        String sysDesc = host;
        if (message instanceof ScriptExecutionParams) {
            System.out.println("scriptParam");
            ScriptExecutionParams scriptParams = (ScriptExecutionParams) message;
            if (scriptParams.function == ScriptFunction.EXECUTE) {
                getSender().tell(executeScript(scriptParams.getName(), scriptParams.getArgument(), sysDesc), getSelf());
            } else if (scriptParams.function == ScriptFunction.DEPLOY) {
                getSender().tell(deployScript(scriptParams.getName(), scriptParams.getContent(), sysDesc), getSelf());
            } else if (scriptParams.function == ScriptFunction.REMOVE) {
                getSender().tell(removeScript(scriptParams.getName(), sysDesc), getSelf());
            }
        }
    }
}
(It receives the script parameters, performs the requested function, and returns the result.)
I am using a TCP connection over SSL for remoting. The config is as follows:
remote {
  enabled-transports = ["akka.remote.netty.ssl"]
  netty.ssl {
    hostname = "localhost"  # on the client; the machine's own hostname on the remote servers
    port = 10174            # on the client; 10175 on the servers
    enable-ssl = true
  }
  netty.ssl.security {
    key-store = "clientKeystore.jks"
    trust-store = "clientTruststore.jks"
    key-store-password = "xxx"
    key-password = "xxx"
    trust-store-password = "xxx"
    protocol = "SSLv3"
    enabled-algorithms = [SSL_RSA_WITH_NULL_SHA]
    random-number-generator = ""
  }
}
This setup works perfectly, but sometimes the remote machine becomes unreachable. I have noticed this happening in two cases:
The things that confuse me are:
I have tried adding a supervisorStrategy in the client actor, but it has no effect. Am I doing something wrong? If the TCP connection is the problem, is there a way to terminate the connection after each execution? If the problem is that the actor system shuts down when it is not used for a long time, is there a config setting to change this? Please ask if you need more code or information.
Update
When I restart the client while testing on my local machine, there is no problem. The remote server just throws akka.remote.EndpointAssociationException messages, but it reconnects and is able to send replies. The problem arises only in production, when the apps are deployed on separate machines. I think my client is getting quarantined on restart, and akka.remote.quarantine-systems-for has been removed in the new Akka version.
OK, I found the problem. For anyone else who might face it: in the config files of the remote machines, in the netty.ssl part of the config, I gave their respective hostnames, since I used these in the client application to connect. But in the client application's config I gave the hostname as "localhost", as I thought I would not need it anywhere.
Checking the logs in DEBUG mode, I found that when the initial connection was established, the association was as follows:
2014-05-01 18:35:38.503UTC DEBUG[OnCallServer-akka.actor.default-dispatcher-3] Remoting - Associated [akka.ssl.tcp://[email protected]:10175] <- [akka.ssl.tcp://application@localhost:10174]
even though the client app was not on that machine's localhost. This session didn't give any errors. But after the connection was lost (after restarting the client app) and I tried re-executing the script, I got these logs:
2014-05-01 18:36:12.045UTC ERROR[OnCallServer-akka.actor.default-dispatcher-2] a.r.EndpointWriter - AssociationError [akka.ssl.tcp://[email protected]:10175] -> [akka.ssl.tcp://application@localhost:10174]: Error [Association failed with [akka.ssl.tcp://application@localhost:10174]] [ akka.remote.EndpointAssociationException: Association failed with [akka.ssl.tcp://application@localhost:10174] Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: localhost/127.0.0.1:10174
The server app was trying to send this message back to its own localhost, because "localhost" was the address the client had advertised for itself.
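To spot this kind of misconfiguration early, one option is to check whether the address a system advertises points at a loopback host. The following is only a minimal sketch using the plain JDK; AdvertisedAddress, hostOf and isLoopback are hypothetical names made up for illustration and are not part of Akka.

```java
import java.net.InetAddress;
import java.net.URI;
import java.net.UnknownHostException;

// Hypothetical helper for illustration; it is not part of Akka.
final class AdvertisedAddress {
    private AdvertisedAddress() {}

    // Extracts the host part of an address such as
    // akka.ssl.tcp://application@localhost:10174
    static String hostOf(String akkaAddress) {
        String host = URI.create(akkaAddress).getHost();
        if (host == null) {
            throw new IllegalArgumentException("No host in address: " + akkaAddress);
        }
        return host;
    }

    // True when the advertised host resolves to a loopback address, which a
    // peer on another machine cannot use to connect back.
    // Names that do not resolve at all are reported as not loopback.
    static boolean isLoopback(String akkaAddress) {
        try {
            return InetAddress.getByName(hostOf(akkaAddress)).isLoopbackAddress();
        } catch (UnknownHostException e) {
            return false;
        }
    }
}
```

Calling AdvertisedAddress.isLoopback with the client address from the log above (the one ending in application@localhost:10174) would flag it, since "localhost" normally resolves to a loopback address.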
Changing the hostname in the client config to its actual hostname solved the problem.
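If you would rather not hard-code the hostname in each client's config file, it could in principle be looked up at startup and supplied as an override. This is a rough sketch under that assumption, not what I actually did; RemotingHostname and its methods are hypothetical names, and the setting path is taken from the config shown in the question.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper for illustration; it is not part of Akka or Play.
final class RemotingHostname {
    private RemotingHostname() {}

    // Looks up this machine's host name, or returns the fallback if the lookup fails.
    static String localHostName(String fallback) {
        try {
            return InetAddress.getLocalHost().getCanonicalHostName();
        } catch (UnknownHostException e) {
            return fallback;
        }
    }

    // Builds a one-line HOCON override for the netty.ssl hostname setting.
    static String hostnameOverride(String host) {
        if (host == null || host.trim().isEmpty()) {
            throw new IllegalArgumentException("host must not be empty");
        }
        return "akka.remote.netty.ssl.hostname = \"" + host + "\"";
    }
}
```

The resulting string could be parsed with ConfigFactory.parseString(...) and layered over the regular config with withFallback(ConfigFactory.load()) before the actor system is created. On a machine with a poorly configured hosts file the lookup may itself return "localhost", so it is worth logging the value that ends up being used.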