I keep getting a "No status is returned. Java SparkR backend might have failed." error when fitting a GLM with SparkR. Judging by the Spark web UI, the job appears to run to completion, but at some point during the model fit (not at a consistent location) SparkR returns the above error and drops back to the R REPL. I can't find a log anywhere that would help identify the problem. Could someone point me to the relevant log, or offer other feedback on this problem?
I can see that the error-generating code is here. It looks as though the connection specified by get(".sparkRCon", .sparkREnv) either isn't there or spuriously responds with an empty string during computation. I'm at a loss.
I'm on Spark 2.0.0 using Amazon EMR 5.0.
FWIW, in my experience this error usually means the driver has run out of memory (OOM), though that is not the only reason a driver can fail. The worker nodes all completed their operations, but the driver failed while assembling the result. Troubleshooting this was not obvious, since SparkR obscures a lot of errors. I found it by running the same query in pyspark, where the driver's Java OOM error was visible.
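If the driver really is running out of memory, one thing to try is giving it more. This is a minimal sketch, not a confirmed fix: the "8g" value and the app name are placeholders I made up, and it assumes the session is started from a plain R process (for example RStudio) so that the driver JVM has not been launched yet.

```r
library(SparkR)

# Assumption: "8g" is a placeholder; size it to the RAM actually
# available on the node that hosts the driver.
sparkR.session(
  appName = "glm-driver-memory-check",  # hypothetical name, for illustration only
  sparkConfig = list(
    spark.driver.memory = "8g"          # driver JVM heap size
  )
)
```

If you start SparkR through the sparkR shell script instead, the driver JVM is already running by the time your R code executes, so the setting above likely comes too late. In that case the memory would have to be passed at launch, e.g. `sparkR --driver-memory 8g`. Either way, the Environment tab of the Spark web UI lists the properties the application was started with, which is a way to check whether the value was picked up.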