I am a new user of Spark. I have a web service that allows a user to request the server to perform a complex data analysis by reading from a database and pushing the results back to the database. I have moved those analyses into various Spark applications. Currently I use spark-submit to deploy these applications.
However, I am curious: when my web server (written in Java) receives a user request, what is considered the "best practice" way to initiate the corresponding Spark application? Spark's documentation seems to recommend using spark-submit, but I would rather not shell out to a terminal to perform this action. I saw an alternative, Spark-JobServer, which provides a RESTful interface to do exactly this, but my Spark applications are written in either Java or R, which does not seem to interface well with Spark-JobServer.
Is there another best practice for kicking off a Spark application from a web server (in Java) and waiting for a status result indicating whether the job succeeded or failed?
Any ideas of what other people are doing to accomplish this would be very helpful! Thanks!
I've had a similar requirement. Here's what I did:
To submit apps, I use the hidden Spark REST Submission API: http://arturmkrtchyan.com/apache-spark-hidden-rest-api
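For reference, here is a minimal sketch of such a submission call from plain Java (java.net.http, Java 11+; the text block needs Java 15+). The master host, jar path, main class, and Spark version are placeholders for your own setup, and 6066 is the default standalone REST submission port:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SubmitSparkJob {
    public static void main(String[] args) throws Exception {
        // JSON payload for the hidden REST submission endpoint.
        // The jar path, main class, and Spark version are placeholders.
        String payload = """
            {
              "action": "CreateSubmissionRequest",
              "appResource": "file:/path/to/my-analysis.jar",
              "mainClass": "com.example.MyAnalysisJob",
              "appArgs": ["some-arg"],
              "clientSparkVersion": "1.6.0",
              "environmentVariables": { "SPARK_ENV_LOADED": "1" },
              "sparkProperties": {
                "spark.app.name": "MyAnalysisJob",
                "spark.jars": "file:/path/to/my-analysis.jar",
                "spark.master": "spark://master-host:6066",
                "spark.submit.deployMode": "cluster",
                "spark.driver.supervise": "false"
              }
            }
            """;

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://master-host:6066/v1/submissions/create"))
                .header("Content-Type", "application/json;charset=UTF-8")
                .POST(HttpRequest.BodyPublishers.ofString(payload))
                .build();

        // The response is JSON containing "submissionId" (the driver id) and "success".
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```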
Using this same API you can query the status of a driver or kill your job later.
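A sketch of the corresponding status and kill calls, assuming you kept the submissionId (driver id) returned by the create call; the master host is again a placeholder:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class DriverControl {
    private static final String MASTER_REST = "http://master-host:6066"; // default REST port
    private static final HttpClient CLIENT = HttpClient.newHttpClient();

    // Returns JSON describing the driver state (SUBMITTED, RUNNING, FINISHED, FAILED, ...)
    static String status(String driverId) throws Exception {
        HttpRequest req = HttpRequest.newBuilder()
                .uri(URI.create(MASTER_REST + "/v1/submissions/status/" + driverId))
                .GET()
                .build();
        return CLIENT.send(req, HttpResponse.BodyHandlers.ofString()).body();
    }

    // Asks the master to kill the driver
    static String kill(String driverId) throws Exception {
        HttpRequest req = HttpRequest.newBuilder()
                .uri(URI.create(MASTER_REST + "/v1/submissions/kill/" + driverId))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
        return CLIENT.send(req, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```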
There's also another hidden UI JSON API at http://[master-node]:[master-ui-port]/json/, which exposes all information available on the master UI in JSON format.
Using "Submission API" I submit a driver and using the "Master UI API" I wait until my Driver and App state are RUNNING