1. How do we run the notebook from the command line?
2. Further to 1, how would I pass command-line arguments into the notebook? I.e. how do I access the command-line args from within the notebook code?
After creating a notebook, you can either run the required paragraphs by using the Run option for a paragraph or run all paragraphs by using the Run All Paragraphs option. If your cluster is running Zeppelin 0.8 or a later version, then all the paragraphs of the notebook are run sequentially.
Validating Zeppelin
You can also check the status by opening the Zeppelin host with the port number that you configured for it in zeppelin-env.sh in a web browser: for example, http://zeppelin.local:9995.
Open Zeppelin in your browser by navigating to http://localhost:8080. In Zeppelin in the browser, open the drop-down menu at anonymous in the upper-right corner of the page, and choose Interpreter. On the interpreters page, search for spark, and choose edit on the right.
I had the same issue and managed to work out how to use the REST API to run a notebook with curl. As for passing in command-line arguments, I think there is simply no way to do that; you will have to use some sort of shared state on the server (e.g. have the notebook read from a file, and modify that file before each run).
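A minimal sketch of that shared-state idea, for what it's worth: a wrapper script dumps its own arguments into a file, one per line, before triggering the notebook. The path and the function name `write_params` are made up for illustration, and the notebook would need a paragraph that reads the same file on the Zeppelin server.

```shell
#!/usr/bin/env bash
# Hypothetical location; the notebook must be able to read the same path.
params_file="${PARAMS_FILE:-/tmp/zeppelin_notebook_params.txt}"

# Write each argument on its own line so a notebook paragraph can
# read them back in order. The file is truncated first, so stale
# arguments from a previous run never leak into this one.
write_params() {
  : > "${params_file}"
  local arg
  for arg in "$@"; do
    printf '%s\n' "${arg}" >> "${params_file}"
  done
}

write_params "$@"
```

This only works as-is when the script runs on the same machine as Zeppelin (or the file lives on storage both can see), and arguments containing newlines would need a different encoding.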
Anyway, this is how I managed to run a notebook. It assumes jq is installed, and that ${ip} and ${notebook_id} are set to the Zeppelin host and the id of the notebook. Pretty involved :(
curl -XGET http://${ip}:8080/api/interpreter/setting | jq '.body[] | .id'
interpreter_settings_ids=`curl -XGET http://${ip}:8080/api/interpreter/setting | jq '.body[] | .id'`
id_array="["`echo ${interpreter_settings_ids} | tr ' ' ','`"]"
curl -XPUT -d "${id_array}" http://${ip}:8080/api/notebook/interpreter/bind/${notebook_id}
curl -XPOST http://${ip}:8080/api/notebook/job/${notebook_id}
If someone has manually clicked the "save" button for the interpreter binding then only the last command is required.
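In case the id_array line looks cryptic: jq prints one quoted id per line, the unquoted echo collapses the newlines into spaces, and tr turns the spaces into commas. Here is the same transformation on two made-up ids, so it can be tried without a Zeppelin server.

```shell
#!/usr/bin/env bash
# Two made-up interpreter ids, one per line, quoted the way jq prints strings.
interpreter_settings_ids='"2ABC"
"2DEF"'

# The unquoted expansion is deliberate: word splitting joins the lines with spaces.
id_array="["`echo ${interpreter_settings_ids} | tr ' ' ','`"]"

echo "${id_array}"
```

This prints ["2ABC","2DEF"], which is the JSON array that the bind call expects as its body.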
UPDATE:
OK I think you can loop to probe the status of the running notebook to determine if the notebook failed, see: https://github.com/eBay/Zeppelin/blob/master/docs/rest-api/rest-notebook.md
For example
function job_success {
# Fetch the paragraph statuses once so both counts come from the same response.
statuses=`curl -XGET http://${ip}:8080/api/notebook/job/${notebook_id} 2>/dev/null | jq '.body[] | .status'`
num_cells=`echo "${statuses}" | grep -c '"'`
num_successes=`echo "${statuses}" | grep -c FINISHED`
# Require at least one paragraph, so a failed request is not counted as success.
test "${num_cells}" -gt 0 && test "${num_cells}" = "${num_successes}"
}
function job_fail {
curl -XGET http://${ip}:8080/api/notebook/job/${notebook_id} 2>/dev/null | jq '.body[] | .status' | grep ERROR
}
until job_success || job_fail
do
sleep 10
done
As of version 0.7.3 and perhaps earlier, Zeppelin has a REST API that lets you run notebooks. Your shell script can use curl to access the API.
The API includes methods to delete a paragraph and to insert a paragraph at a particular index. This allows you to express all your "parameters" as variables in paragraph 0 and then use them in later paragraphs. Make 3 calls to the REST API in this order:
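The list of calls did not survive here, so what follows is only one plausible reading of it: delete the old parameter paragraph, insert a fresh one at index 0, then run the whole notebook. It is a sketch, assuming jq is installed and that the paragraph endpoints look the way I remember them from the Zeppelin notebook REST documentation; the function names, the "params" title and the paragraph text are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: host, note id, paragraph id and parameter text are placeholders.
base_url="http://${ip:-localhost}:8080/api/notebook"

# Build the JSON body for a new paragraph at index 0; jq takes care of escaping.
param_payload() {
  jq -nc --arg text "$1" '{title: "params", text: $text, index: 0}'
}

replace_params_and_run() {
  local note_id="$1" old_paragraph_id="$2" text="$3"
  # 1. Delete the old parameter paragraph.
  curl -s -XDELETE "${base_url}/${note_id}/paragraph/${old_paragraph_id}"
  # 2. Insert the new parameter paragraph at index 0.
  curl -s -XPOST -H 'Content-Type: application/json' \
    -d "$(param_payload "${text}")" "${base_url}/${note_id}/paragraph"
  # 3. Run all paragraphs of the notebook.
  curl -s -XPOST "${base_url}/job/${note_id}"
}
```

A call would look something like replace_params_and_run "${notebook_id}" "${old_paragraph_id}" 'val runDate = "2024-01-01"'. As far as I can tell the insert call returns the id of the new paragraph in its response, and that id is what has to be deleted on the next run.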