I’ve been an h2o user for a little over a year and a half now, but my work has been limited to the R API; H2O Flow is relatively new to me. If it's new to you as well, it's basically 0xdata's version of IPython. However, IPython lets you export your notebook to a script, and I can't find a similar option in Flow...
I’m at the point of moving a model (built in Flow) to production, and I'm wondering how to automate it. With the R API, after the model was built and saved, I could easily load it in R and make predictions on new data simply by running nohup Rscript <the_file> & from the CLI, but I'm not sure how to do something similar with Flow, especially since it's running on Hadoop.
As it currently stands, every run is broken into three pieces, with the Flow step creating a relatively clunky process in the middle: nslookup the IP address h2o is running on, then manually run the flow cell-by-cell. This is a terribly intrusive production process, and I want to tie all the ends up, but Flow is making it rather difficult. To distill the question: is there a way to compress the flow into a Hadoop jar and then later just run that jar, like hadoop jar <my_flow_jar.jar> ...?
Here's the h2o R package documentation. The R API allows you to load an H2O model, so I tried loading the flow as if it were an H2O model, and unsurprisingly it did not work (it failed with a water.api.FSIOException), since a flow is not technically an h2o model.
This is really late, but H2O Flow models now have auto-generated Java code representing the trained model (called a POJO), which can be copied and pasted (say, from your remote Hadoop session into a local Java file). See this quickstart tutorial on how to use the Java object: https://h2o-release.s3.amazonaws.com/h2o/rel-turing/1/docs-website/h2o-docs/pojo-quick-start.html. You'll have to refer to the h2o Java API (https://h2o-release.s3.amazonaws.com/h2o/rel-turing/8/docs-website/h2o-genmodel/javadoc/hex/genmodel/easy/EasyPredictModelWrapper.html) to customize how you use the POJO, but you essentially treat it as a black box that makes predictions on properly formatted inputs.
Assuming your Hadoop session is remote, replace "localhost" in the example with the IP address of your (remote) Flow session.
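To give a feel for the black-box usage the quickstart describes, here is a minimal sketch of scoring one row with the POJO via EasyPredictModelWrapper. It assumes a binomial classifier; the POJO class name (gbm_model) and the column names (AGE, CAPSULE) are placeholders you'd replace with whatever your downloaded POJO actually defines, and compiling/running it requires the h2o-genmodel.jar downloaded from your H2O instance on the classpath.

```java
import hex.genmodel.easy.EasyPredictModelWrapper;
import hex.genmodel.easy.RowData;
import hex.genmodel.easy.prediction.BinomialModelPrediction;

public class Main {
  public static void main(String[] args) throws Exception {
    // gbm_model is the hypothetical class name of the POJO you pasted
    // from Flow; it extends hex.genmodel.GenModel.
    hex.genmodel.GenModel rawModel = new gbm_model();
    EasyPredictModelWrapper model = new EasyPredictModelWrapper(rawModel);

    // Build one input row; values go in as strings keyed by column name.
    RowData row = new RowData();
    row.put("AGE", "68");       // placeholder feature column
    row.put("CAPSULE", "0");    // placeholder feature column

    // Score the row (predictBinomial assumes a binomial model; other
    // model types have their own predict* methods on the wrapper).
    BinomialModelPrediction p = model.predictBinomial(row);
    System.out.println("Predicted label: " + p.label);
    for (double prob : p.classProbabilities) {
      System.out.println("Class probability: " + prob);
    }
  }
}
```

Compiled against h2o-genmodel.jar (javac -cp h2o-genmodel.jar ...), this runs entirely offline with no H2O cluster, which is what makes the POJO route attractive for production scoring.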