I'm trying to build a random forest classification model with the H2O
library in R, on a training set with 70 million rows and 25 numeric features. The total file size is 5.6 GB.
The validation file is 1 GB.
My system has 16 GB of RAM and an 8-core CPU.
Both files load successfully into H2O frames.
I then run the following command to build the model:
model <- h2o.randomForest(x = c(1:18,20:25), y = 19, training_frame = traindata,
validation_frame = testdata, ntrees = 150, mtries = 6)
But after a few minutes (before any tree has been built), I get the following error:
"Error in .h2o.doSafeREST(conn = conn, h2oRestApiVersion = h2oRestApiVersion, : Unexpected CURL error: Recv failure: Connection reset by peer"
However, if I run the same code with 1 tree, it completes successfully.
Is this error caused by a memory issue? Any help will be appreciated.
It's an OutOfMemoryError. A variation of this error message on the R side is:
Error in .h2o.doSafeREST(conn = conn, h2oRestApiVersion = h2oRestApiVersion, :
Unexpected CURL error: Empty reply from server
Checking the H2O server logs, which you should do as well, shows:
10-08 20:11:57.165 192.168.0.4:54321 2125 #58072-18 INFO: Total file size: 1.81 GB
10-08 20:11:57.165 192.168.0.4:54321 2125 #58072-18 INFO: Parse chunk size 4194304
onExCompletion for water.parser.ParseDataset$MultiFileParseTask@3588360e
java.lang.OutOfMemoryError: Java heap space
:
:
Exception in thread "FJ-0-11" java.lang.OutOfMemoryError: Java heap space
2015-10-08 20:13:14.493:WARN:oejut.QueuedThreadPool:1 threads could not be stopped
10-08 20:13:23.033 192.168.0.4:54321 2125 FJ-0-5 ERRR: Out of Memory, Heap Space exceeded, increase Heap Size, from /192.168.0.4:54321
10-08 20:13:23.458 192.168.0.4:54321 2125 FJ-0-3 ERRR: Out of Memory, Heap Space exceeded, increase Heap Size, from /192.168.0.4:54321
10-08 20:13:23.033 192.168.0.4:54321 2125 FJ-0-13 ERRR: Out of Memory, Heap Space exceeded, increase Heap Size, from /192.168.0.4:54321
10-08 20:13:23.033 192.168.0.4:54321 2125 FJ-0-7 ERRR: Out of Memory, Heap Space exceeded, increase Heap Size, from /192.168.0.4:54321
10-08 20:13:26.541 192.168.0.4:54321 2125 FJ-0-5 FATAL: Exiting.
10-08 20:13:26.574 192.168.0.4:54321 2125 FJ-0-7 FATAL: Exiting.
10-08 20:13:26.575 192.168.0.4:54321 2125 FJ-0-3 FATAL: Exiting.
10-08 20:13:26.575 192.168.0.4:54321 2125 FJ-0-13 FATAL: Exiting.
I am running this on H2O Slater (3.2.0.5), so the exact log output may vary depending on your version.
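Since the log says to increase the heap size, one thing to try is restarting H2O with a larger Java heap before loading the data. A minimal sketch, assuming a local H2O instance started from R: the "12g" value is only an illustrative guess for a 16 GB machine, and the file paths are hypothetical placeholders, not taken from the question.

```r
library(h2o)

# If a local H2O instance is already running, shut it down first so the
# new heap setting takes effect (uncomment if needed).
# h2o.shutdown(prompt = FALSE)

# Start H2O with all cores and a larger Java heap.
# "12g" is an assumed value: leave some RAM for the OS and for R itself.
h2o.init(nthreads = -1, max_mem_size = "12g")

# Hypothetical paths; replace with the actual training and validation files.
traindata <- h2o.importFile("train.csv")
testdata  <- h2o.importFile("test.csv")

# Same call as in the question.
model <- h2o.randomForest(x = c(1:18, 20:25), y = 19,
                          training_frame = traindata,
                          validation_frame = testdata,
                          ntrees = 150, mtries = 6)
```

If this still runs out of memory, the data and the model together probably do not fit in the heap that 16 GB of RAM allows, and a smaller training sample or fewer trees may be needed.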
You are probably out of memory. Try watching the system's memory usage while the forest is growing. Also try launching the training directly from the H2O web console (http://localhost:54321/ by default); it may give a more detailed error.
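Memory can also be checked from the R side rather than the web console. A rough sketch, assuming an H2O cluster is already running and connected; the exact fields printed differ between H2O versions, and "h2o_logs" is just a hypothetical output directory.

```r
library(h2o)

# Connects to the running local instance (or starts one if none is found).
h2o.init()

# Print a summary of the cluster, including its memory figures.
# Run this before and during training to see how much headroom is left.
h2o.clusterInfo()

# Download the server logs to a local directory so they can be searched
# for "OutOfMemoryError" after a failed run.
h2o.downloadAllLogs(dirname = "h2o_logs")
```

Note that if the H2O process has already exited after the error, the logs have to be read from disk on the machine that ran it instead.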