I am running H2O through RStudio Server on a Linux server with 64 GB of RAM. When I initialize the cluster, it reports that the total cluster memory is only 9.78 GB. I have tried using the max_mem_size parameter, but the cluster still only gets 9.78 GB:
localH2O <<- h2o.init(ip = "localhost", port = 54321, nthreads = -1, max_mem_size = "25g")
H2O is not running yet, starting it now...
java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)
Connection successful!
R is connected to the H2O cluster:
H2O cluster uptime: 5 hours 10 minutes
H2O cluster version: 3.10.4.6
H2O cluster version age: 19 days
H2O cluster name: H2O_started_from_R_miweis_mxv543
H2O cluster total nodes: 1
H2O cluster total memory: 9.78 GB
H2O cluster total cores: 16
H2O cluster allowed cores: 16
H2O cluster healthy: TRUE
H2O Connection ip: localhost
H2O Connection port: 54321
H2O Connection proxy: NA
H2O Internal Security: FALSE
R Version: R version 3.3.3 (2017-03-06)
I ran the following on the server to verify the amount of memory available:
cat /proc/meminfo
MemTotal: 65806476 kB
EDIT:
I looked into this issue further, and it seems to be a default within the JVM. When I started H2O directly from Java I was able to pass in the flag -Xmx32g, and it did increase the memory. I could then connect to that H2O instance from RStudio and have access to the increased memory. I was wondering if there is a way to change this default value in the JVM and allow more memory, so I don't have to start the H2O instance from the command line first and then connect to it from RStudio Server.
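For reference, attaching to that command-line-started instance from R looks roughly like this (startH2O = FALSE tells h2o.init() to only connect to a running instance and never launch a new JVM):

library(h2o)
# Attach to the instance started with -Xmx32g; h2o.init() will error out
# instead of starting a fresh JVM if nothing is listening on the port
localH2O <- h2o.init(ip = "localhost", port = 54321, startH2O = FALSE)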
By default, h2o.init() first checks whether an H2O instance is reachable. If it cannot connect and startH2O = TRUE with ip = "localhost", it will attempt to start an instance of H2O at localhost:54321.
Essentially, all of the computation and data involved in machine learning lives in the distributed memory of the H2O cluster itself. You can think of a cluster as a group of nodes that share memory and computation; a node could be a server, an EC2 instance, or your laptop.
The max_mem_size argument in the h2o R package is functional, so you can use it to start an H2O cluster of whatever size you want; you don't need to start it from the command line using -Xmx.
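For example, assuming nothing is already listening on the default port, this alone is enough to get a 25 GB cluster straight from R:

library(h2o)
# No instance is running yet, so h2o.init() launches a new JVM with a 25 GB heap
h2o.init(nthreads = -1, max_mem_size = "25g")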
What seems to be happening in your case is that you are connecting to an existing H2O cluster located at localhost:54321 that was limited to "10G" (in reality, 9.78 GB). So when you run h2o.init() from R, it simply connects to the existing cluster (with a fixed amount of memory) rather than starting a new H2O cluster with the memory you specified in max_mem_size, and so the memory request gets ignored.
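You can see this in your own output: the "H2O cluster uptime: 5 hours 10 minutes" line shows the cluster existed long before that h2o.init() call. You can re-print the same details at any time with h2o.clusterInfo():

h2o.clusterInfo()  # prints version, total memory, and uptime of the attached cluster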
To fix this, you should do one of the following:

- Kill the existing H2O cluster at localhost:54321 and restart it from R with the desired memory requirement (a sketch follows this list), or
- When starting up with h2o.init(), specify the argument min_mem_size=. This forces H2O to use at least that amount of memory, while max_mem_size= prevents H2O from using more than that amount.
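A minimal sketch of the first option, assuming you are still connected to the old cluster as in your session above:

library(h2o)
h2o.shutdown(prompt = FALSE)  # kill the existing 9.78 GB cluster at localhost:54321
Sys.sleep(5)                  # give the old JVM a moment to release the port
# The memory arguments take effect now because a new JVM is actually launched
localH2O <- h2o.init(ip = "localhost", port = 54321, nthreads = -1,
                     min_mem_size = "25g", max_mem_size = "25g")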