I'm using the Python libraries of rapids.ai, and one of the key things I'm starting to wonder is: how do I inspect memory allocation programmatically? I know I can use nvidia-smi to look at some overall high-level stats, but specifically I would like to know:
1) Is there an easy way to find the memory footprint of a cudf dataframe (and other rapids objects?)
2) Is there a way for me to determine device memory available?
I'm sure there are plenty of ways for a C++ programmer to get these details but I'm hoping to find an answer that allows me to stay in Python.
All cudf objects should have the .memory_usage() method:
import cudf
x = cudf.DataFrame({'x': [1, 2, 3]})
x_usage = x.memory_usage(deep=True)
print(x_usage)
Out:
x 24
Index 0
dtype: int64
These values reflect GPU memory used.
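cudf mirrors the pandas API here, so a CPU-only sketch with pandas as a stand-in (swap in cudf on a GPU machine) shows how to total the per-column figures into a single footprint:

```python
import pandas as pd  # stand-in for cudf; the memory_usage API is the same

df = pd.DataFrame({'x': [1, 2, 3]})

# Per-column footprint in bytes; deep=True also counts
# the payload of object-dtype columns
usage = df.memory_usage(deep=True)
print(usage['x'])  # 3 int64 values -> 24 bytes

# Total footprint of the whole DataFrame (data + index)
total_bytes = int(usage.sum())
```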
You can read the remaining available GPU memory with pynvml:
import pynvml
pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # device index 0; change on multi-GPU systems
mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
mem.free, mem.used, mem.total
Out:
(33500299264, 557973504, 34058272768)
Most GPU operations require a scratch buffer that is O(N), so you may run into RMM_OUT_OF_MEMORY errors if you end up with DataFrames or Series that are larger than your remaining available memory.
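One way to guard against that is to compare a projected allocation against the free figure pynvml reports before building the object. A minimal sketch; the fits_on_gpu helper and the 2x headroom factor are illustrative assumptions, not part of cudf or pynvml:

```python
def fits_on_gpu(nbytes, free_bytes, headroom=2.0):
    """Return True if an allocation of nbytes, plus an O(N) scratch
    buffer (modeled by the headroom multiplier), fits in free_bytes."""
    return nbytes * headroom <= free_bytes

# Example: a 1-billion-row int64 column is 8 GB of raw data
col_bytes = 1_000_000_000 * 8
free = 33_500_299_264  # e.g. mem.free from the pynvml snippet above

print(fits_on_gpu(col_bytes, free))  # 16 GB needed vs ~33.5 GB free
```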