
R: reducing digits/precision for saving RAM?

Tags:

r

I am running out of RAM in R with a data.table that contains ~100M rows and 40 columns of doubles. My naive thought was that I could shrink the data table by reducing the precision, since I do not need 15 digits after the decimal point. I tried rounding, but as we know

round(1.68789451154844878,3)

gives

 1.6879999999999999

and does not help. Therefore, I converted the values to integers. However, as the small example below shows for a numeric vector, this only gives a 50% reduction, from 8000040 bytes to 4000040 bytes, and the size does not shrink any further when I reduce the precision further.

Is there a better way to do that?

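set.seed(1)
options(digits=22)  # print doubles with full precision

a1 = rnorm(10^6)    # double vector
# as.integer() truncates toward zero, keeping 6, 5, 4 and 3 decimal digits
a2 = as.integer(1000000*(a1))
a3 = as.integer(100000*(a1))
a4 = as.integer(10000*(a1))
a5 = as.integer(1000*(a1))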

head(a1)
head(a2)
head(a3)
head(a4)
head(a5)

give

[1] -0.62645381074233242  0.18364332422208224 -0.83562861241004716  1.59528080213779155  0.32950777181536051 -0.82046838411801526
[1] -626453  183643 -835628 1595280  329507 -820468
[1] -62645  18364 -83562 159528  32950 -82046
[1] -6264  1836 -8356 15952  3295 -8204
[1] -626  183 -835 1595  329 -820

and

object.size(a1)
object.size(a2)
object.size(a3)
object.size(a4)
object.size(a5)

give

8000040 bytes
4000040 bytes
4000040 bytes
4000040 bytes
4000040 bytes
asked Nov 09 '22 11:11 by HOSS_JFL


1 Answer

Not as such, no. In R, an integer takes 4 bytes and a double takes 8, regardless of the value stored. If you allocate space for 1M integers, you will necessarily need 4M bytes of RAM for the vector, however few digits each value carries.
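
A minimal sketch of that point, assuming a standard 64-bit build of R. The helper `bytes_per_element` is a hypothetical name introduced here for illustration, and the table totals assume exactly 1e8 rows and 40 columns, which the question only gives approximately.

```r
# Estimate bytes per element from the size difference between two vector
# lengths, which cancels out the fixed per-vector header.
bytes_per_element <- function(make_vector, n = 1e6) {
  small <- as.numeric(object.size(make_vector(n)))
  large <- as.numeric(object.size(make_vector(2 * n)))
  (large - small) / n
}

double_bytes  <- bytes_per_element(function(n) numeric(n))  # 8
integer_bytes <- bytes_per_element(function(n) integer(n))  # 4

set.seed(1)
x <- rnorm(1e6)

# Rounding leaves the type (double) unchanged, so the size is unchanged too.
same_after_round <-
  as.numeric(object.size(round(x, 3))) == as.numeric(object.size(x))

# The scale factor does not matter either: every integer vector of this
# length has the same size.
same_across_scales <-
  as.numeric(object.size(as.integer(1e6 * x))) ==
  as.numeric(object.size(as.integer(1e3 * x)))

# Rough totals for the table in the question
# (assumed: exactly 1e8 rows and 40 columns).
rows <- 1e8
cols <- 40
gb_as_double  <- rows * cols * double_bytes  / 1e9  # 32 GB
gb_as_integer <- rows * cols * integer_bytes / 1e9  # 16 GB
```

One caveat if you do go the integer route: R integers are 32-bit, so `as.integer()` returns `NA` (with a warning) for scaled values outside roughly ±2.1e9.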

answered Nov 15 '22 07:11 by Avraham