
Lock or protect a data.table in R

Tags: r, data.table

Are there one or more ways to lock or protect a data.table such that it can no longer be modified in-place?

Say we have a data.table:

library(data.table)
dt <- data.table(id = 1, val = "foo")
dt
#    id val
# 1:  1 foo

Can I then modify dt to get the following behavior after?

dt[, val:="bar"]
# error or warning
dt
#    id val
# 1:  1 foo  ## unmodified

Context

This came up because I author a small R package at work that uses data.table extensively. It has some data.tables in it (translation tables) which, if accidentally modified by a user, would cause issues (improper translations). I had hoped that making the data "internal" (as defined here) would solve this but it does not.

Because this is only an issue with data.table objects, I could just use data.frames, copying + casting to data.table as needed within functions. I will go this route if needed (my tables are small enough that the time/memory overhead won't be noticed), but I'm hopeful there's a more natural solution.

ClaytonJY asked Mar 16 '15 19:03


1 Answer

Here are a couple of possible ideas.

You could write your own wrapper object (possibly using the R6 package) whose editing methods raise an error and leave the underlying data.table unchanged, while read access is passed through to the standard data.table functionality.

You could follow the approach of the petals function in the TeachingDemos package.

Neither of the above is perfect, and a determined person could still change the tables. They are probably also not worth the work needed.
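As a rough illustration of the wrapper idea, here is a minimal sketch that uses a plain closure instead of R6 and hands out copies rather than raising errors on edits. The names locked_table, read, and translations are hypothetical and only serve the example.

```r
library(data.table)

# Hypothetical constructor: keeps a private copy and only hands out copies.
locked_table <- function(dt) {
  stored <- copy(dt)                    # private copy, detached from the caller's object
  list(read = function() copy(stored))  # each read returns a fresh copy
}

translations <- locked_table(data.table(id = 1, val = "foo"))

scratch <- translations$read()
scratch[, val := "bar"]                 # changes only the scratch copy
current_val <- translations$read()$val  # read again from the stored table
```

The stored table can still be reached through the closure's environment, so this only guards against accidental modification, in line with the caveat above.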

You could reread your tables each time your function runs, so that changes would need to be made on the disk, not just in R.
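A minimal sketch of rereading on every call might look like the following. The file location, the translate function, and the example rows are assumptions made for illustration; in a package the path would more likely be resolved with something like system.file().

```r
library(data.table)

# Hypothetical location; written here only so the example is self-contained.
table_path <- tempfile(fileext = ".csv")
fwrite(data.table(id = c(1, 2), val = c("foo", "baz")), table_path)

# Hypothetical lookup function: rereads the table from disk on every call,
# so in-memory edits to an earlier copy cannot affect the result.
translate <- function(ids, path = table_path) {
  tbl <- fread(path)
  tbl$val[match(ids, tbl$id)]
}

result <- translate(c(2, 1))
```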

There are tools/packages that compute checksums such as MD5 sums, so you could calculate one for your data.table ahead of time, then have your code recompute it at run time and stop if it has changed.
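One way to sketch the checksum check, assuming it is acceptable to hash a CSV serialization of the table with tools::md5sum (which works on files) rather than hashing the object itself. The helper names table_md5 and check_unmodified are hypothetical.

```r
library(data.table)

# Hypothetical helper: MD5 of the table's CSV form, written to a temporary file.
table_md5 <- function(dt) {
  tmp <- tempfile(fileext = ".csv")
  on.exit(unlink(tmp))
  fwrite(dt, tmp)
  unname(tools::md5sum(tmp))
}

# Hypothetical guard: stop if the table no longer matches the stored checksum.
check_unmodified <- function(dt, expected) {
  if (!identical(table_md5(dt), expected)) {
    stop("translation table has been modified", call. = FALSE)
  }
  invisible(TRUE)
}

translations <- data.table(id = 1, val = "foo")
expected_md5 <- table_md5(translations)    # computed once, e.g. at build time

ok_before <- check_unmodified(translations, expected_md5)

translations[, val := "bar"]               # accidental in-place edit
msg <- tryCatch(check_unmodified(translations, expected_md5),
                error = function(e) conditionMessage(e))
```

This detects a change but does not undo it, so the calling code still has to decide how to recover (for example by rereading the table).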

You could save the data.tables in an .RData-style file and attach that file to the search path rather than load it into the global environment. The tables could still be changed, but this is less likely to happen by accident and would take more effort. Make sure that your code does not pick up local copies in the global environment (use get or ::, or check that a local copy does not exist).
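A minimal sketch of the attach approach, assuming a file created with save(). The file path, the search-path name "translation_tables", and the object name translations are hypothetical.

```r
library(data.table)

# Hypothetical file; created here only so the example is self-contained.
rdata_path <- tempfile(fileext = ".RData")
translations <- data.table(id = 1, val = "foo")
save(translations, file = rdata_path)
rm(translations)                       # no copy left in the current environment

# Attach the file to the search path instead of load()-ing it.
attach(rdata_path, name = "translation_tables")

# Look the table up explicitly in the attached entry, not in the global environment.
tbl <- get("translations", pos = "translation_tables")
```

Fetching with an explicit pos avoids silently picking up a same-named object that a user has created in the global environment.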

Greg Snow answered Oct 10 '22 09:10