
Data Deprecation in R-package

One of my R packages contains some silly example data that I would like to remove. I'd like to follow the usual route: first deprecate it, then make it defunct.

For removing functions from an R-package I found a way like this:

oldFunc <- function()
{
    .Deprecated("newFunc")
}

followed, after let's say 6 months, by

oldFunc <- function()
{
    .Defunct("newFunc")
}

And then after another 6 months I could delete the function from the package.

However, how do I remove a data object that is stored as /data/myData.rda in the package and also has a myData.Rd description?

Daniel Fischer asked Oct 23 '15 14:10



2 Answers

Good question. Unfortunately, I did not find an existing answer,
so I'm sharing what I have drafted to solve this case. I know it's not perfect, but I hope it will be useful and/or improved upon.

Update

So after a first draft (see below), I've applied a process that seems to be a reasonable, good-enough solution.

1. Move the data file

The first step is to move the data file out of its default location, ./data, so that it is no longer loaded automatically (lazy loading included).

The target location is ./data-raw, a directory used by convention to store the scripts and raw data needed to update or reproduce the exported dataset. More on that in the Data chapter of the book R Packages.

By convention, I use a leg_ prefix to flag it as a legacy dataset.

$ mv ./data/my_data.rda ./data-raw/leg_my_data.rda

2. Write a script to transform the dataset

The code used to transform the dataset from its legacy to its new format is stored along with the legacy dataset in ./data-raw/my_data.R. This will make the whole process reproducible.

# my_data new version

library(tidyverse)

# Load legacy data -----
load("data-raw/leg_my_data.rda")
leg_my_data <- my_data

# Create the new dataset -----
# Perform here every change that has to be performed
my_data <- leg_my_data %>%
  # rename() takes new_name = old_name: the legacy column cat becomes categ
  rename(categ = cat) %>%
  arrange(categ)

# Write the new dataset ----
usethis::use_data(my_data, overwrite = TRUE, compress = 'xz')

Source the file and you're done: the new version is live!

source('./data-raw/my_data.R', echo=TRUE)
# ✓ Saving 'my_data' to 'data/my_data.rda'
# ● Document your data (see 'https://r-pkgs.org/data.html')

my_data
# A tibble: 10 x 2
#   categ   val
#   <fct> <int>
# 1 a         9
# 2 a         6
# 3 a         4

3. Secret sauce

In the ./R/my_package-package.R file, create a legacy_mode function. It gives users a way to load the previous (legacy) version of the datasets when they need them for compatibility reasons.

#' Load legacy version of datasets.
#'
#' Load legacy (previous) version of all the datasets for compatibility reason.
#' The environment where data will be loaded can be chosen.
#'
#' @param envdir the environment where the data should be loaded.
#' @param verbose should item names be printed during loading?
#'
#' @export
legacy_mode <- function(envdir = parent.frame(), verbose = TRUE) {
  .Deprecated(msg = "This function replaces datasets with the previous (legacy) version for compatibility reason")
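  # TODO: To be improved to load a subset of datasets
  # Note: these relative paths are resolved against the current working directory
  paths <- sort(Sys.glob(c("data-raw/leg_*.rda", "data-raw/leg_*.RData")))
  # Iterate over the paths directly: 1:length(paths) would yield c(1, 0) when no file matches
  for (path in paths) {
    load(path, envir = envdir, verbose = verbose)
  }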
}
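
One caveat with the function above: the relative data-raw/ paths are resolved against the current working directory, and data-raw/ is usually listed in .Rbuildignore, so the files are only found when the function is called from the package source directory. Below is a minimal sketch of a variant that takes the directory as an argument. It assumes the legacy files are shipped with the package under inst/extdata and that the package is called mypackage; both are assumptions for illustration, not part of the process described here, and load_legacy is a hypothetical name.

```r
# Hypothetical helper: load every legacy file (leg_*.rda / leg_*.RData)
# found in `dir` into the environment `envir`.
load_legacy <- function(dir, envir = parent.frame(), verbose = FALSE) {
  paths <- sort(list.files(dir, pattern = "^leg_.*\\.(rda|RData)$",
                           full.names = TRUE))
  loaded <- character(0)
  for (path in paths) {
    # load() returns the names of the objects it created
    loaded <- c(loaded, load(path, envir = envir, verbose = verbose))
  }
  invisible(loaded)
}

# Inside an installed package the directory would be resolved at run time,
# e.g. (assuming inst/extdata and a package named "mypackage"):
# load_legacy(system.file("extdata", package = "mypackage"))

# Self-contained demonstration with a temporary directory
tmp <- tempfile("legacy")
dir.create(tmp)
my_data <- data.frame(cat = c("a", "c", "b"), val = c(9L, 2L, 3L))
save(my_data, file = file.path(tmp, "leg_my_data.rda"))
rm(my_data)

env <- new.env()
loaded <- load_legacy(tmp, envir = env)
print(loaded)
print(names(env$my_data))
```

Loading into a dedicated environment, as in the demonstration, avoids silently overwriting the current my_data in the caller's workspace.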

4. Result

And so now you have access to both the new version of the dataset available by default and the legacy version if needed for compatibility reasons. If the legacy data is used, a proper deprecation message is displayed.

# The current version
head(my_data, 3)
# A tibble: 3 x 2
#   categ   val
#   <fct> <int>
# 1 a         9
# 2 a         6
# 3 a         4

# Activation of the legacy mode
legacy_mode()
# Loading objects:
#   my_data
# Warning message:
# This function replaces datasets with the previous (legacy) version for compatibility reason

# Legacy version
head(my_data, 3)
# A tibble: 3 x 2
#   cat     val
#   <fct> <int>
# 1 a         9
# 2 c         2
# 3 b         3

Do not forget to document your changes by updating the dataset documentation in R/my_data.R. You can mention the legacy mode there in a note.
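
As an illustration, the dataset documentation could look something like the roxygen2 sketch below. The title, description and wording of the note are assumptions made for the example; only the column names, the object name and legacy_mode come from the steps above.

```r
# R/my_data.R: documentation sketch, wording is illustrative only

#' My data (current version)
#'
#' A tibble with the columns \code{categ} and \code{val}.
#'
#' @note The column \code{cat} of the previous version is now called
#'   \code{categ}. The previous (legacy) version can still be loaded with
#'   \code{\link{legacy_mode}}.
"my_data"
```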

Note: I've also written a blog post on this topic with a little more content.

First draft

1. Rename the deprecated data

The objective is to move the previous data to a file prefixed with dep_. The new data will then replace it.

# Moving the deprecated data prefixed with dep_
dep_my_data <- my_data
usethis::use_data(dep_my_data)

# Overwriting data with the new version of the dataset
my_data <- new_data
usethis::use_data(my_data, overwrite = TRUE)

2. Declare deprecation in the documentation

#' MyData package
#'
#' Note: this dataset is the new version. If you want to use the old one for compatibility reasons,
#' please use \code{\link{dep_my_data}} instead.
#'
#' @docType data
#'
#' @rdname my_data
"my_data"

#' [Deprecated] MyData package
#'
#' Note: this dataset still exists but will be removed (defunct) in the next version.
#' Please use \code{\link{my_data}} instead.
#'
#' @docType data
#'
#' @rdname dep_my_data
"dep_my_data"

3. Result

> data()

# dep_my_data   [Deprecated] MyData package
# my_data       MyData package
Romain answered Oct 09 '22 19:10


In addition to using .Deprecated and .Defunct, you should also:

  • remove the data set from data/ (users can still obtain archived versions of the package from CRAN)
  • add a note in the NEWS/CHANGELOG

This answer summarizes the comments under the original question, for the benefit of future readers.

csgillespie answered Oct 09 '22 17:10