
Load a dataset into R with data() using a variable instead of the dataset name


I am trying to load a dataset into R using the data() function. It works fine when I use the dataset name (e.g. data(Titanic) or data("Titanic")). What doesn't work for me is loading a dataset using a variable instead of its name. For example:

# This works fine:
> data(Titanic)

# This works fine as well:
> data("Titanic")

# This doesn't work:
> myvar <- Titanic
> data(myvar)
Warning message:
In data(myvar) : data set ‘myvar’ not found

Why is R looking for a dataset named "myvar" since it is not quoted? And since this is the default behavior, isn't there a way to load a dataset stored in a variable?

For the record, what I am trying to do is to create a function that uses the "arules" package and mines association rules using Apriori. Thus, I need to pass the dataset as a parameter to that function.

myfun <- function(mydataset) {
    data(mydataset)    # doesn't work (data set 'mydataset' not found)
    rules <- apriori(mydataset)
}

edit - output of sessionInfo():

> sessionInfo()
R version 3.0.0 (2013-04-03)
Platform: i386-w64-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] arules_1.0-14   Matrix_1.0-12   lattice_0.20-15 RPostgreSQL_0.4 DBI_0.2-7      

loaded via a namespace (and not attached):
[1] grid_3.0.0  tools_3.0.0

And the actual errors I am getting (using, for example, a sample dataset "xyz"):

xyz <- data.frame(c(1,2,3))
data(list=xyz)
Warning messages:
1: In grep(name, files, fixed = TRUE) :
  argument 'pattern' has length > 1 and only the first element will be used
2: In grep(name, files, fixed = TRUE) :
  argument 'pattern' has length > 1 and only the first element will be used
3: In if (name %in% names(rds)) { :
  the condition has length > 1 and only the first element will be used
4: In grep(name, files, fixed = TRUE) :
  argument 'pattern' has length > 1 and only the first element will be used
5: In if (name %in% names(rds)) { :
  the condition has length > 1 and only the first element will be used
6: In grep(name, files, fixed = TRUE) :
  argument 'pattern' has length > 1 and only the first element will be used

...

...

32: In data(list = xyz) :
  c("data set ‘1’ not found", "data set ‘2’ not found", "data set ‘3’ not found")
pazof asked Nov 11 '13 18:11



1 Answer

Use the list argument. See ?data.

data(list=myvar)

You'll also need myvar to be a character string holding the name of the data set, not the data set object itself.

myvar <- "Titanic"

Note that myvar <- Titanic only worked (I think) because the Titanic data set is lazy-loaded, so the assignment stores the data itself in myvar rather than its name. Most data sets in packages are loaded this way, but for other kinds of data sets you'd still need the data command.
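For the function in the question, something along these lines might work. This is only a sketch: load_dataset is a made-up helper name, and it assumes the data set creates an object with the same name as the data set (true for Titanic, but not for every data set). It loads into a temporary environment via the envir argument of data(), so nothing is left behind in the caller's workspace.

```r
# Hypothetical helper (not part of base R or arules): load a data set whose
# name is stored in a character variable and return the object itself.
load_dataset <- function(name, package = NULL) {
  stopifnot(is.character(name), length(name) == 1L)

  # Load into a private environment instead of the caller's workspace.
  env <- new.env()
  suppressWarnings(data(list = name, package = package, envir = env))

  # data() only warns when nothing is found, so check explicitly.
  if (!exists(name, envir = env, inherits = FALSE)) {
    stop("data set '", name, "' not found", call. = FALSE)
  }
  get(name, envir = env, inherits = FALSE)
}

myvar <- "Titanic"                # the name as a string, not the object
titanic <- load_dataset(myvar)
class(titanic)
```

The returned object could then be handed to apriori() inside your own function, provided it is in a form arules accepts.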

Aaron left Stack Overflow answered Sep 19 '22 08:09