
Pass string as name of attached data column name

Tags:

r

I know that one can pass strings as variable names using eval(parse()) or as.name(). But my problem is a bit different.

I have a string that contains both the data and column name, for example the string: data1$column2. When I try the mentioned commands I get a variable not found error for the variable data1$column2. The variable itself is of course called data1, so it cannot be found, because R interprets the whole string as a single variable name.

How do I get the $-sign working as a column reference? Some kind of paste-as-text-command would be great, too. That is, if I just could pass the string as a literal part of my console input.

EXAMPLE

attach(iris)
col_names <- cbind("iris$Sepal.Length", "iris$Sepal.Width")
col_names

Now I want to do:

"as.data.frame(parse(col_names))"

That is, to be interpreted as:

as.data.frame(cbind(iris$Sepal.Length, iris$Sepal.Width))
Joshua asked Oct 14 '25 04:10


1 Answer

Summary

In light of the various changes to the detail of the question, here are two solutions to the problem that can be phrased as:

Given

col_names <- c("Obj1$Var1", "Obj2$Var2")

how to return a data frame that would be the equivalent of

cbind(Obj1$Var1, Obj2$Var2)

?

The simplest solution would be

as.data.frame(sapply(col_names, function(x) eval(parse(text = x))))

but that uses parse(), which shouldn't be relied on for things like this. An alternative, somewhat longer, solution is

get4 <- function(x, ...) {
  fun <- function(text, ...) {
    obj <- get(text[1], ...)
    obj[[text[2]]]
  }
  sx <- strsplit(x, "\\$")
  lx <- lapply(sx, fun, ...)
  out <- do.call(cbind.data.frame, lx)
  names(out) <- x
  out
}

get4(col_names)

The second solution has advantages, despite being somewhat longer, in that it

  1. will work for data of different types, as it builds a list and converts that to a data frame. The eval(parse(text = ....)) solution simplifies to an array (a matrix) first, which coerces all columns to a single type. Using lapply() instead of sapply() gets round this, but needs extra work to set the names of the resulting object.
  2. uses the common function get() to grab the object with the stated name, plus basic subsetting syntax.
  3. doesn't use parse ;-)
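To make point 1 concrete, here is a minimal sketch with two small made-up data frames (df1 and df2 are hypothetical, not from the question), one numeric and one character. It only compares the column classes each approach gives back.

```r
# Hypothetical example data: one numeric column and one character column.
df1 <- data.frame(num = c(1.5, 2.5, 3.5))
df2 <- data.frame(chr = c("a", "b", "c"), stringsAsFactors = FALSE)
col_names <- c("df1$num", "df2$chr")

# sapply() simplifies the results to a single matrix, so the numeric
# column is coerced to text before as.data.frame() ever sees it.
via_sapply <- as.data.frame(sapply(col_names, function(x) eval(parse(text = x))))
sapply(via_sapply, class)

# lapply() keeps each column as it was; the names have to be set by hand.
lx <- lapply(col_names, function(x) eval(parse(text = x)))
via_lapply <- do.call(cbind.data.frame, lx)
names(via_lapply) <- col_names
sapply(via_lapply, class)
```

With the sapply() version the first column is no longer numeric, while the lapply() version keeps it numeric, which is the same reason get4() builds a list first.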

Original Answer

The original Answer with greater detail continues below:

eval(parse(....)) will work

# stringsAsFactors = TRUE gives the factor output below on R >= 4.0.0 too
data1 <- data.frame(column1 = 1:10, column2 = letters[1:10], stringsAsFactors = TRUE)
txt <- "data1$column2"

> eval(parse(text = txt))
 [1] a b c d e f g h i j
Levels: a b c d e f g h i j
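As a rough illustration of why the plain string lookup fails while eval(parse()) works, the sketch below uses a smaller made-up data1 (three rows, an assumption for brevity). The string is not the name of any object, but once parsed it becomes a call to `$` with data1 and column2 as its arguments.

```r
# Smaller stand-in for data1 (hypothetical, three rows only).
data1 <- data.frame(column1 = 1:3, column2 = c("a", "b", "c"))
txt <- "data1$column2"

# No object is literally named "data1$column2", which is why get(txt) fails.
exists(txt)       # FALSE

# parse() turns the text into an unevaluated call to `$`.
expr <- parse(text = txt)[[1]]
class(expr)       # "call"
as.list(expr)     # the function `$` followed by its two arguments

# eval() then runs that call and returns the column.
eval(expr)
```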

It may be more acceptable to use get(), but you'll have to do a bit of processing first, something along the lines of

get2 <- function(x, ...) {
  sx <- strsplit(x, "\\$")[[1]]
  obj <- get(sx[1], ...)
  obj[[sx[2]]]
}

> get2(txt)
 [1] a b c d e f g h i j
Levels: a b c d e f g h i j
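Two details of get2() may be worth spelling out; the sketch below is only an illustration, and the environment e and its contents are made up for the example. First, strsplit() treats its pattern as a regular expression, hence the escaped "\\$" (passing "$" with fixed = TRUE should be equivalent). Second, the ... is forwarded to get(), so arguments such as envir can be used to say where the object lives.

```r
get2 <- function(x, ...) {
  sx <- strsplit(x, "\\$")[[1]]
  obj <- get(sx[1], ...)
  obj[[sx[2]]]
}

txt <- "data1$column2"

# The split gives the object name first and the column name second.
sx <- strsplit(txt, "\\$")[[1]]
sx_fixed <- strsplit(txt, "$", fixed = TRUE)[[1]]
sx

# Hypothetical environment holding a data frame called data1.
e <- new.env()
assign("data1",
       data.frame(column1 = 1:3, column2 = c("x", "y", "z")),
       envir = e)

# envir is passed through ... to get(), so the lookup happens in e.
res <- get2(txt, envir = e)
res
```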

iris example from OP's question

As @texb mentions, the eval(parse(text = ....)) version can trivially be extended to handle a vector of strings via (modified to return a data frame)

col_names <- c("iris$Sepal.Length", "iris$Sepal.Width")
as.data.frame(sapply(col_names, function(x) eval(parse(text = x))))

  iris$Sepal.Length iris$Sepal.Width
1               5.1              3.5
2               4.9              3.0
3               4.7              3.2
4               4.6              3.1
5               5.0              3.6
6               5.4              3.9
....

Modifying get2() to work on a vector of strings such as col_names is also possible. Here I take the first element of each split string to extract the object name (checking that there is only one unique object name), then I get that object and subset it using the variable names (extracted using sapply(sx, `[`, 2))

get3 <- function(x, ...) {
  sx <- strsplit(x, "\\$")
  obj <- unique(sapply(sx, `[`, 1))
  stopifnot(length(obj) == 1L)
  obj <- get(obj, ...)
  obj[sapply(sx, `[`, 2)]
}

col_names <- c("iris$Sepal.Length", "iris$Sepal.Width")
head(get3(col_names))

> head(get3(col_names))
  Sepal.Length Sepal.Width
1          5.1         3.5
2          4.9         3.0
3          4.7         3.2
4          4.6         3.1
5          5.0         3.6
6          5.4         3.9
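The stopifnot() check in get3() means it refuses input that mixes object names. A small sketch of that behaviour, assuming a second data frame iris2 that is simply a copy of iris (a hypothetical name used only for this illustration):

```r
get3 <- function(x, ...) {
  sx <- strsplit(x, "\\$")
  obj <- unique(sapply(sx, `[`, 1))
  stopifnot(length(obj) == 1L)
  obj <- get(obj, ...)
  obj[sapply(sx, `[`, 2)]
}

# Hypothetical second data frame, a plain copy of iris.
iris2 <- iris

# Two different object names, so the check inside get3() fails;
# tryCatch() captures the error message instead of stopping the script.
res <- tryCatch(
  get3(c("iris$Sepal.Length", "iris2$Sepal.Length")),
  error = function(e) conditionMessage(e)
)
res

# A single object name still works as before.
ok <- get3(c("iris$Sepal.Length", "iris$Sepal.Width"))
head(ok)
```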

If you have multiple objects referenced in col_names then you will need a different solution, along the lines of

get4 <- function(x, ...) {
  fun <- function(text, ...) {
    obj <- get(text[1], ...)
    obj[[text[2]]]
  }
  sx <- strsplit(x, "\\$")
  lx <- lapply(sx, fun, ...)
  out <- do.call(cbind.data.frame, lx)
  names(out) <- x
  out
}

iris2 <- iris  # assumed: iris2 is a copy of iris, consistent with the output below
col_names2 <- c("iris$Sepal.Length", "iris2$Sepal.Length")
get4(col_names2)

> head(get4(col_names2))
  iris$Sepal.Length iris2$Sepal.Length
1               5.1                5.1
2               4.9                4.9
3               4.7                4.7
4               4.6                4.6
5               5.0                5.0
6               5.4                5.4
Gavin Simpson answered Oct 16 '25 18:10


