
Accessing Arbitrary Columns from an R Data Frame using with()

Tags: dataframe, r

Suppose that I have a data frame with a column whose name is stored in a variable. Accessing this column through the variable is easy with bracket notation:

df <- data.frame(A = rep(1, 10), B = rep(2, 10))
column.name <- 'B'

df[,column.name]

But it is not obvious how to access an arbitrary column using a call to with(). The naive approach

with(df, column.name)

effectively evaluates column.name in the caller's environment. How can I delay evaluation sufficiently that with() will provide the same results that brackets give?

johnmyleswhite asked Apr 10 '10 18:04




2 Answers

You can use get:

with(df, get(column.name))
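
As a minimal sketch of why this works, reusing the df and column.name from the question (the result names naive, via_get and via_brackets are only illustrative): the bare symbol is not a column of df, so it is found in the calling environment and yields the string itself, whereas get() looks up the object named by that string, starting inside the data frame.

```r
df <- data.frame(A = rep(1, 10), B = rep(2, 10))
column.name <- 'B'

# column.name is not a column of df, so with() finds it in the
# calling environment and returns the string 'B' itself.
naive <- with(df, column.name)

# get() takes the string and looks up the object of that name,
# searching the data frame's columns first.
via_get <- with(df, get(column.name))

# Bracket indexing, for comparison.
via_brackets <- df[, column.name]

# TRUE when both approaches return the same vector.
identical(via_get, via_brackets)
```

If the surrounding with() call is not otherwise needed, df[[column.name]] returns the same vector directly.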
Eduardo Leoni answered Oct 15 '22 19:10


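You use 'with' to create a localized and temporary namespace inside which you evaluate some expression. In your code above, the expression you passed is only the variable column.name, which evaluates to the string it holds rather than to a column of the data frame.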

For instance:

data(iris)   # this data is in your R installation, just call 'data' and pass it in

Ordinarily you have to refer to variable names within a data frame like this:

tx = tapply(iris$Sepal.Length, list(iris$Species), mean)

Unless you do this:

attach(iris)

The problem with using 'attach' is the likelihood of name clashes with objects already on the search path, so you've got to remember to call 'detach' when you are done.

It's much cleaner to use 'with':

tx = with( iris, tapply(Sepal.Length, list(Species), mean) )

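So, the call signature (informally) is: with( data, expr ), where expr is any R expression (not a function) that gets evaluated with the columns of data visible as variables.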
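
A short sketch of that point, assuming the built-in iris data set, whose columns are named Sepal.Length, Species and Petal.Length (the result names by_dollar, by_with and spread are only illustrative). The second argument to with() is an ordinary expression, so it can be a single call or a braced block of several statements:

```r
data(iris)

# Referring to the columns explicitly through the data frame.
by_dollar <- tapply(iris$Sepal.Length, list(iris$Species), mean)

# The same computation, with the column names resolved inside iris.
by_with <- with(iris, tapply(Sepal.Length, list(Species), mean))

# TRUE when both forms give the same per-species means.
identical(by_dollar, by_with)

# The expression may also be a braced block; its last value is returned.
spread <- with(iris, {
  rng <- range(Petal.Length)
  rng[2] - rng[1]
})
```

Variables assigned inside the block, such as rng here, are local to the evaluation and do not modify iris.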

doug answered Oct 15 '22 18:10