
Select data frame values row-wise using a variable of column names

Suppose I have a data frame that looks like this:

dframe = data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
#   x y
# 1 1 4
# 2 2 5
# 3 3 6

And a vector of column names, one per row of the data frame:

colname = c('x', 'y', 'x')

For each row of the data frame, I would like to select the value from the column named in the corresponding element of the vector. This is similar to dframe[, colname], but applied row by row.

Thus, I want to obtain c(1, 5, 3), i.e. row 1: column "x"; row 2: column "y"; row 3: column "x".

rimorob asked Jun 29 '18 03:06



2 Answers

My old favourite, matrix indexing, takes care of this. Just pass a two-column matrix holding the row index and column index of each value you want:

rownames(dframe) <- seq_len(nrow(dframe))
dframe[cbind(rownames(dframe), colname)]
#[1] 1 5 3

Or, if you don't want to add rownames:

dframe[cbind(seq_len(nrow(dframe)), match(colname, names(dframe)))]
#[1] 1 5 3
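
For reuse, the second form can be wrapped in a small function. This is only a sketch: `pick_by_row` is a hypothetical name, not part of base R, and the checks are one possible way to fail early on bad input. Note that indexing a data frame with a matrix goes through `as.matrix()`, so if the columns have mixed types the result is coerced to a common type (typically character).

```r
# Hypothetical helper: pick one value per row, given one column name per row.
pick_by_row <- function(df, cols) {
  stopifnot(length(cols) == nrow(df))
  col_idx <- match(cols, names(df))
  if (anyNA(col_idx)) {
    stop("unknown column name(s): ",
         paste(unique(cols[is.na(col_idx)]), collapse = ", "))
  }
  # Two-column matrix of (row index, column index) pairs.
  idx <- cbind(seq_len(nrow(df)), col_idx)
  df[idx]
}

dframe <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
colname <- c("x", "y", "x")
result <- pick_by_row(dframe, colname)

# Mixed column types: the values come back as character.
mixed <- data.frame(x = c(1, 2, 3), y = c("a", "b", "c"),
                    stringsAsFactors = FALSE)
mixed_result <- pick_by_row(mixed, colname)
```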
thelatemail answered Oct 19 '22 23:10


You can use mapply to walk over the row numbers of dframe and the column names in parallel, returning the value at each (row, column) pair:

dframe = data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
colname = c('x', 'y', 'x')

mapply(function(i, col) dframe[i, col], seq_len(nrow(dframe)), colname)

#[1] 1 5 3
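
A closely related variant, shown here only as a sketch, uses `vapply` so that the return type is declared up front. It assumes every selected value is a single numeric; with other column types the `numeric(1)` template would need to change. The name `picked` is just for illustration.

```r
dframe <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
colname <- c("x", "y", "x")

# For row i, take column colname[i] and then its i-th element.
# vapply raises an error if any result is not a length-1 numeric.
picked <- vapply(
  seq_len(nrow(dframe)),
  function(i) dframe[[colname[i]]][i],
  numeric(1)
)
```

Compared with `mapply`, this fails loudly instead of silently returning a list or a coerced vector when the types do not line up.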

The next option may not be very intuitive, but if you want a solution in a dplyr chain, one way using gather is:

library(tidyverse)

data.frame(colname = c('x', 'y', 'x'), stringsAsFactors = FALSE) %>%
  rownames_to_column() %>%
  left_join(dframe %>% rownames_to_column() %>%
              gather(colname, value, -rowname),
            by = c("rowname", "colname")) %>%
  select(rowname, value)

#   rowname value
# 1       1     1
# 2       2     5
# 3       3     3
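
In current tidyr, `gather` is superseded by `pivot_longer`. The same idea could be written as below; this is a sketch assuming dplyr, tidyr (1.0.0 or later) and tibble are installed, and the intermediate names `lookup` and `long` are only for illustration.

```r
library(dplyr)
library(tidyr)
library(tibble)

dframe <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))

# One row per data frame row, holding the column name to pick.
lookup <- data.frame(colname = c("x", "y", "x"), stringsAsFactors = FALSE) %>%
  rownames_to_column()

# Long format: one row per (rowname, colname) pair with its value.
long <- dframe %>%
  rownames_to_column() %>%
  pivot_longer(-rowname, names_to = "colname", values_to = "value")

# Keep only the (rowname, colname) pairs requested in lookup.
result <- lookup %>%
  left_join(long, by = c("rowname", "colname")) %>%
  select(rowname, value)
```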
MKR answered Oct 19 '22 23:10