Suppose I have a data frame that looks like this:
dframe = data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
#   x y
# 1 1 4
# 2 2 5
# 3 3 6
And a vector of column names, one per row of the data frame:
colname = c('x', 'y', 'x')
For each row of the data frame, I would like to select the value from the corresponding column in the vector. Something similar to dframe[, colname] but for each row.
Thus, I want to obtain c(1, 5, 3) (i.e. row 1: col "x"; row 2: col "y"; row 3: col "x").
My favourite old matrix-indexing will take care of this. Just pass a 2-column matrix with the respective row/column index:
rownames(dframe) <- seq_len(nrow(dframe))
dframe[cbind(rownames(dframe),colname)]
#[1] 1 5 3
Or, if you don't want to add rownames:
dframe[cbind(seq_len(nrow(dframe)), match(colname,names(dframe)))]
#[1] 1 5 3
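To make the matrix-indexing idea concrete, here is a minimal sketch that builds the two-column index explicitly before using it. The names idx and result are illustrative only. One caveat: indexing a data frame with a matrix goes through as.matrix(), so a data frame with mixed column types would come back as character values.

```r
dframe <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
colname <- c("x", "y", "x")

# Each row of idx is one (row number, column number) pair to extract.
idx <- cbind(seq_len(nrow(dframe)), match(colname, names(dframe)))
idx
#      [,1] [,2]
# [1,]    1    1
# [2,]    2    2
# [3,]    3    1

# Indexing with a two-column matrix returns one element per pair.
result <- dframe[idx]
result
# [1] 1 5 3
```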
One can use mapply to pass the row number (of dframe) and the column name for each row, returning the value in that cell.
A solution using mapply could be:
dframe = data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
colname = c('x', 'y', 'x')
mapply(function(x,y)dframe[x,y],1:nrow(dframe), colname)
#[1] 1 5 3
The next option may not be very intuitive, but if someone wants a solution in a dplyr chain, one way using gather is:
library(tidyverse)
data.frame(colname = c('x', 'y', 'x'), stringsAsFactors = FALSE) %>%
  rownames_to_column() %>%
  left_join(dframe %>% rownames_to_column() %>%
              gather(colname, value, -rowname),
            by = c("rowname", "colname")) %>%
  select(rowname, value)
#   rowname value
# 1       1     1
# 2       2     5
# 3       3     3
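Since gather() is marked as superseded in tidyr, the same reshape-and-join idea can be sketched with pivot_longer(). This is an alternative under that assumption, not part of the answer above. The names wanted, long, and value are illustrative.

```r
library(dplyr)
library(tidyr)
library(tibble)

dframe <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
colname <- c("x", "y", "x")

# One row per requested (rowname, colname) pair.
wanted <- data.frame(colname = colname, stringsAsFactors = FALSE) %>%
  rownames_to_column()

# Long form of dframe: one row per cell, keyed by rowname and colname.
long <- dframe %>%
  rownames_to_column() %>%
  pivot_longer(-rowname, names_to = "colname", values_to = "value")

# Keep only the cells that were asked for, in the original row order.
value <- wanted %>%
  left_join(long, by = c("rowname", "colname")) %>%
  pull(value)
value
# [1] 1 5 3
```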