I am trying to find the column whose name matches the text in another column called "region", and return the corresponding value. My data "df" looks something like this:
region A B C D E F
H 796 792 844 812 796 776
J 568 564 508 268 320 396
A 820 804 748 528 560 600
X 292 272 260 324 224 200
M 872 812 792 760 668 656
N 100 992 972 880 872 864
C 940 948 952 916 864 880
L 960 956 952 920 900 920
E 980 968 956 940 944 932
F 236 364 460 524 552 616
P 796 792 844 812 796 776
Q 568 564 508 268 320 396
And I want to get something that looks like this:
region A B C D E F
H NA NA NA NA NA NA
J NA NA NA NA NA NA
A 820 NA NA NA NA NA
X NA NA NA NA NA NA
M NA NA NA NA NA NA
N NA NA NA NA NA NA
C NA NA 952 NA NA NA
L NA NA NA NA NA NA
E NA NA NA NA 944 NA
F NA NA NA NA NA 616
P NA NA NA NA NA NA
Q NA NA NA NA NA NA
To do this, I tried this piece of code from another question (Loop that matches row to column names and computes an average of the 3 preceding columns), but it only returns the position, and I would like to get the value, as shown in the example above.
apply(df, MARGIN = 1, FUN = function(x, i) {
  position <- which(x[['region']] == colnames(df))
})
How can I modify the code to get the real value? Thanks
A fifth option, also using base functions:
idx <- na.omit(cbind(match(names(df1), df1$region),
                     1:length(df1)))
vals <- as.integer(df1[idx])
df1[-1] <- NA
df1[idx] <- vals
df1
# region A B C D E F
#1 H NA NA NA NA NA NA
#2 J NA NA NA NA NA NA
#3 A 820 NA NA NA NA NA
#4 X NA NA NA NA NA NA
#5 M NA NA NA NA NA NA
#6 N NA NA NA NA NA NA
#7 C NA NA 952 NA NA NA
#8 L NA NA NA NA NA NA
#9 E NA NA NA NA 944 NA
#10 F NA NA NA NA NA 616
#11 P NA NA NA NA NA NA
#12 Q NA NA NA NA NA NA
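To see what the matrix index is doing, here is a minimal sketch on a cut-down, hypothetical data frame (`toy`, `row_of_col` and the values are made up for illustration and are not the `df1` used in this answer). Each row of `idx` is a (row, column) pair, so `toy[idx]` picks exactly one cell per pair.

```r
# Hypothetical 4-row example; 'toy' is an assumption for illustration only
toy <- data.frame(
  region = c("H", "A", "C", "B"),
  A = c(796L, 820L, 940L, 236L),
  B = c(792L, 804L, 948L, 364L),
  C = c(844L, 748L, 952L, 460L),
  stringsAsFactors = FALSE
)

# For each column name, the row whose 'region' equals it (NA if there is none)
row_of_col <- match(names(toy), toy$region)

# Pair each matched row with its column position and drop unmatched columns
idx <- na.omit(cbind(row_of_col, seq_along(toy)))

# Matrix indexing on a mixed-type data frame goes through a character matrix,
# hence the as.integer() conversion
vals <- as.integer(toy[idx])

toy[-1] <- NA      # blank out every value column
toy[idx] <- vals   # put the matched cells back
toy
```

Because `match()` returns only the first match, this approach assumes that each region value appears at most once in the `region` column.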
Data (thanks to @akrun):
df1 <- structure(list(region = c("H", "J", "A", "X", "M", "N", "C",
"L", "E", "F", "P", "Q"), A = c(796L, 568L, 820L, 292L, 872L,
100L, 940L, 960L, 980L, 236L, 796L, 568L), B = c(792L, 564L,
804L, 272L, 812L, 992L, 948L, 956L, 968L, 364L, 792L, 564L),
C = c(844L, 508L, 748L, 260L, 792L, 972L, 952L, 952L, 956L,
460L, 844L, 508L), D = c(812L, 268L, 528L, 324L, 760L, 880L,
916L, 920L, 940L, 524L, 812L, 268L), E = c(796L, 320L, 560L,
224L, 668L, 872L, 864L, 900L, 944L, 552L, 796L, 320L), F = c(776L,
396L, 600L, 200L, 656L, 864L, 880L, 920L, 932L, 616L, 776L,
396L)), class = "data.frame", row.names = c(NA, -12L))
Here is one option with tidyverse, where we reshape into 'long' format with pivot_longer, replace the elements in 'value' with NA where 'region' is not equal to the 'name' column value, and then reshape back to 'wide' format:
library(dplyr)
library(tidyr)
df1 %>%
  pivot_longer(cols = -region) %>%
  mutate(value = replace(value, name != region, NA)) %>%
  pivot_wider(names_from = name, values_from = value)
# region A B C D E F
#1 H NA NA NA NA NA NA
#2 J NA NA NA NA NA NA
#3 A 820 NA NA NA NA NA
#4 X NA NA NA NA NA NA
#5 M NA NA NA NA NA NA
#6 N NA NA NA NA NA NA
#7 C NA NA 952 NA NA NA
#8 L NA NA NA NA NA NA
#9 E NA NA NA NA 944 NA
#10 F NA NA NA NA NA 616
#11 P NA NA NA NA NA NA
#12 Q NA NA NA NA NA NA
Another option is imap from purrr (here imap_dfc), which passes each column together with its name to the function:
library(purrr)
imap_dfc(df1[-1], ~ replace(.x, .y != df1[['region']], NA)) %>%
  bind_cols(df1['region'], .)
# region A B C D E F
#1 H NA NA NA NA NA NA
#2 J NA NA NA NA NA NA
#3 A 820 NA NA NA NA NA
#4 X NA NA NA NA NA NA
#5 M NA NA NA NA NA NA
#6 N NA NA NA NA NA NA
#7 C NA NA 952 NA NA NA
#8 L NA NA NA NA NA NA
#9 E NA NA NA NA 944 NA
#10 F NA NA NA NA NA 616
#11 P NA NA NA NA NA NA
#12 Q NA NA NA NA NA NA
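For readers without purrr, the same column-by-column idea can be sketched in base R with `Map()`, which walks over each value column together with its name. The data frame `toy` and the result name `masked` below are hypothetical and only serve as a small illustration; they are not the `df1` from this answer.

```r
# Hypothetical 4-row example; 'toy' is an assumption for illustration only
toy <- data.frame(
  region = c("H", "A", "C", "B"),
  A = c(796L, 820L, 940L, 236L),
  B = c(792L, 804L, 948L, 364L),
  C = c(844L, 748L, 952L, 460L),
  stringsAsFactors = FALSE
)

masked <- toy

# For every value column, keep a cell only where 'region' equals the column name
masked[-1] <- Map(
  function(values, col_name) replace(values, toy$region != col_name, NA),
  toy[-1],          # the value columns
  names(toy)[-1]    # their names, passed in parallel
)
masked
```

Unlike an approach based on `match()`, this comparison is done for every row, so repeated region values would each keep their own cell.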
Or, using base R, we replicate the names of the dataset so that they line up with the value columns, compare them with the 'region' column, and change the values in those columns to NA wherever the comparison shows a mismatch:
df1[-1] <- NA^(df1$region != names(df1)[-1][col(df1[-1])]) * df1[-1]
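The one-liner above leans on the fact that `NA^0` is 1 while `NA^1` is NA, so multiplying by `NA^(mismatch)` keeps matching cells and blanks the rest. A more explicit sketch of the same masking idea, using a logical matrix built with `outer()`, might look like this. The names `toy`, `keep`, `value_cols` and `masked` are hypothetical and chosen only for illustration.

```r
# Hypothetical 4-row example; 'toy' is an assumption for illustration only
toy <- data.frame(
  region = c("H", "A", "C", "B"),
  A = c(796L, 820L, 940L, 236L),
  B = c(792L, 804L, 948L, 364L),
  C = c(844L, 748L, 952L, 460L),
  stringsAsFactors = FALSE
)

# The arithmetic behind the trick: FALSE acts as 0 and TRUE as 1
NA^FALSE   # 1
NA^TRUE    # NA

# TRUE where the row's region equals the column name (rows x value columns)
keep <- outer(toy$region, names(toy)[-1], `==`)

# Blank out every cell that is not a match
value_cols <- toy[-1]
value_cols[!keep] <- NA

masked <- toy
masked[-1] <- value_cols
masked
```

One difference worth noting: assigning NA through a logical mask leaves the columns as integers, whereas the multiplication by `NA^(...)` produces double-precision columns.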