Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Multiplying and Adding Values across Rows

I have this data frame:

color <- c("AKZ", "ZZA", "KAK")    
color_1 <- sample(color, 100, replace=TRUE, prob=c(0.4, 0.3, 0.3))
id = 1:100

sample_data = data.frame(id, color_1)


 id color_1
1  1     KAK
2  2     AKZ
3  3     KAK
4  4     KAK
5  5     AKZ
6  6     ZZA

Suppose there is a legend:

  • K = 3
  • A = 4
  • Z = 6

I want to add two columns to the above data frame:

  • sample_data$add_score : e.g. KAK = K + A + K = 3 + 4 + 3 = 10
  • sample_data$multiply_score : e.g. KAK = K * A * K = 3 * 4 * 3 = 36

I thought of solving the problem like this:

sample_data$first = substr(color_1,1,1)
sample_data$second = substr(color_1,2,2)
sample_data$third = substr(color_1,3,3)

sample_data$first_score = ifelse(sample_data$first == "K", 3, ifelse(sample_data$first == "A", 4, 6))
 
sample_data$second_score = ifelse(sample_data$second == "K", 3, ifelse(sample_data$second == "A", 4, 6))

sample_data$third_score = ifelse(sample_data$third == "K", 3, ifelse(sample_data$third == "A", 4, 6))

sample_data$add_score = sample_data$first_score + sample_data$second_score + sample_data$third_score

sample_data$multiply_score = sample_data$first_score * sample_data$second_score * sample_data$third_score

But I think this way would take a long time if the length of "color_1" was longer. Given a scoring legend, is there a faster way to do this?

Thank you!

like image 651
stats_noob Avatar asked Jun 10 '26 21:06

stats_noob


1 Answers

Here is a way.
The main trick is to strsplit into single characters and match these vectors with the legend. Then add or multiply the matching numbers.

set.seed(2022)
color <- c("AKZ", "ZZA", "KAK")    
color_1 <- sample(color, 100, replace=TRUE, prob=c(0.4, 0.3, 0.3))
id <- 1:100

sample_data = data.frame(id, color_1)

legend <- setNames(c(3, 4, 6), c("K", "A", "Z"))

add_mul <- function(x, l){
  add <- function(y, l){
    i <- match(y, names(l))
    sum(l[i])
  }
  mul <- function(y, l){
    i <- match(y, names(l))
    prod(l[i])
  }
  
  s <- strsplit(x, "")
  add_score <- sapply(s, add, l = l)
  mul_score <- sapply(s, mul, l = l)
  data.frame(add_score, mul_score)
}

sample_data <- cbind(sample_data, add_mul(sample_data$color_1, legend))
head(sample_data)
#>   id color_1 add_score mul_score
#> 1  1     ZZA        16       144
#> 2  2     KAK        10        36
#> 3  3     AKZ        13        72
#> 4  4     KAK        10        36
#> 5  5     AKZ        13        72
#> 6  6     KAK        10        36

Created on 2022-03-10 by the reprex package (v2.0.1)

like image 121
Rui Barradas Avatar answered Jun 13 '26 12:06

Rui Barradas