Repeating rows based on the collective count mentioned in the respective columns

Tags: r

How can I repeat each row of a data frame in R according to the counts given in several of its columns?

data <- data.frame(
 city=c("A","B","C","D","E","F","G"),
 score=c(83,94,1,21,2,3,0),
 J=c(2,0,1,0,3,0,0),
 K=c(0,2,0,3,0,1,0),
 L=c(1,1,0,4,0,0,0))
data

Original data frame:

[image: the original data frame]

Required data frame:

[image: the required data frame]

Each city should be repeated as many times as the largest count among its count columns. For example, city D is repeated 4 times: 3 of those rows have 1 in column K, and all 4 rows have 1 in column L.

nikki asked Mar 06 '23 13:03


2 Answers

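Another data.table solution (the output below uses the column names number, number2 and number3 from the other answer's data, rather than J, K and L):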

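library(data.table)
setDT(data)  # convert the data.frame to a data.table by reference
data[, lapply(.SD, function(x){
    # g: number of rows to emit for this city (at least 1)
    g <- pmax(max(unlist(.SD)), 1)
    # x ones followed by (g - x) zeros
    rep(1:0, c(x, g - x)) }), by = .(city, score)]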

#     city score number number2 number3
#  1:    A    83      1       0       1
#  2:    A    83      1       0       0
#  3:    B    94      0       1       1
#  4:    B    94      0       1       0
#  5:    C     1      1       0       0
#  6:    D    21      0       1       1
#  7:    D    21      0       1       1
#  8:    D    21      0       1       1
#  9:    D    21      0       0       1
# 10:    E     2      1       0       0
# 11:    E     2      1       0       0
# 12:    E     2      1       0       0
# 13:    F     3      0       1       0
# 14:    G     0      0       0       0

A city whose counts are all zero (G here) is kept as a single row of zeros. To drop such cities instead, replace g <- pmax(max(unlist(.SD)), 1) with g <- max(unlist(.SD)):

data[, lapply(.SD, function(x){
    g <- max(unlist(.SD))
    rep(1:0, c(x, g - x)) }), by = .(city, score)]
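
To see what the inner rep(1:0, c(x, g - x)) call does, it can be run on its own for a single city. This is only a small base R sketch: the counts are city D's values copied from the data above, and the names counts and expanded are introduced here for illustration.

```r
# Counts for city D, copied from the data above
counts <- c(number = 0, number2 = 3, number3 = 4)

# Number of rows to emit for this city (at least 1)
g <- max(counts, 1)

# For each count x: x ones followed by (g - x) zeros
expanded <- lapply(counts, function(x) rep(1:0, c(x, g - x)))

as.data.frame(expanded)
```

Every column ends up with g = 4 values, which is why the columns can be bound together into rows for the city.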
mt1022 answered May 02 '23 18:05


data.table solution:

data: (make sure the character columns are not factors, i.e. use stringsAsFactors = FALSE)

data <- data.frame(
    city=c("A","B","C","D","E","F","G"),
    score=c(83,94,1,21,2,3,0),
    number=c(2,0,1,0,3,0,0),
    number2=c(0,2,0,3,0,1,0),
    number3=c(1,1,0,4,0,0,0),
    stringsAsFactors = FALSE)

code: (let's have a function fun1 that does the work for us)

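library(data.table)  # attach data.table so that transpose() is found
setDT(data)

fun1 <- function(x) {
    # For each count u, build u ones (or a single 0 when u is 0).
    # Transposing twice with fill = 0 pads the shorter columns with
    # zeros, so all columns end up with the same length.
    transpose(
        transpose(
            lapply(x, function(u) if (u != 0) rep(1, u) else 0), fill = 0
        )
    )
}

data[, structure(fun1(.SD), .Names = names(.SD)), by = c("city","score")]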

result:

 #   city score number number2 number3
 #1:    A    83      1       0       1
 #2:    A    83      1       0       0
 #3:    B    94      0       1       1
 #4:    B    94      0       1       0
 #5:    C     1      1       0       0
 #6:    D    21      0       1       1
 #7:    D    21      0       1       1
 #8:    D    21      0       1       1
 #9:    D    21      0       0       1
#10:    E     2      1       0       0
#11:    E     2      1       0       0
#12:    E     2      1       0       0
#13:    F     3      0       1       0
#14:    G     0      0       0       0
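
The double transpose() in fun1 is what pads the shorter columns with zeros. A small sketch for a single city, assuming data.table is installed: the counts are city A's values copied from the data above, and the names counts, ones and padded are introduced here for illustration.

```r
library(data.table)

# Counts for city A, copied from the data above
counts <- list(number = 2, number2 = 0, number3 = 1)

# u ones per column, or a single 0 when the count is 0
ones <- lapply(counts, function(u) if (u != 0) rep(1, u) else 0)

# The first transpose pads the ragged list with 0; the second one
# restores one element per column, now all of equal length
padded <- transpose(transpose(ones, fill = 0))
names(padded) <- names(counts)

padded
```

Here number becomes 1, 1, number2 becomes 0, 0 and number3 becomes 1, 0, which matches the two rows for city A in the result above.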
Andre Elrico answered May 02 '23 19:05