Let's say I have this dataframe:
> df <- data.frame(A=1:5, B=c(0, 0, 3, 0, 0), C=c(1, 0, 0, 1, 0), D=c(0, 2, 0, 0, 1))
> df
A B C D
1 1 0 1 0
2 2 0 0 2
3 3 3 0 0
4 4 0 1 0
5 5 0 0 1
How would I go about converting it to:
A B C D
1 1 0 1 0
2 2 0 0 1
3 2 0 0 1
4 3 1 0 0
5 3 1 0 0
6 3 1 0 0
7 4 0 1 0
8 5 0 0 1
As you can see there are value 2
and 3
, I want to repeat them by that length and change the values back to 1. How would I do that?
I also want to duplicate the the A
column as you can see.
I tried:
replace(df[rep(rownames(df), select(df, -A)),], 2, 1)
But it gives me an error.
repeat(3) will create a list where each index value will be repeated 3 times and df. iloc[df. index. repeat(3),:] will help generate a dataframe with the rows as exactly returned by this list.
Click in a cell, or select multiple cells that you want to split. Under Table Tools, on the Layout tab, in the Merge group, click Split Cells. Enter the number of columns or rows that you want to split the selected cells into.
One option would be to get max value from columns B
, C
and D
using pmax
, use uncount
to repeat the rows. Use pmin
to replace the values greater than 1 to 1.
library(dplyr)
library(tidyr)
df %>%
mutate(repeat_row = pmax(B, C, D)) %>%
uncount(repeat_row) %>%
mutate(across(-A, pmin, 1))
# A B C D
#1 1 0 1 0
#2 2 0 0 1
#3 2 0 0 1
#4 3 1 0 0
#5 3 1 0 0
#6 3 1 0 0
#7 4 0 1 0
#8 5 0 0 1
Apparently, there's just one value > 0 in columns B to D, so we can exploit the partial rowSums
for a replicate
call on columns B to D binarized using > 0
. So that we can use this in Map
, we t
ranspose twice. Rest is cosmetics.
t(do.call(cbind, Map(replicate,
rowSums(df[-1]),
as.data.frame(t(cbind(df[1], df[-1] > 0)))))) |>
as.data.frame() |>
setNames(names(df))
# A B C D
# 1 1 0 1 0
# 2 2 0 0 1
# 3 2 0 0 1
# 4 3 1 0 0
# 5 3 1 0 0
# 6 3 1 0 0
# 7 4 0 1 0
# 8 5 0 0 1
Note: R>=4.1 used.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With