
Divide all elements in row with the max value in row - Faster approach

Tags: r

I need to scale a data frame.
The process I need to follow is this:

Divide all elements in a row by the maximum value in that row, unless the row contains the number 1.

I use this approach:

post_df <- df # original dataframe
for(i in 1:nrow(df)){
    if (! 1 %in% df[i,]) {
        post_df[i,] <- df[i,]/max(df[i,])
    }
}

I was wondering whether there is a faster approach that would cut the run time by some seconds, because I run this on a big data frame (86,000 rows × 500 columns).

Example:

5 rows, 5 cols

Row 1: Divide all elements by 0.7
Row 2: Divide all elements by 0.4
Row 3: Ignore
Row 4: Ignore
Row 5: Ignore
Panos Kal. asked Oct 26 '17 11:10

1 Answer

Based on the description, we only need to scale the rows that don't contain a 1. Create a logical index ('i1') with rowSums, subset the dataset with 'i1', get the max of each row of that subset with pmax, divide the subset by those maxima, and assign the result back to the subset.

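# TRUE for rows that do not contain a 1
i1 <- rowSums(df == 1) == 0
# pmax returns one maximum per selected row; divide each row by its maximum
df[i1, ] <- df[i1, ] / do.call(pmax, df[i1, ])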

data

set.seed(24)
df <- as.data.frame(matrix(sample(1:8, 10*5, replace = TRUE), ncol=5))
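
As a quick sanity check, here is a minimal sketch that compares the vectorised approach with a row-by-row loop in the spirit of the question. The small data frame and the names `loop_df` and `vec_df` are made up for illustration and are not from the original post.

```r
# Hypothetical 4-row data set; the values are made up for illustration.
df <- data.frame(
  a = c(0.2, 0.1, 1.0, 0.5),
  b = c(0.7, 0.4, 0.3, 0.25),
  c = c(0.35, 0.2, 0.6, 0.5)
)

# Reference result: a row-by-row loop, as in the question.
loop_df <- df
for (i in seq_len(nrow(df))) {
  row_i <- unlist(df[i, ])            # row i as a plain numeric vector
  if (!(1 %in% row_i)) {
    loop_df[i, ] <- df[i, ] / max(row_i)
  }
}

# Vectorised version: i1 is TRUE for rows that contain no 1.
i1 <- rowSums(df == 1) == 0
vec_df <- df
# do.call(pmax, ...) passes each column as an argument to pmax,
# which returns one maximum per row of the subset.
vec_df[i1, ] <- df[i1, ] / do.call(pmax, df[i1, ])

# Both approaches should agree (up to floating-point tolerance).
isTRUE(all.equal(loop_df, vec_df))
```

Here rows 1, 2 and 4 are divided by their row maxima (0.7, 0.4 and 0.5), while row 3 is left untouched because it contains a 1.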
akrun answered Nov 02 '22 19:11