
R: Ignore NA values with pmax function

Tags:

r

I'm trying to create a new column with the max values of 3 columns.

Example data:

date          skyc1 skyc2 skyc3 
1995-01-01    0     1     3
1995-01-02    1     null  null
1995-01-03    1     3     null

I would like to get:

date          skyc1 skyc2 skyc3 max
1995-01-01    0     1     3     3
1995-01-02    1     null  null  1
1995-01-03    1     3     null  3

I tried using:

df$max <- pmax(df$skyc1,df$skyc2,df$skyc3)

But I get this:

date          skyc1 skyc2 skyc3 max
1995-01-01    0     1     3     3
1995-01-02    1     null  null  null
1995-01-03    1     3     null  null

Is it possible to treat null as 0? I thought about replacing null with 0, but my dataset also contains values that are genuinely 0...

Thanks

Will_8011 asked Nov 25 '25 19:11



1 Answer

pmax has an na.rm argument, but it only skips real NA values. Here the missing entries are the literal string "null", so the affected columns are character (or factor). Replace "null" with NA first, convert the columns to their proper types with type.convert, and then call pmax with na.rm = TRUE:

df[-1] <- replace(df[-1], df[-1] == "null", NA)
df <- type.convert(df, as.is = TRUE)
df$max <- pmax(df$skyc1, df$skyc2, df$skyc3, na.rm = TRUE)
df$max
#[1] 3 1 3
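
If the data are read from a text file, a possible alternative is to declare "null" as a missing-value marker at read time, so the columns arrive as numeric and no replacement step is needed. The sketch below assumes whitespace-separated input like the table in the question; the inline text and the name df2 are only for illustration and stand in for a real file.

```r
# Hypothetical inline copy of the question's data; in practice, pass a
# file path as the first argument instead of text = ...
txt <- "date skyc1 skyc2 skyc3
1995-01-01 0 1 3
1995-01-02 1 null null
1995-01-03 1 3 null"

# na.strings = "null" makes read.table treat the string "null" as NA,
# so skyc2 and skyc3 are parsed as integer columns rather than character.
df2 <- read.table(text = txt, header = TRUE, na.strings = "null",
                  stringsAsFactors = FALSE)

# With real NAs in place, na.rm = TRUE skips them row by row.
df2$max <- pmax(df2$skyc1, df2$skyc2, df2$skyc3, na.rm = TRUE)
df2$max
#[1] 3 1 3
```

This keeps genuine zeros intact, because no value is ever replaced with 0.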

If there are many 'skyc' columns, the column selection can be automated:

nm1 <- grep('^skyc\\d+$', names(df), value = TRUE)
df$max <- do.call(pmax, c(df[nm1], na.rm = TRUE))
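
One edge case is worth knowing: with na.rm = TRUE, pmax still returns NA for a row in which every selected column is NA. The following self-contained sketch uses a hypothetical data frame named toy (not part of the original data) to show the do.call pattern and that behaviour.

```r
# Hypothetical toy frame: row 3 has no observed value in any skyc column
toy <- data.frame(skyc1 = c(0L, 1L, NA),
                  skyc2 = c(1L, NA, NA),
                  skyc3 = c(3L, NA, NA))

# Pick every column named skyc followed by digits
cols <- grep("^skyc\\d+$", names(toy), value = TRUE)

# c(toy[cols], na.rm = TRUE) is a list of the columns plus the na.rm flag;
# do.call passes it on as pmax(skyc1 = ..., skyc2 = ..., skyc3 = ..., na.rm = TRUE)
toy$max <- do.call(pmax, c(toy[cols], na.rm = TRUE))
toy$max
#[1]  3  1 NA
```

If an all-NA row should map to some other value, that has to be handled in a separate step after pmax.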

data

df <- structure(list(date = c("1995-01-01", "1995-01-02", "1995-01-03"),
    skyc1 = c(0L, 1L, 1L),
    skyc2 = c("1", "null", "3"),
    skyc3 = c("3", "null", "null")),
    class = "data.frame", row.names = c(NA, -3L))
akrun answered Nov 27 '25 08:11



