
Tidy way to convert numeric columns from counts to proportions

Tags: r, dplyr, tidyverse

I want to convert only the numeric columns in the dataframe below into rowwise proportions.

df <- data.frame(
  "id" = c("A", "B", "C", "D"),
  "x" = c(1, 2, 3, 4),
  "y" = c(2, 4, 6, 8)
)

So df$x[1] should be converted to 0.3333, df$y[1] to 0.6667, and so on. I want to do this with tidy code, dynamically, without referring to any columns by name, and ignoring any non-numeric columns in the dataframe.

My current attempt, based on reading a number of similar posts, is the following:

df %>%
  mutate_if(is.numeric, . / rowSums(across(where(is.numeric))))

This returns the following error: Error: across() must only be used inside dplyr verbs.

Please help!

asked Nov 29 '22 07:11 by ADF


1 Answer

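Rewrite the call as follows, passing the function as a lambda (note the ~) and computing the row totals with select() instead of across():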

df %>%
  mutate_if(is.numeric, ~ . / rowSums(select(df, where(is.numeric))))

Output:

  id         x         y
1  A 0.3333333 0.6666667
2  B 0.3333333 0.6666667
3  C 0.3333333 0.6666667
4  D 0.3333333 0.6666667
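
As far as I can tell, the original error comes from the expression being passed to mutate_if() without a ~, so it is evaluated as an ordinary R expression outside of any dplyr verb, which is exactly where across() refuses to run. A minimal sketch of an alternative that stays inside the pipe and never refers to df by name, assuming dplyr 1.0.0 or later (the name props is only illustrative, and newer dplyr versions may suggest pick() in place of across() without functions):

```r
library(dplyr)

df <- data.frame(
  id = c("A", "B", "C", "D"),
  x = c(1, 2, 3, 4),
  y = c(2, 4, 6, 8)
)

# across() with only a column selection returns those columns as a data frame.
# Dividing that data frame by the row totals rescales every row to sum to 1,
# and mutate() unpacks the unnamed data-frame result back into x and y.
props <- df %>%
  mutate(across(where(is.numeric)) / rowSums(across(where(is.numeric))))

print(props)
```

This should give the same proportions as the output above while leaving the id column untouched.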

Edit: If you want an answer that doesn't use any additional packages besides dplyr and base, and that can be piped more easily, here's one other (hacky) solution:

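df %>%
  group_by(id) %>%
  # store the row total as character so the numeric-only step below skips it
  mutate(sum = as.character(rowSums(select(cur_data(), where(is.numeric))))) %>%
  # divide each remaining numeric column by the row total
  summarise_if(is.numeric, ~ . / as.numeric(sum))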

The usual dplyr ways of referring to the current data within a function (e.g. cur_data) don't seem to play nicely with rowSums in my original phrasing, so I took a slightly different approach here. There is likely a better way of doing this, though, so I'm open to suggestions.
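
One possible cleaner variant of the same idea, offered as a sketch rather than a definitive answer: keep the row total in a temporary numeric column and exclude it from the selection, which avoids both the grouping and the character round trip. The column name row_total and the result name props are made up for illustration, and this assumes dplyr 1.0.0 or later.

```r
library(dplyr)

df <- data.frame(
  id = c("A", "B", "C", "D"),
  x = c(1, 2, 3, 4),
  y = c(2, 4, 6, 8)
)

props <- df %>%
  # temporary helper column holding the sum of the numeric columns per row
  mutate(row_total = rowSums(across(where(is.numeric)))) %>%
  # divide every numeric column except the helper by the row total
  mutate(across(where(is.numeric) & !row_total, ~ .x / row_total)) %>%
  # drop the helper column again
  select(-row_total)

print(props)
```

Because the totals are computed once before any column is modified, each column is divided by the original row sum. The one trade-off is that the helper column has to be named, although none of the data columns are.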

answered Dec 04 '22 13:12 by Rory S