Apply function to a row in a data.frame using dplyr

In base R I would do the following:

d <- data.frame(a = 1:4, b = 4:1, c = 2:5)
apply(d, 1, which.max)

With dplyr I could do the following:

library(dplyr)
d %>% mutate(u = purrr::pmap_int(list(a, b, c), function(...) which.max(c(...))))

If d has another column, I need to specify it explicitly, but I want this to work with an arbitrary number of columns.

Conceptually, I’d like something like

pmap_int(list(everything()), ...)
pmap_int(list(.), ...)

But this obviously does not work. How would I solve that canonically with dplyr?

We just need to pass the data as ., because a data.frame is a list with its columns as list elements, which is the input pmap_int expects. Wrapping it as list(.) makes it a nested list instead, so that attempt fails.

library(dplyr)
library(purrr)  # provides pmap_int()
d %>% 
  mutate(u = pmap_int(., ~ which.max(c(...))))
#  a b c u
#1 1 4 2 2
#2 2 3 3 2
#3 3 2 4 3
#4 4 1 5 3
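To see why . works where list(.) does not, here is a minimal base R sketch (no dplyr or purrr needed) that checks that a data frame is itself a list of columns. The helper name row_which_max is only illustrative, and do.call(mapply, ...) is used as a base R analogue of pmap_int, not as a description of what pmap_int does internally.

```r
d <- data.frame(a = 1:4, b = 4:1, c = 2:5)

# A data frame is a list whose elements are its columns.
d_is_list <- is.list(d)        # TRUE
n_cols    <- length(d)         # 3 elements: a, b, c
n_wrapped <- length(list(d))   # 1 element: the whole data frame (nested list)

# Hypothetical helper: position of the largest value among its arguments.
row_which_max <- function(...) which.max(c(...))

# Base R analogue of pmap over the columns: mapply() receives each column
# as a separate argument, so the helper sees one row at a time.
u <- unname(do.call(mapply, c(list(FUN = row_which_max), d)))
u
```

Because the columns are spread out as separate arguments, this works for any number of columns without naming them.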

Or we can use cur_data()

d %>%
   mutate(u = pmap_int(cur_data(), ~ which.max(c(...))))

Or, if we want to use everything(), place it inside select(), because list(everything()) does not specify the data from which the columns should be selected

d %>% 
   mutate(u = pmap_int(select(., everything()), ~ which.max(c(...))))

Or using rowwise()

d %>%
    rowwise() %>% 
    mutate(u = which.max(cur_data())) %>%
    ungroup()
# A tibble: 4 x 4
#      a     b     c     u
#  <int> <int> <int> <int>
#1     1     4     2     2
#2     2     3     3     2
#3     3     2     4     3
#4     4     1     5     3

A more efficient option is base R's max.col, which finds the column index of each row's maximum in a single call

max.col(d, 'first')
#[1] 2 2 3 3
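The 'first' argument matters here: by default max.col breaks ties at random, whereas which.max always returns the first maximum. A small base R sketch of the difference, using the same d (row 2 contains a tie between b and c):

```r
d <- data.frame(a = 1:4, b = 4:1, c = 2:5)

# Row 2 is (2, 3, 3): columns b and c tie for the maximum.
first_max <- max.col(d, ties.method = "first")  # ties go to the leftmost column
last_max  <- max.col(d, ties.method = "last")   # ties go to the rightmost column

# which.max() also returns the first maximum, so "first" reproduces apply().
via_apply <- unname(apply(d, 1, which.max))

first_max
last_max
```

Only row 2 differs between the two tie rules; with the default "random" the result for that row could change from run to run.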

Or with dapply from the collapse package

library(collapse)
dapply(d, which.max, MARGIN = 1)
#[1] 2 2 3 3

The max.col approach can be included in a dplyr pipeline as

d %>% 
    mutate(u = max.col(cur_data(), 'first'))
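Note that cur_data() was deprecated in dplyr 1.1.0 in favor of pick(). A minimal sketch of the same max.col call written with pick(everything()), assuming dplyr 1.1.0 or later is installed:

```r
library(dplyr)

d <- data.frame(a = 1:4, b = 4:1, c = 2:5)

# pick(everything()) returns the current data as a data frame, which
# max.col() converts to a matrix internally.
res <- d %>%
  mutate(u = max.col(pick(everything()), "first"))
res$u
```

The same substitution should apply to the other cur_data() calls above when running on a recent dplyr.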

Here are some data.table options

library(data.table)
setDT(d)[, u := which.max(unlist(.SD)), 1:nrow(d)]

or

setDT(d)[, u := max.col(.SD, "first")]
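One caveat with both calls: .SD contains every column of d, so if u already exists (for instance when the second option is run after the first), u itself takes part in the comparison. A sketch that restricts .SD with .SDcols, assuming data.table is installed; the names dt and cols are illustrative only:

```r
library(data.table)

dt <- data.table(a = 1:4, b = 4:1, c = 2:5)
cols <- c("a", "b", "c")  # illustrative: the columns to compare

# .SDcols limits .SD to the listed columns, so an existing u is ignored.
dt[, u := max.col(.SD, "first"), .SDcols = cols]
first_run <- dt$u

# Running the same statement again gives the same result, because u is
# not part of .SD.
dt[, u := max.col(.SD, "first"), .SDcols = cols]
second_run <- dt$u
```

With this particular d the unrestricted version happens to give the same answer, but that is not guaranteed for other data.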

Tags: r, dplyr, purrr, tidyverse

thothal

2 Answers: akrun, ThomasIsCoding
