Referencing a range of columns in dplyr

Tags: r, filter, dplyr, sum

Let's say I have a data frame df like this:

  txt A1 A2 B1 B2
1 ala  6  9 12 23
2 ata  1  3  3 11
....

I would like to use dplyr to filter rows based on the sum of a range of columns. I tried:

filter(df, sum(A2:B1)>10)

... but it does not work.

Could anyone suggest a solution in dplyr? And yes, I know it can be done differently with simple subsetting.

asked Jun 10 '16 13:06

kwicher

2 Answers

I think the most dplyr-esque way would be:

df %>%
  filter(rowSums(select_(., 'A2:B1')) > 10)

Which gives:

#  txt A1 A2 B1 B2
#1 ala  6  9 12 23
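
On current dplyr releases select_() is deprecated, so the same idea can probably be written with plain select() and an unquoted column range. The sketch below is only an illustration, not part of the original answer; it rebuilds the two rows shown in the question (the elided rows are left out), and the name res is made up for the example:

```r
library(dplyr)

# Only the two rows shown in the question; the elided rows are omitted.
df <- data.frame(
  txt = c("ala", "ata"),
  A1  = c(6, 1),
  A2  = c(9, 3),
  B1  = c(12, 3),
  B2  = c(23, 11),
  stringsAsFactors = FALSE
)

# select() takes the unquoted range A2:B1; rowSums() then yields one total per row.
res <- df %>%
  filter(rowSums(select(., A2:B1)) > 10)

print(res)
```

With these two rows only "ala" is kept, since 9 + 12 = 21 while 3 + 3 = 6. rowSums() returns one total per row, whereas sum() collapses its input to a single number, which is why the attempt in the question cannot filter row by row.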

answered Oct 11 '22 00:10

Steven Beaupré

We can get the column indexes first and then use rowSums:

v1 <- which(names(df) == 'A2') #find first column
#[1] 3
v2 <- which(names(df) == 'B1') #find last column
#[1] 4
df[rowSums(df[v1:v2])>10,]
#  txt A1 A2 B1 B2
#1 ala  6  9 12 23
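
The two lookups above can also be folded into a single match() call. This is a minimal sketch under the assumption that both column names occur exactly once in df; the data frame repeats the two rows shown in the question, and idx, cols and res are names invented for the example:

```r
# Only the two rows shown in the question; the elided rows are omitted.
df <- data.frame(
  txt = c("ala", "ata"),
  A1  = c(6, 1),
  A2  = c(9, 3),
  B1  = c(12, 3),
  B2  = c(23, 11),
  stringsAsFactors = FALSE
)

# Positions of the first and last column of the range, looked up in one call.
idx <- match(c("A2", "B1"), names(df))
stopifnot(!anyNA(idx))  # fail early if either name is missing

# All column positions from A2 through B1.
cols <- seq(idx[1], idx[2])

# Keep the rows whose sum over that range exceeds 10.
res <- df[rowSums(df[cols]) > 10, ]
print(res)
```

The stopifnot() guard is a design choice for the sketch: without it, a misspelled column name would produce NA indexes and a less obvious error further down.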

answered Oct 11 '22 00:10

Sotos
