Calculate sum of one column based on another column

Tags:

dataframe

r

subset

I have a data frame:

Y  X1  X2  X3
1   1   0  1
1   0   1  1
0   1   0  1
0   0   0  1
1   1   1  0
0   1   1  0

I want to sum the Y column over the rows where another column equals 1, which is sum(Y=1|Xi=1). For example, for column X1, s1 = sum(Y=1|Xi=1) = 1 + 0 + 1 + 0 = 2:

Y  X1
1   1
0   1
1   1
0   1

For the X2 column, s2 = sum(Y=1|Xi=1) = 1 + 1 + 0 = 2:

Y  X2
1   1
1   1
0   1

For the X3 column, s3 = sum(Y=1|Xi=1) = 1 + 1 + 0 + 0 = 2:

Y  X3
1   1
1   1
0   1
0   1

My rough idea is to use apply(df, 2, sum) over the columns of the data frame, but I don't know how to subset the rows on each Xi and then calculate the sum of Y. Any help is appreciated!

asked Mar 27 '17 21:03

Jassy.W

1 Answer

There are several ways to do this. One is to subset the rows on the column you want and then sum Y:

sum(df[df$X1 == 1, ]$Y)

This should work for you.
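To get the sum for every Xi at once rather than one column at a time, the same subsetting can be wrapped in sapply. This is only a sketch: the data frame is rebuilt from the values posted in the question, and the names x_cols and sums are made up for illustration. It also assumes the indicator columns contain no NA values.

```r
# Data frame rebuilt from the values posted in the question
df <- data.frame(
  Y  = c(1, 1, 0, 0, 1, 0),
  X1 = c(1, 0, 1, 0, 1, 1),
  X2 = c(0, 1, 0, 0, 1, 1),
  X3 = c(1, 1, 1, 1, 0, 0)
)

# Indicator columns: everything except Y (x_cols is an illustrative name)
x_cols <- setdiff(names(df), "Y")

# For each indicator column, sum Y over the rows where that column equals 1
sums <- sapply(df[x_cols], function(x) sum(df$Y[x == 1]))
sums
```

With the data frame as posted, this returns 2 for each of X1, X2 and X3 (the rows where X2 is 1 have Y values 1, 1, 0). Because the Xi columns here are 0/1 indicators, colSums(df$Y * df[x_cols]) should give the same result.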

answered Sep 30 '22 00:09

M--
