Using dplyr, you can do something like this:
    iris %>% head %>% mutate(sum=Sepal.Length + Sepal.Width)
      Sepal.Length Sepal.Width Petal.Length Petal.Width Species sum
    1          5.1         3.5          1.4         0.2  setosa 8.6
    2          4.9         3.0          1.4         0.2  setosa 7.9
    3          4.7         3.2          1.3         0.2  setosa 7.9
    4          4.6         3.1          1.5         0.2  setosa 7.7
    5          5.0         3.6          1.4         0.2  setosa 8.6
    6          5.4         3.9          1.7         0.4  setosa 9.3
But above, I referenced the columns by their names. How can I use 1 and 2, the column indices, to achieve the same result?
Here I have the following, but I feel it's not as elegant.
    iris %>% head %>% mutate(sum=apply(select(.,1,2),1,sum))
      Sepal.Length Sepal.Width Petal.Length Petal.Width Species sum
    1          5.1         3.5          1.4         0.2  setosa 8.6
    2          4.9         3.0          1.4         0.2  setosa 7.9
    3          4.7         3.2          1.3         0.2  setosa 7.9
    4          4.6         3.1          1.5         0.2  setosa 7.7
    5          5.0         3.6          1.4         0.2  setosa 8.6
    6          5.4         3.9          1.7         0.4  setosa 9.3
To select a column in R you can use brackets: for example, YourDataFrame['Column'] takes the column named "Column". We can also use dplyr's select() function to get columns by name or by index. For instance, select(YourDataFrame, c('A', 'B')) takes the columns named "A" and "B" from the data frame.
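As a minimal sketch of the selection styles described above, the following uses the first rows of the built-in iris data in place of YourDataFrame. The variable names (df, by_bracket, and so on) are made up for illustration.

```r
library(dplyr)

# Illustrative data: the first six rows of the built-in iris dataset
df <- head(iris)

# Single-bracket indexing by name keeps the result a one-column data frame
by_bracket <- df["Sepal.Length"]

# select() by column name
by_name <- select(df, c("Sepal.Length", "Sepal.Width"))

# select() by column position
by_index <- select(df, 1, 2)

# Both select() calls pick the same two columns
names(by_name)
names(by_index)
```

Note that df["Sepal.Length"] returns a data frame, while df[["Sepal.Length"]] returns the underlying vector, which is the form you need for arithmetic inside mutate().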
mutate() is a dplyr function that adds new variables and preserves existing ones, as its documentation puts it. So when you want to add a new variable or change one already in the dataset, mutate() is the tool to reach for. Given a dataset df, we can easily add columns with calculations.
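A short sketch of both uses of mutate(), assuming df is simply the first rows of iris (the text above does not say what df holds):

```r
library(dplyr)

# Assumption: df stands in for "our dataset"; here it is head(iris)
df <- head(iris)

df2 <- df %>%
  mutate(
    sum = Sepal.Length + Sepal.Width,   # adds a new column
    Petal.Width = Petal.Width * 10      # overwrites an existing column
  )

# The original columns are kept; only `sum` is new
names(df2)
```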
Add a column to a dataframe in R using dplyr. In my opinion, the best way to add a column to a dataframe in R is with the mutate() function from dplyr.
You can try:
    iris %>% head %>% mutate(sum = .[[1]] + .[[2]])
      Sepal.Length Sepal.Width Petal.Length Petal.Width Species sum
    1          5.1         3.5          1.4         0.2  setosa 8.6
    2          4.9         3.0          1.4         0.2  setosa 7.9
    3          4.7         3.2          1.3         0.2  setosa 7.9
    4          4.6         3.1          1.5         0.2  setosa 7.7
    5          5.0         3.6          1.4         0.2  setosa 8.6
    6          5.4         3.9          1.7         0.4  setosa 9.3
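If more than two columns need to be summed by position, writing out .[[i]] for each one gets tedious. One possible variation, which is not part of the answer above, is base R's rowSums() applied to a positional subset of the piped data frame. This is a sketch; it relies on the magrittr pipe (%>%), where . refers to the data frame on the left-hand side.

```r
library(dplyr)

# Positional subset of the piped data frame, summed row by row
by_rowsums <- iris %>%
  head %>%
  mutate(sum = rowSums(.[1:2]))

# Same calculation using column names, for comparison
by_name <- iris %>%
  head %>%
  mutate(sum = Sepal.Length + Sepal.Width)

all.equal(by_rowsums$sum, by_name$sum)
```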
I'm a bit late to the game, but my personal strategy in cases like this is to write my own tidyverse-compliant function that will do exactly what I want. By tidyverse-compliant, I mean that the first argument of the function is a data frame and that the output is a vector that can be added to the data frame.
    sum_cols <- function(x, col1, col2){
      x[[col1]] + x[[col2]]
    }

    iris %>% head %>% mutate(sum = sum_cols(x = ., col1 = 1, col2 = 2))
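Because [[ accepts either a position or a name, the same helper works with both. The check below is a small sketch that repeats the sum_cols() definition from above so it runs on its own; by_index and by_name are illustrative variable names.

```r
library(dplyr)

# Helper from the answer above: adds two columns of a data frame,
# each identified by position or by name
sum_cols <- function(x, col1, col2) {
  x[[col1]] + x[[col2]]
}

# Columns referenced by index
by_index <- iris %>% head %>% mutate(sum = sum_cols(x = ., col1 = 1, col2 = 2))

# Columns referenced by name
by_name <- iris %>% head %>%
  mutate(sum = sum_cols(x = ., col1 = "Sepal.Length", col2 = "Sepal.Width"))

identical(by_index, by_name)
```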