I have data in which many values of x are split across separate rows only because of differences in case. I would like to merge these rows ignoring case, summing the values in the other columns (y and z).
I have a dataframe like:
x y z
rain 2 40
Rain 4 50
RAIN 7 25
Wind 8 10
Snow 3 9
SNOW 11 25
I want a data frame like:
x y z
Rain 13 115
Wind 8 10
Snow 14 34
First, select the rows you want to merge, then open the Home tab and expand Merge & Centre. From these options, select Merge Cells. A message will pop up saying which values will be kept; click OK.
First of all, create a data frame. Then use the plus sign (+) to add two rows and store the sum in one of them. After that, remove the row that is no longer required by subsetting with single square brackets.
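The manual steps just described can be sketched as follows. This is only an illustration under assumptions: the small data frame and the name num_cols are made up for the example, and the approach needs one such step per pair of rows, so it does not scale the way an aggregate does.

```r
# Hypothetical three-row data frame, a cut-down version of the question's data
df <- data.frame(x = c("rain", "Rain", "Wind"),
                 y = c(2L, 4L, 8L),
                 z = c(40L, 50L, 10L))

# Only the numeric columns can be added; x is left untouched
num_cols <- c("y", "z")

# Add row 2 to row 1 and store the sum in row 1
df[1, num_cols] <- df[1, num_cols] + df[2, num_cols]

# Remove the row that is no longer required, using single square brackets
df <- df[-2, ]
df
```

Row 1 now holds the combined totals for the two rain rows, and the Wind row is unchanged.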
You could convert the first column to lower case and then aggregate.
Option 1: base R's aggregate()
with(df, aggregate(list(y = y, z = z), list(x = tolower(x)), sum))
# x y z
# 1 rain 13 115
# 2 snow 14 34
# 3 wind 8 10
Alternatively, the formula method could also be used.
aggregate(. ~ x, transform(df, x = tolower(x)), sum)
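The desired output in the question shows capitalised labels (Rain, Wind, Snow), whereas the aggregated result is all lower case. As a minimal sketch, assuming plain "first letter upper case" is what is wanted, the labels could be rebuilt after aggregating. The data frame here is re-created with data.frame() purely so the example is self-contained, and res is a name chosen for illustration.

```r
# Example data, re-created so the snippet runs on its own
df <- data.frame(x = c("rain", "Rain", "RAIN", "Wind", "Snow", "SNOW"),
                 y = c(2L, 4L, 7L, 8L, 3L, 11L),
                 z = c(40L, 50L, 25L, 10L, 9L, 25L))

# Aggregate on the lower-cased key, as in the formula method
res <- aggregate(. ~ x, transform(df, x = tolower(x)), sum)

# Capitalise the first letter of each label
lab <- as.character(res$x)
res$x <- paste0(toupper(substring(lab, 1, 1)), substring(lab, 2))
res
```

The rows come back in alphabetical order of the key, not in the order of first appearance.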
Option 2: data.table. This also keeps the order you show in the result.
library(data.table)
as.data.table(df)[, lapply(.SD, sum), by = .(x = tolower(x))]
# x y z
# 1: rain 13 115
# 2: wind 8 10
# 3: snow 14 34
To order the result, use keyby instead of by.
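A short sketch of the keyby variant, assuming the data.table package is installed. The table is built directly with data.table() so the snippet stands alone; dt and res are illustrative names.

```r
library(data.table)

# Example data, re-created so the snippet runs on its own
dt <- data.table(x = c("rain", "Rain", "RAIN", "Wind", "Snow", "SNOW"),
                 y = c(2L, 4L, 7L, 8L, 3L, 11L),
                 z = c(40L, 50L, 25L, 10L, 9L, 25L))

# keyby groups like by, but also sorts the result by the grouping
# column and sets it as the key of the returned data.table
res <- dt[, lapply(.SD, sum), keyby = .(x = tolower(x))]
res
```

With by the groups appear in order of first occurrence (rain, wind, snow); with keyby they are sorted (rain, snow, wind).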
Option 3: base R's xtabs()
xtabs(cbind(y = y, z = z) ~ tolower(x), df)
#
# tolower(x) y z
# rain 13 115
# snow 14 34
# wind 8 10
Note that this returns a table rather than a data frame (probably not what you want, but worth noting), and I have yet to determine how to change the name of the x dimension in the result.
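One possible way to handle both points, offered as a sketch and not as the answer author's own method: the dimension name of a table can be reassigned through names(dimnames()), and a two-dimensional table can be turned into a data frame with as.data.frame.matrix(). The data frame is re-created here so the snippet is self-contained, and tab and res are illustrative names.

```r
# Example data, re-created so the snippet runs on its own
df <- data.frame(x = c("rain", "Rain", "RAIN", "Wind", "Snow", "SNOW"),
                 y = c(2L, 4L, 7L, 8L, 3L, 11L),
                 z = c(40L, 50L, 25L, 10L, 9L, 25L))

tab <- xtabs(cbind(y = y, z = z) ~ tolower(x), df)

# Rename the first dimension from "tolower(x)" to "x"
names(dimnames(tab))[1] <- "x"

# Convert the 2-d table to a data frame; the groups become row names
res <- as.data.frame.matrix(tab)
res
```

The group labels end up as row names, so an extra step would be needed if they should be an ordinary column.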
Data:
df <- structure(list(x = structure(c(1L, 2L, 3L, 6L, 4L, 5L), .Label = c("rain",
"Rain", "RAIN", "Snow", "SNOW", "Wind"), class = "factor"), y = c(2L,
4L, 7L, 8L, 3L, 11L), z = c(40L, 50L, 25L, 10L, 9L, 25L)), .Names = c("x",
"y", "z"), class = "data.frame", row.names = c(NA, -6L))
Try:
library(dplyr)
df %>%
group_by(x = tolower(x)) %>%
summarise_each(funs(sum))
Which gives:
#Source: local data frame [3 x 3]
#
# x y z
# (chr) (int) (int)
#1 rain 13 115
#2 snow 14 34
#3 wind 8 10
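In more recent dplyr releases, summarise_each() and funs() are deprecated. A sketch of the same aggregation using across(), assuming dplyr 1.0.0 or later is installed; the data frame is re-created so the snippet stands alone, and res is an illustrative name.

```r
library(dplyr)

# Example data, re-created so the snippet runs on its own
df <- data.frame(x = c("rain", "Rain", "RAIN", "Wind", "Snow", "SNOW"),
                 y = c(2L, 4L, 7L, 8L, 3L, 11L),
                 z = c(40L, 50L, 25L, 10L, 9L, 25L))

# Group on the lower-cased key, then sum every non-grouping column
res <- df %>%
  group_by(x = tolower(x)) %>%
  summarise(across(everything(), sum))
res
```

Inside across(), everything() skips the grouping column, so only y and z are summed.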