I have a data.frame like this
x <- data.frame(Category=factor(c("One", "One", "Four", "Two","Two",
"Three", "Two", "Four","Three")),
City=factor(c("D","A","B","B","A","D","A","C","C")),
Frequency=c(10,1,5,2,14,8,20,3,5))
Category City Frequency
1 One D 10
2 One A 1
3 Four B 5
4 Two B 2
5 Two A 14
6 Three D 8
7 Two A 20
8 Four C 3
9 Three C 5
I want to make a pivot table with sum(Frequency) and used the ddply function like this:
ddply(x,.(Category,City),summarize,Total=sum(Frequency))
Category City Total
1 Four B 5
2 Four C 3
3 One A 1
4 One D 10
5 Three C 5
6 Three D 8
7 Two A 34
8 Two B 2
But I need this results sorted by the total in each Category group. Something like this:
Category City Frequency
1 Two A 34
2 Two B 2
3 Three D 14
4 Three C 5
5 One D 10
6 One A 1
7 Four B 5
8 Four C 3
I have looked and tried sort, order, arrange, but nothing seems to do what I need. How can I do this in R?
Here is a base R version, where DF
is the result of your ddply
call:
with(DF, DF[order(-ave(Total, Category, FUN=sum), Category, -Total), ])
produces:
Category City Total
7 Two A 34
8 Two B 2
6 Three D 8
5 Three C 5
4 One D 10
3 One A 1
1 Four B 5
2 Four C 3
The logic is basically the same as David's, calculate the sum of Total
for each Category
, use that number for all rows in each Category
(we do this with ave(..., FUN=sum)
), and then sort by that plus some tie breakers to make sure stuff comes out as expected.
This is a nice question and I can't think of a straight way of doing this rather than creating a total size index and then sorting by it. Here's a possible data.table
approach which uses setorder
function which will order your data by reference
library(data.table)
Res <- setDT(x)[, .(Total = sum(Frequency)), by = .(Category, City)]
setorder(Res[, size := sum(Total), by = Category], -size, -Total, Category)[]
# Category City Total size
# 1: Two A 34 36
# 2: Two B 2 36
# 3: Three D 8 13
# 4: Three C 5 13
# 5: One D 10 11
# 6: One A 1 11
# 7: Four B 5 8
# 8: Four C 3 8
Or if you deep in the Hdleyverse, we can reach a similar result using the newer dplyr
package (as suggested by @akrun)
library(dplyr)
x %>%
group_by(Category, City) %>%
summarise(Total = sum(Frequency)) %>%
mutate(size= sum(Total)) %>%
ungroup %>%
arrange(-size, -Total, Category)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With