My data are ordered observations and I want to keep the ordering as much as possible while doing manipulations.
Take the answer for this question, I put "B" ahead of "A" in the dataframe. The resulting wide data are sorted by the column "name", i.e., "A" first, then "B".
df = data.frame(name=c("B","B","A","A"),
group=c("g1","g2","g1","g2"),
V1=c(10,40,20,30),
V2=c(6,3,1,7))
gather(df, Var, Val, V1:V2) %>%
unite(VarG, Var, group) %>%
spread(VarG, Val)
name V1_g1 V1_g2 V2_g1 V2_g2
1 A 20 30 1 7
2 B 10 40 6 3
Is there a way to keep the original ordering? like this:
name V1_g1 V1_g2 V2_g1 V2_g2
1 B 10 40 6 3
2 A 20 30 1 7
04/02 edit: I've just found the dplyr::summarise
does sorting as well. arrange(name, df$name)
still works to restore the order. But I wonder if the extra sorting is necessary from the design of the packages?
df %>%
group_by(name) %>%
summarise(n()) %>%
name n()
1 A 2
2 B 2
You can sort by name based on the order in the original data frame:
gather(df, Var, Val, V1:V2) %>%
unite(VarG, Var, group) %>%
spread(VarG, Val) %>%
arrange( order(match(name, df$name)))
# name V1_g1 V1_g2 V2_g1 V2_g2
# 1 B 10 40 6 3
# 2 A 20 30 1 7
The order is taken from the order of the factor levels.
str(df)
'data.frame': 4 obs. of 4 variables:
$ name : Factor w/ 2 levels "A","B": 2 2 1 1
$ group: Factor w/ 2 levels "g1","g2": 1 2 1 2
$ V1 : num 10 40 20 30
$ V2 : num 6 3 1 7
See that the levels are "A","B".
So if you set the order of the levels to the order they are shown in it will work:
df = data.frame(name=c("B","B","A","A"),
group=c("g1","g2","g1","g2"),
V1=c(10,40,20,30),
V2=c(6,3,1,7))
df %>%
mutate(name = factor(name,levels=unique(name))) %>%
mutate(group = factor(group,levels=unique(group))) %>%
gather(Var, Val, V1:V2) %>%
unite(VarG, Var, group) %>%
spread(VarG, Val)
Results in:
name V1_g1 V1_g2 V2_g1 V2_g2
1 B 10 40 6 3
2 A 20 30 1 7
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With