Context: I want to add a cumulative sum column to my tibble named words_uni. I used mutate from library(dplyr). I am working with R version 3.4.1 (64-bit) on Windows 10 and RStudio version 1.0.143.
> head(words_uni)
# A tibble: 6 x 3
# Groups: Type [6]
Type Freq per
<chr> <int> <dbl>
1 the 937839 0.010725848
2 i 918552 0.010505267
3 to 788892 0.009022376
4 a 615082 0.007034551
Then I did the following:
> words_uni1 = words_uni %>%
mutate( acum= cumsum(per))
> head(words_uni1)
# A tibble: 6 x 4
# Groups: Type [6]
Type Freq per acum
<chr> <int> <dbl> <dbl>
1 the 937839 0.010725848 0.010725848
2 i 918552 0.010505267 0.010505267
3 to 788892 0.009022376 0.009022376
4 a 615082 0.007034551 0.007034551
Problem: acum just repeats the values of per instead of accumulating them, which is not what I was expecting, and I cannot see why.
I would appreciate your comments. Thanks in advance.
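You must have previously grouped the tibble by Type (note the "# Groups: Type [6]" line in your output). Because of that, mutate computes cumsum separately within each group, and since every group holds a single row, acum simply equals per.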
Here is some reproducible code:
library(readr)
library(dplyr)
x <- read_csv("type, freq, per
the, 937839, 0.010725848
i, 918552, 0.010505267
to, 788892, 0.009022376
a, 615082, 0.007034551")
### ungrouped tibble, desired results
x %>% mutate(acum = cumsum(per))
# A tibble: 4 x 4
type freq per acum
<chr> <int> <dbl> <dbl>
1 the 937839 0.010725848 0.01072585
2 i 918552 0.010505267 0.02123112
3 to 788892 0.009022376 0.03025349
4 a 615082 0.007034551 0.03728804
### grouped tibble
x %>% group_by(type) %>% mutate(acum = cumsum(per))
# A tibble: 4 x 4
# Groups: type [4]
type freq per acum
<chr> <int> <dbl> <dbl>
1 the 937839 0.010725848 0.010725848
2 i 918552 0.010505267 0.010505267
3 to 788892 0.009022376 0.009022376
4 a 615082 0.007034551 0.007034551
You simply need to ungroup your data:
words_uni %>% ungroup() %>% mutate(acum = cumsum(per))
That should do the trick.
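If you are not sure whether a tibble is still grouped, you can check before calling mutate. The following is a minimal sketch, not your actual data: it assumes a stand-in words_uni built from the four rows shown in the question and grouped by Type to mimic your situation.

```r
library(dplyr)

# Hypothetical stand-in for words_uni, built from the four rows in the question
# and grouped by Type to reproduce the problem
words_uni <- data.frame(
  Type = c("the", "i", "to", "a"),
  Freq = c(937839L, 918552L, 788892L, 615082L),
  per  = c(0.010725848, 0.010505267, 0.009022376, 0.007034551),
  stringsAsFactors = FALSE
) %>%
  group_by(Type)

# Inspect the grouping before mutating
is_grouped_df(words_uni)   # TRUE while the tibble is still grouped
group_vars(words_uni)      # names of the grouping columns

# Drop the grouping, then accumulate over all rows
words_uni1 <- words_uni %>%
  ungroup() %>%
  mutate(acum = cumsum(per))

words_uni1
```

After ungroup(), the "# Groups:" line no longer appears in the printed header, and acum accumulates down the rows in their current order.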