I am trying to do something very similar to "Scale relative to a value in each group (via dplyr)" (however, that solution seems to crash R for me). I would like to take a single value from each group and add a new column in which that value is repeated on every row of the group. As an example, I have
library(dplyr)
data = expand.grid(
category = LETTERS[1:2],
year = 2000:2003)
data$value = runif(nrow(data))
data
category year value
1 A 2000 0.6278798
2 B 2000 0.6112281
3 A 2001 0.2170495
4 B 2001 0.6454874
5 A 2002 0.9234604
6 B 2002 0.9311204
7 A 2003 0.5387899
8 B 2003 0.5573527
And I would like a dataframe like
data
category year value value2
1 A 2000 0.6278798 0.6278798
2 B 2000 0.6112281 0.6112281
3 A 2001 0.2170495 0.6278798
4 B 2001 0.6454874 0.6112281
5 A 2002 0.9234604 0.6278798
6 B 2002 0.9311204 0.6112281
7 A 2003 0.5387899 0.6278798
8 B 2003 0.5573527 0.6112281
i.e. value2 for each category is the value from year 2000. I was trying to think of a general solution that extends to an arbitrary filtering criterion, i.e. something like
data %>% group_by(category) %>% mutate(value = filter(data, year==2002))
however this does not work because of incorrect length in the assignment.
Do this:
data %>% group_by(category) %>%
mutate(value2 = value[year == 2000])
You could also do it this way:
data %>% group_by(category) %>%
arrange(year) %>%
mutate(value2 = value[1])
or
data %>% group_by(category) %>%
arrange(year) %>%
mutate(value2 = first(value))
or
data %>% group_by(category) %>%
mutate(value2 = nth(value, n = 1, order_by = year))
or probably several other ways.
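For a more general filtering criterion, one of those other ways is to compute the reference rows with an ordinary filter() and join them back onto the data. This is only a minimal sketch: it assumes the criterion matches exactly one row per category, and the names reference and result are made up for illustration.

```r
library(dplyr)

set.seed(1)  # only so the example is reproducible
data <- expand.grid(category = LETTERS[1:2], year = 2000:2003)
data$value <- runif(nrow(data))

# Pick the reference row of each category with a plain filter(),
# keeping only the key column and the value (renamed to value2)
reference <- data %>%
  filter(year == 2000) %>%
  select(category, value2 = value)

# Join it back so every row carries its category's reference value
result <- data %>%
  left_join(reference, by = "category")

result
```

If the criterion can match several rows in one category, the join duplicates rows, so it is worth checking that reference has one row per category first.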
Your attempt with mutate(value = filter(data, year==2002)) doesn't make sense for a few reasons.
When you explicitly pass in data again, it's not part of the chain that got grouped earlier, so it doesn't know about the grouping.
All dplyr verbs take a data frame as first argument and return a data frame, including filter. When you do value = filter(...) you're trying to assign a full data frame to the single column value.
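To make the two points concrete, here is a small sketch using the example data from the question (the variable names sub and fixed are made up for illustration). It shows that filter() hands back a whole data frame, and how the attempt looks once the column vector is indexed inside the grouped mutate() instead.

```r
library(dplyr)

data <- expand.grid(category = LETTERS[1:2], year = 2000:2003)
data$value <- runif(nrow(data))

# filter() on the ungrouped data returns a data frame, not a vector
sub <- filter(data, year == 2002)
is.data.frame(sub)  # TRUE
dim(sub)            # 2 rows (one per category), 3 columns

# Inside the grouped chain, `value` and `year` refer to the current
# group only, so the logical index picks that group's 2002 value
fixed <- data %>%
  group_by(category) %>%
  mutate(value2 = value[year == 2002]) %>%
  ungroup()

fixed
```

This relies on each category having exactly one row for the chosen year; with zero or several matches the assignment would have the wrong length again.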