How to cumulatively add values in one vector in R

I have a data set that looks like this:

id  name    year    job    job2
1   Jane    1980    Worker  0
1   Jane    1981    Manager 1
1   Jane    1982    Manager 1
1   Jane    1983    Manager 1
1   Jane    1984    Manager 1
1   Jane    1985    Manager 1
1   Jane    1986    Boss    0
1   Jane    1987    Boss    0
2   Bob     1985    Worker  0
2   Bob     1986    Worker  0
2   Bob     1987    Manager 1
2   Bob     1988    Boss    0
2   Bob     1989    Boss    0
2   Bob     1990    Boss    0
2   Bob     1991    Boss    0
2   Bob     1992    Boss    0

Here, job2 is a dummy variable indicating whether the person was a Manager in that year. I want to do two things to this data set. First, of the Boss rows, I want to keep only the row for the year in which the person first became Boss. Second, I would like to count the cumulative years a person has worked as a Manager and store this in the variable cumu_job2. Thus I would like to have:

id  name    year    job    job2 cumu_job2
1   Jane    1980    Worker  0   0
1   Jane    1981    Manager 1   1
1   Jane    1982    Manager 1   2
1   Jane    1983    Manager 1   3
1   Jane    1984    Manager 1   4
1   Jane    1985    Manager 1   5
1   Jane    1986    Boss    0   0
2   Bob     1985    Worker  0   0
2   Bob     1986    Worker  0   0
2   Bob     1987    Manager 1   1
2   Bob     1988    Boss    0   0

I have changed my example and included the Worker position because this better reflects what I want to do with the original data set. The answers in this thread only work when the data set contains only Managers and Bosses, so any suggestions for making this work would be great. I would be very grateful!

song0089 asked Jan 29 '14 02:01


1 Answer

Here is a succinct dplyr solution to this problem.

NOTE: Make sure that stringsAsFactors = FALSE while reading in the data.
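For anyone who wants to reproduce this without a file, here is a minimal sketch that builds the question's data directly. The name dat is an assumption chosen to match the code in this answer, and the columns are typed by hand rather than read from disk.

```r
# Rebuild the example data from the question (assumed to be named `dat`).
dat <- data.frame(
  id   = rep(c(1, 2), each = 8),
  name = rep(c("Jane", "Bob"), each = 8),
  year = c(1980:1987, 1985:1992),
  job  = c("Worker", rep("Manager", 5), rep("Boss", 2),   # Jane
           rep("Worker", 2), "Manager", rep("Boss", 5)),  # Bob
  job2 = c(0, rep(1, 5), rep(0, 2),                       # Jane
           rep(0, 2), 1, rep(0, 5)),                      # Bob
  stringsAsFactors = FALSE  # keep name and job as character columns
)

str(dat)
```

Setting stringsAsFactors = FALSE explicitly keeps job as a character column regardless of the R version's default.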

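library(dplyr)

dat %>%
  group_by(name, job) %>%                        # one group per person and job
  filter(job != "Boss" | year == min(year)) %>%  # of the Boss rows, keep only the first year
  mutate(cumu_job2 = cumsum(job2))               # running sum of job2 within each group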

Output:

   id name year     job job2 cumu_job2
1   1 Jane 1980  Worker    0         0
2   1 Jane 1981 Manager    1         1
3   1 Jane 1982 Manager    1         2
4   1 Jane 1983 Manager    1         3
5   1 Jane 1984 Manager    1         4
6   1 Jane 1985 Manager    1         5
7   1 Jane 1986    Boss    0         0
8   2  Bob 1985  Worker    0         0
9   2  Bob 1986  Worker    0         0
10  2  Bob 1987 Manager    1         1
11  2  Bob 1988    Boss    0         0

Explanation

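  1. Take the data set.
  2. Group by name and job.
  3. Within each group, keep every non-Boss row; in a Boss group, keep only the row with the earliest year.
  4. Add the cumu_job2 column as the running sum of job2 within each group.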
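
The same steps can also be sketched in base R, which may help show what the grouping does. This is an illustrative alternative, not part of the dplyr solution: it assumes the rows are already sorted by year within each person, and the names keep and res are made up for the example.

```r
# Example data from the question, rebuilt so the sketch is self-contained.
dat <- data.frame(
  id   = rep(c(1, 2), each = 8),
  name = rep(c("Jane", "Bob"), each = 8),
  year = c(1980:1987, 1985:1992),
  job  = c("Worker", rep("Manager", 5), rep("Boss", 2),
           rep("Worker", 2), "Manager", rep("Boss", 5)),
  job2 = c(0, rep(1, 5), rep(0, 2), rep(0, 2), 1, rep(0, 5)),
  stringsAsFactors = FALSE
)

# Steps 2-3: keep all non-Boss rows, plus the first row of each (name, job)
# pair. Because rows are assumed sorted by year, "first row" means the
# earliest year, i.e. the year the person first became Boss.
keep <- dat$job != "Boss" | !duplicated(dat[c("name", "job")])
res  <- dat[keep, ]

# Step 4: running sum of job2 within each (name, job) group.
res$cumu_job2 <- ave(res$job2, res$name, res$job, FUN = cumsum)

res
```

Because the running sum is computed per name and job, it restarts at the Boss row, which is why cumu_job2 is 0 there rather than carrying over the Manager total.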

Ramnath answered Sep 28 '22 09:09