R: Count the daily number of distinct values of a variable per ID

Tags: r, count

Here is my data:

ID        Date             v
ID1         1              v1
ID1         1              v1
ID1         1              v8
ID1         2              v5
ID1         2              v3
ID1         3              v3
ID2         1              v7
ID2         2              v15
ID2         2              v15
ID2         3              v3
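
For anyone who wants to reproduce this, here is a minimal sketch of the data as a plain data frame. The name df1 is an assumption, chosen to match the object name used in the answer's code; storing Date as a number and v as character is also an assumption.

```r
# Rebuild the example data from the question as a plain data frame.
# The name df1 is an assumption borrowed from the answer's code.
df1 <- data.frame(
  ID   = c(rep("ID1", 6), rep("ID2", 4)),
  Date = c(1, 1, 1, 2, 2, 3, 1, 2, 2, 3),
  v    = c("v1", "v1", "v8", "v5", "v3", "v3", "v7", "v15", "v15", "v3"),
  stringsAsFactors = FALSE
)
str(df1)
```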

I want to count the distinct values of v per day and per ID, with the count shown on the first row of each group and NA on the remaining rows. For the data above, the result should look like:

ID        Date             v         daily_v_distinguish_ID
ID1         1              v1            2
ID1         1              v1            NA
ID1         1              v8            NA
ID1         2              v5            2
ID1         2              v3            NA
ID1         3              v3            1
ID2         1              v7            1
ID2         2              v15           1
ID2         2              v15           NA
ID2         3              v3            1

How can I do that? Thanks in advance!

Also, if I only want the daily number of v per ID (NOT distinct, i.e. counting duplicates too), how should the code change?

The expected result:

ID        Date             v         daily_v_distinguish_ID    daily_v_ID
ID1         1              v1            2                       3
ID1         1              v1            NA                      3
ID1         1              v8            NA                      3
ID1         2              v5            2                       2
ID1         2              v3            NA                      2
ID1         3              v3            1                       1
ID2         1              v7            1                       1
ID2         2              v15           1                       2
ID2         2              v15           NA                      2
ID2         3              v3            1                       1
asked Jul 09 '15 09:07 by velvetrock


1 Answer

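You can try data.table v1.9.5 or later (the devel version at the time of writing), which provides uniqueN(). The original post linked to installation instructions for the devel version.

library(data.table)  # v1.9.5+ for uniqueN()
# Distinct count of v on the first row of each (ID, Date) group, NA on the rest
setDT(df1)[, daily_v_ID := ifelse((1:.N) == 1L, uniqueN(v), NA), by = .(ID, Date)]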

Or

setDT(df1)[,  daily_v_ID := c(uniqueN(v), rep(NA, .N-1)), by = .(ID, Date)]

Or as suggested by @David Arenburg

indx <- setDT(df1)[, .(.I[1L], uniqueN(v)), by = .(ID, Date)] 
df1[indx$V1, daily_v_ID := indx$V2]

Or using dplyr

library(dplyr)
df1 %>% 
  group_by(ID,Date) %>%
  mutate(daily_v_ID= ifelse(row_number()==1, n_distinct(v), NA))

Or with base R

df1$daily_v_ID <- with(df1, ave(as.numeric(factor(v)), Date,ID,
      FUN= function(x) NA^(seq_along(x)!=1)*length(unique(x))))
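
The base R line above is terse, so here is a small sketch of what the NA^(...) part does on a single group. It relies on NA^0 being 1 and NA^1 being NA in R; the vector x and the names mask, trick and explicit are made up for illustration.

```r
# NA^0 is 1 and NA^1 is NA, so NA^(logical) keeps the first position and blanks the rest.
x <- c("v1", "v1", "v8")            # values of v within one (ID, Date) group

mask  <- NA^(seq_along(x) != 1)     # 1 for the first element, NA for the others
trick <- mask * length(unique(x))   # distinct count on the first element only

# A more explicit equivalent: the count first, then NA for the remaining rows.
explicit <- c(length(unique(x)), rep(NA, length(x) - 1))
```

In the ave() call, as.numeric(factor(v)) is there because ave() returns a vector of the same type as its first argument, so v has to be turned into numbers first.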

Update

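For the edited question, the second variable ('daily_v_ID') is the plain row count per group: length(v) in general, or .N in data.table. Note that the earlier snippets used the name 'daily_v_ID' for the distinct count; here that column is called 'daily_v_distinguish_ID'.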

setDT(df1)[, c('daily_v_distinguish_ID', 'daily_v_ID'):= list( c(uniqueN(v),
                  rep(NA, .N-1)), .N), by = .(ID, Date)]
df1
#       ID Date   v daily_v_distinguish_ID daily_v_ID
#  1: ID1    1  v1                      2          3
#  2: ID1    1  v1                     NA          3
#  3: ID1    1  v8                     NA          3
#  4: ID1    2  v5                      2          2
#  5: ID1    2  v3                     NA          2
#  6: ID1    3  v3                      1          1
#  7: ID2    1  v7                      1          1
#  8: ID2    2 v15                      1          2
#  9: ID2    2 v15                     NA          2
# 10: ID2    3  v3                      1          1

NOTE: uniqueN was introduced in v1.9.5. For earlier versions, we can use length(unique(v)).
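
As a hedged sketch of that fallback, the distinct count can be written with length(unique(v)) in place of uniqueN(v). The small table dt below is an illustrative subset of the question's rows, not the full data.

```r
library(data.table)

# Small illustrative table (a subset of the question's rows; dt is a made-up name).
dt <- data.table(
  ID   = c("ID1", "ID1", "ID1", "ID2", "ID2"),
  Date = c(1, 1, 1, 2, 2),
  v    = c("v1", "v1", "v8", "v15", "v15")
)

# length(unique(v)) counts distinct values without needing uniqueN().
# The count goes on the first row of each (ID, Date) group, NA on the rest.
dt[, daily_v_distinguish_ID := c(length(unique(v)), rep(NA, .N - 1)),
   by = .(ID, Date)]
```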

Or using dplyr

df1 %>% 
    group_by(ID, Date) %>%
    mutate(daily_v_distinguish_ID = ifelse(row_number()==1,
                                        n_distinct(v), NA), 
           daily_v_ID =n())
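
The answer gives no base R version for the updated question. One possible sketch, assuming the data is in a data frame called df1 as above, uses ave() twice; the helper vector codes is made up for illustration.

```r
# Example data from the question (df1 is the name assumed throughout the answer).
df1 <- data.frame(
  ID   = c(rep("ID1", 6), rep("ID2", 4)),
  Date = c(1, 1, 1, 2, 2, 3, 1, 2, 2, 3),
  v    = c("v1", "v1", "v8", "v5", "v3", "v3", "v7", "v15", "v15", "v3"),
  stringsAsFactors = FALSE
)

# ave() returns a vector typed like its first argument, so use integer codes for v.
codes <- as.integer(factor(df1$v))

# Distinct count on the first row of each (ID, Date) group, NA on the rest.
df1$daily_v_distinguish_ID <- ave(codes, df1$ID, df1$Date,
  FUN = function(x) ifelse(seq_along(x) == 1L, length(unique(x)), NA_integer_))

# Plain row count per (ID, Date) group, repeated on every row.
df1$daily_v_ID <- ave(codes, df1$ID, df1$Date, FUN = length)
```

Using ifelse() on seq_along(x) rather than rep(NA, length(x) - 1) keeps the helper safe if ave() ever passes it an empty group.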
answered Nov 15 '22 04:11 by akrun