Here is my data:
ID Date v
ID1 1 v1
ID1 1 v1
ID1 1 v8
ID1 2 v5
ID1 2 v3
ID1 3 v3
ID2 1 v7
ID2 2 v15
ID2 2 v15
ID2 3 v3
I want to calculate the number of distinct values of v per day and per ID. For the data above, I want to get a result like:
ID Date v daily_v_distinguish_ID
ID1 1 v1 2
ID1 1 v1 NA
ID1 1 v8 NA
ID1 2 v5 2
ID1 2 v3 NA
ID1 3 v3 1
ID2 1 v7 1
ID2 2 v15 1
ID2 2 v15 NA
ID2 3 v3 1
How can I do that? Thank you in advance!
Then, if I only want the daily number of v values (NOT distinct) per ID, how should I change the code?
The expected result:
ID Date v daily_v_distinguish_ID daily_v_ID
ID1 1 v1 2 3
ID1 1 v1 NA 3
ID1 1 v8 NA 3
ID1 2 v5 2 2
ID1 2 v3 NA 2
ID1 3 v3 1 1
ID2 1 v7 1 1
ID2 2 v15 1 2
ID2 2 v15 NA 2
ID2 3 v3 1 1
Just use table(Data$ID), or as.data.frame(table(Data$ID)) if you want a data.frame back.
Method 2: Using sum() in R. The sum() function adds up the values passed to it. Here, a logical expression is passed as the argument, so sum() counts the elements that are equal to the specified value.
The ncol() function in R returns the total number of columns present in an object.
You can try using the devel version of data.table, i.e. v1.9.5. Instructions to install the devel version are here
library(data.table)#v1.9.5+
setDT(df1)[, daily_v_ID:= ifelse((1:.N)==1L, uniqueN(v), NA) , by = .(ID, Date)]
Or
setDT(df1)[, daily_v_ID := c(uniqueN(v), rep(NA, .N-1)), by = .(ID, Date)]
Or as suggested by @David Arenburg
indx <- setDT(df1)[, .(.I[1L], uniqueN(v)), by = .(ID, Date)]
df1[indx$V1, daily_v_ID := indx$V2]
Or using dplyr
library(dplyr)
df1 %>%
  group_by(ID, Date) %>%
  mutate(daily_v_ID = ifelse(row_number() == 1, n_distinct(v), NA))
Or with base R
df1$daily_v_ID <- with(df1, ave(as.numeric(factor(v)), Date,ID,
FUN= function(x) NA^(seq_along(x)!=1)*length(unique(x))))
For the edited post, we create a variable ('daily_v_ID') by getting length(v), or in data.table we can use .N
setDT(df1)[, c('daily_v_distinguish_ID', 'daily_v_ID'):= list( c(uniqueN(v),
rep(NA, .N-1)), .N), by = .(ID, Date)]
df1
# ID Date v daily_v_distinguish_ID daily_v_ID
# 1: ID1 1 v1 2 3
# 2: ID1 1 v1 NA 3
# 3: ID1 1 v8 NA 3
# 4: ID1 2 v5 2 2
# 5: ID1 2 v3 NA 2
# 6: ID1 3 v3 1 1
# 7: ID2 1 v7 1 1
# 8: ID2 2 v15 1 2
# 9: ID2 2 v15 NA 2
# 10: ID2 3 v3 1 1
NOTE: uniqueN was introduced in v1.9.5. For earlier versions, we can use length(unique(v))
Or using dplyr
df1 %>%
  group_by(ID, Date) %>%
  mutate(daily_v_distinguish_ID = ifelse(row_number() == 1,
                                         n_distinct(v), NA),
         daily_v_ID = n())
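The snippets above assume a data frame named df1 already exists. As a minimal sketch for trying them out, the question's sample data can be rebuilt by hand and both columns computed with base R only. The helper names v_code and grp are illustrative and not part of the original answer, and ave() is just one way to do the grouping.

```r
# Rebuild the sample data from the question; df1 is the name the answer uses.
df1 <- data.frame(
  ID   = c("ID1", "ID1", "ID1", "ID1", "ID1", "ID1", "ID2", "ID2", "ID2", "ID2"),
  Date = c(1, 1, 1, 2, 2, 3, 1, 2, 2, 3),
  v    = c("v1", "v1", "v8", "v5", "v3", "v3", "v7", "v15", "v15", "v3"),
  stringsAsFactors = FALSE
)

# Illustrative helpers: integer codes for v (ave() needs a numeric vector)
# and one grouping factor that only contains (ID, Date) pairs present in the data.
v_code <- as.integer(factor(df1$v))
grp    <- interaction(df1$ID, df1$Date, drop = TRUE)

# Distinct count of v on the first row of each group, NA on the remaining rows.
df1$daily_v_distinguish_ID <- ave(
  v_code, grp,
  FUN = function(x) c(length(unique(x)), rep(NA, length(x) - 1))
)

# Plain row count per group, repeated on every row of the group.
df1$daily_v_ID <- ave(v_code, grp, FUN = length)

print(df1)
```

Building a single grouping factor with drop = TRUE keeps the helper function from ever being called on an empty group, which would otherwise make rep(NA, length(x) - 1) fail when some ID/Date combination is missing from the data.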