Let's say I have an array in this format:
X  Y  Z
A  1  0
A  2  1
B  1  1
B  2  1
B  1  0
I want to find the frequency of X and the frequency of Y given X, then add them to the array as new columns:
X  Y  Z  F(X)  F(Y|X)
A  1  0  2     1
A  2  1  2     1
B  1  1  3     2
B  2  1  3     1
B  1  0  3     2
Here's a data.table way:
library(data.table)
DT <- data.table(dat)
# .N is the number of rows in each group; := adds the column by reference
DT[, nx := .N, by = X][, nxy := .N, by = list(X, Y)]
That last step created the two columns:
DT
#    X Y Z nx nxy
# 1: A 1 0  2   1
# 2: A 2 1  2   1
# 3: B 1 1  3   2
# 4: B 2 1  3   1
# 5: B 1 0  3   2
And it could have been written in two lines instead of one:
DT[,nx:=.N,by=X]
DT[,nxy:=.N,by=list(X,Y)]
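For anyone who wants to run this end to end, here is a minimal self-contained sketch. The construction of dat is an assumption (the question only shows the printed table), and it assumes the data.table package is installed; the column names nx and nxy follow the answer above.

```r
library(data.table)

# Assumed reconstruction of the example data shown in the question
dat <- data.frame(
  X = c("A", "A", "B", "B", "B"),
  Y = c(1, 2, 1, 2, 1),
  Z = c(0, 1, 1, 1, 0)
)

DT <- as.data.table(dat)

# .N is the number of rows in the current group
DT[, nx := .N, by = X]          # rows per value of X
DT[, nxy := .N, by = .(X, Y)]   # rows per (X, Y) combination

print(DT)
```

Here .(X, Y) is data.table shorthand for list(X, Y), so either spelling should give the same grouping.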
Using ave and assuming your data is dat:
dat$Fx <-  with(dat,ave(Y,list(X),FUN=length))
dat$Fyx <- with(dat,ave(Y,list(X,Y),FUN=length))
Result:
  X Y Z Fx Fyx
1 A 1 0  2   1
2 A 2 1  2   1
3 B 1 1  3   2
4 B 2 1  3   1
5 B 1 0  3   2
If the data doesn't have a numeric column for ave to work on, then:
dat$Fx <-  with(dat,ave(seq_len(nrow(dat)),list(X),FUN=length))
dat$Fyx <- with(dat,ave(seq_len(nrow(dat)),list(X,Y),FUN=length))
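As a rough illustration of the row-index variant, here is a self-contained sketch on made-up data where neither column is numeric. The values in dat are hypothetical and only chosen to mirror the grouping pattern of the question; a row index serves as the numeric stand-in that ave counts over.

```r
# Hypothetical data with no numeric column (values are illustrative only)
dat <- data.frame(
  X = c("A", "A", "B", "B", "B"),
  Y = c("p", "q", "p", "q", "p"),
  stringsAsFactors = FALSE
)

# A row index gives ave a numeric vector to split and measure
idx <- seq_len(nrow(dat))

dat$Fx  <- ave(idx, dat$X, FUN = length)          # count of rows per X
dat$Fyx <- ave(idx, dat$X, dat$Y, FUN = length)   # count of rows per (X, Y)

print(dat)
```

The grouping vectors are passed directly rather than wrapped in list(), which should be equivalent here; either form groups by the combination of the supplied columns.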