Let's say I have an array in this format:
X Y Z
A 1 0
A 2 1
B 1 1
B 2 1
B 1 0
I want to find the frequency of X and the frequency of Y given X, then add them to the array as new columns:
X Y Z F(X) F(Y|X)
A 1 0 2 1
A 2 1 2 1
B 1 1 3 2
B 2 1 3 1
B 1 0 3 2
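To try this out, the example input can be rebuilt as a data frame. This is only a sketch: the name `dat` and the column types (character X, numeric Y and Z) are assumptions, since the question shows just the printed values.

```r
# Example input from the question.
# The object name `dat` and the column types are assumptions.
dat <- data.frame(
  X = c("A", "A", "B", "B", "B"),
  Y = c(1, 2, 1, 2, 1),
  Z = c(0, 1, 1, 1, 0),
  stringsAsFactors = FALSE
)
dat
```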
Here's a data.table way:
library(data.table)
DT <- data.table(dat)
# nx = number of rows per X; nxy = number of rows per (X, Y) pair
DT[, nx := .N, by = X][, nxy := .N, by = list(X, Y)]
That last step created the two columns:
DT
# X Y Z nx nxy
# 1: A 1 0 2 1
# 2: A 2 1 2 1
# 3: B 1 1 3 2
# 4: B 2 1 3 1
# 5: B 1 0 3 2
And it could have been written in two lines instead of one:
DT[,nx:=.N,by=X]
DT[,nxy:=.N,by=list(X,Y)]
Using ave, and assuming your data is in dat:
dat$Fx <- with(dat,ave(Y,list(X),FUN=length))
dat$Fyx <- with(dat,ave(Y,list(X,Y),FUN=length))
Result:
X Y Z Fx Fyx
1 A 1 0 2 1
2 A 2 1 2 1
3 B 1 1 3 2
4 B 2 1 3 1
5 B 1 0 3 2
If the data doesn't have a numeric column for ave to work on, then:
dat$Fx <- with(dat,ave(seq_len(nrow(dat)),list(X),FUN=length))
dat$Fyx <- with(dat,ave(seq_len(nrow(dat)),list(X,Y),FUN=length))
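As a rough illustration of that last case, here is a small sketch with a hypothetical data frame `chr` whose columns are all character (the name and values are made up for the example). ave() writes the result of FUN back into a copy of its first argument, so passing a character column would, as far as I can tell, return the counts as strings; row indices avoid that because only the group sizes matter.

```r
# Hypothetical data with no numeric column: X and Y are both character
chr <- data.frame(
  X = c("A", "A", "B", "B", "B"),
  Y = c("p", "q", "p", "q", "p"),
  stringsAsFactors = FALSE
)

# Row indices stand in for the numeric vector that ave() fills in;
# length() then returns the size of each group
idx <- seq_len(nrow(chr))
chr$Fx  <- ave(idx, chr$X, FUN = length)         # rows per X
chr$Fyx <- ave(idx, chr$X, chr$Y, FUN = length)  # rows per (X, Y) pair
chr
```

Here the grouping variables are passed to ave() as separate arguments rather than wrapped in list(); both forms should group the same way.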