Let's say I have
library(data.table)
az <- data.table(a = 1:6, b = 6:1, c = 4)
az[b == 4, c := NA]
az
   a b  c
1: 1 6  4
2: 2 5  4
3: 3 4 NA
4: 4 3  4
5: 5 2  4
6: 6 1  4
I can get the sum of all the columns with
az[,lapply(.SD,sum)]
    a  b  c
1: 21 21 NA
This is what I want for columns a and b, but c is NA. This is seemingly easy enough to fix by doing
az[,lapply(na.omit(.SD),sum)]
    a  b  c
1: 18 17 20
This is what I want for c, but I didn't want to omit the values of a and b in the row where c is NA. This is a contrived example; in my real data there could be 1000+ columns with random NAs throughout. Is there a way to get na.omit or something else to act per column instead of on the whole table, without relying on looping through each column as a vector?
Expanding on my comment:
Many base functions allow you to decide how to treat NA. For example, sum has the argument na.rm:
az[,lapply(.SD,sum,na.rm=TRUE)]
In general, you can also use the function na.omit on each vector individually:
az[,lapply(.SD,function(x) sum(na.omit(x)))]
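To check that the two approaches agree on the example from the question, here is a small self-contained sketch. The names res_rm and res_omit are only illustrative and do not appear in the original post; the sketch assumes the data.table package is installed.

```r
library(data.table)

# Rebuild the example table from the question
az <- data.table(a = 1:6, b = 6:1, c = 4)
az[b == 4, c := NA]

# Approach 1: let sum() drop NAs itself, column by column
res_rm <- az[, lapply(.SD, sum, na.rm = TRUE)]

# Approach 2: strip NAs from each column vector before summing
res_omit <- az[, lapply(.SD, function(x) sum(na.omit(x)))]

print(res_rm)
# a = 21, b = 21, c = 20

# Both approaches should give the same one-row result
stopifnot(isTRUE(all.equal(res_rm, res_omit)))
```

Because lapply hands each column of .SD to the function separately, an NA in c only affects the sum of c, and the totals for a and b keep all six rows.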