Using by() I can compute statistics for a column, grouped by a factor column.
For instance, if I want the mean ratio of Sepal.Width to Sepal.Length per Species in the iris data frame, I'd do this:
by(iris$Sepal.Width/iris$Sepal.Length, iris$Species, mean)
iris$Species: setosa
[1] 0.6842483
------------------------------------------------------------
iris$Species: versicolor
[1] 0.4676804
------------------------------------------------------------
iris$Species: virginica
[1] 0.4533956
So far so good. Now, the question is: how can I do the same, but for only a subset of levels, e.g. setosa and versicolor only?
I have a complex data.frame with thousands of factors. I am playing a bit with table() in order to create subsets of factors based on different statistics. I would then like to go back to my original data.frame and compute further statistics for my chosen subset of factors.
Thanks
# Keep only the wanted species, drop the now-unused factor levels, then apply by()
with(droplevels(subset(iris, Species %in% c("setosa", "versicolor"))),
     by(Sepal.Width / Sepal.Length, Species, mean))
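The same idea can also be sketched with tapply(), which returns a named array rather than a by object and may be easier to feed into further computations. This is only one possible variant, not part of the original answer; the names keep, sub and ratios are illustrative.

```r
# Levels of interest (illustrative choice; any character vector of levels works)
keep <- c("setosa", "versicolor")

# Subset the rows, then drop the now-empty "virginica" level
sub <- droplevels(iris[iris$Species %in% keep, ])

# Mean of Sepal.Width / Sepal.Length per remaining species
ratios <- tapply(sub$Sepal.Width / sub$Sepal.Length, sub$Species, mean)
print(ratios)
```

droplevels() matters here: without it, the unused virginica level would still appear in the tapply() result, as NA.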