I'm working through the examples in Kruschke's Doing Bayesian Data Analysis and need a bit of help understanding how to get data into the format that his code examples require. In chapter 22 he has a table like this:
         Blue Brown Green Hazel
Black      20    68     5    15
Blond      94     7    16    10
Brunette   84   119    29    54
Red        17    26    14    14
I'm comfortable with inputting the table into R, either by entering it into a spreadsheet and using read.table("clipboard", header=T, sep="\t"), or by typing it into R like this:
con.table2 <- matrix(c(20,68,5,15,94,7,16,10,84,119,29,54,17,26,14,14),nrow=4,byrow=TRUE)
dimnames(con.table2) <- list(c("Black","Blond","Brunette","Red"),c("Blue","Brown","Green","Hazel"))
But in his code he presents this table as follows, ready for analysis (full code is at http://www.indiana.edu/~kruschke/DoingBayesianDataAnalysis/Programs/PoissonExponentialJagsSTZ.R):
Freq = c(68,119,26,7,20,84,17,94,15,54,14,10,5,29,14,16)
Eye = c("Brown","Brown","Brown","Brown","Blue","Blue","Blue","Blue","Hazel" # runs off the page of his book
Hair = c("Black","Brunette","Red","Blond","Black","Brunette","Red","Blond","Black" # runs off the page of his book
It looks like the table has been converted into three vectors. What's the most efficient way to do this? I'd like to replace his data with my own, so it would be great to learn how to transform the data into the format needed for this analysis.
For this, I'd use melt() in the reshape2 package:
library(reshape2)
df <- melt(con.table2, varnames=c("Hair", "Eye"), value.name="Freq")
# df is a data frame, a list from which you can easily extract the
# component vectors "Hair", "Eye", and "Freq".
# Try, for example:
str(df)
df$Hair
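If reshape2 is not installed, the same long layout can be sketched in base R. This is only an illustration of what the reshaping amounts to, not how melt() is implemented; the names m and long are made up for the example.

```r
# The contingency table from the question, entered row by row
m <- matrix(c(20, 68, 5, 15,
              94, 7, 16, 10,
              84, 119, 29, 54,
              17, 26, 14, 14),
            nrow = 4, byrow = TRUE,
            dimnames = list(c("Black", "Blond", "Brunette", "Red"),
                            c("Blue", "Brown", "Green", "Hazel")))

# row(m) and col(m) give the row and column index of every cell, in the
# same column-major order that as.vector(m) uses for the counts
long <- data.frame(Hair = rownames(m)[row(m)],
                   Eye  = colnames(m)[col(m)],
                   Freq = as.vector(m),
                   stringsAsFactors = FALSE)

# The three vectors the analysis script expects
Freq <- long$Freq
Eye  <- long$Eye
Hair <- long$Hair
```

The cells come out in a different order from the listing in the book, but each count stays paired with its own hair and eye label.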
There is a method in base R for converting objects of class "table" to data.frames. The reason that it does not succeed with your matrix is that you didn't tell R that it was a table. Once you do so the method succeeds:
class(con.table2) <- "table"
as.data.frame(con.table2)
#-----------------------
       Var1  Var2 Freq
1     Black  Blue   20
2     Blond  Blue   94
3  Brunette  Blue   84
4       Red  Blue   17
5     Black Brown   68
6     Blond Brown    7
7  Brunette Brown  119
8       Red Brown   26
9     Black Green    5
10    Blond Green   16
11 Brunette Green   29
12      Red Green   14
13    Black Hazel   15
14    Blond Hazel   10
15 Brunette Hazel   54
16      Red Hazel   14
The "table" class in R is expected to be a contingency table (just as you have constructed), i.e., one with counts in its cells. Fractional values would also have converted without problems here, but some methods that expect integer counts might choke on non-integer values.
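A small variation, offered as a sketch rather than the only way to do it: if the dimnames are given names when the matrix is built, the resulting columns are called Hair and Eye instead of Var1 and Var2, and as.table() can be used instead of assigning the class by hand. The object names long and back are made up for the example.

```r
# Same counts as in the question, but with named dimnames
con.table2 <- matrix(c(20, 68, 5, 15,
                       94, 7, 16, 10,
                       84, 119, 29, 54,
                       17, 26, 14, 14),
                     nrow = 4, byrow = TRUE,
                     dimnames = list(Hair = c("Black", "Blond", "Brunette", "Red"),
                                     Eye  = c("Blue", "Brown", "Green", "Hazel")))

# as.table() marks the matrix as a contingency table; as.data.frame()
# then returns one row per cell, with columns named after the dimnames
long <- as.data.frame(as.table(con.table2), stringsAsFactors = FALSE)

# The three vectors in the layout the analysis script uses
Freq <- long$Freq
Eye  <- long$Eye
Hair <- long$Hair

# Going the other way: rebuild the two-way table from the long form
back <- xtabs(Freq ~ Hair + Eye, data = long)
print(back)
```

Printing back should show the original table again, which is a quick way to check that no counts were mislabelled when you substitute your own data.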