
How to create missing values in table in R?

I have 40 pairs of birds with each male and female in the pair scored for their colour. The colour score is a categorical variable with a value range of 1 to 9. I would like to create a table with the number of each combination (1/1, 1/2, 1/3, ... 9/7, 9/8, 9/9). My problem is that there are some combinations that do not exist in my data when I try to create the table (in these cases I would like zeros for the missing values). Below is the data and sample code. I am pretty sure the answer lies in using the 'expand.grid()' command, e.g. see this post, but I am unsure how to implement it. Any suggestions?

## Dataset pairs of males and females and their colour classes
Pair_Colours <- structure(list(Male = c(7, 6, 4, 6, 8, 8, 5, 6, 6, 8, 6, 6, 5, 
7, 9, 5, 8, 7, 5, 5, 4, 6, 7, 7, 3, 6, 5, 4, 7, 4, 3, 9, 4, 4, 
4, 4, 9, 6, 6, 6), Female = c(9, 8, 8, 9, 3, 6, 8, 5, 8, 9, 7, 
3, 6, 5, 8, 9, 7, 3, 6, 4, 4, 4, 8, 8, 6, 7, 4, 2, 8, 9, 5, 6, 
8, 8, 4, 4, 5, 9, 7, 8)), .Names = c("Male", "Female"), class = "data.frame", row.names = c(NA, 
40L))

Pair_Colours$Male <- as.factor(Pair_Colours$Male)
Pair_Colours$Female <- as.factor(Pair_Colours$Female)

## table of pair colour values (colours 1 to 9 - categorical variable)
table(Pair_Colours$Male, Pair_Colours$Female)

## my attempt to create a table with a count of each possible value for pairs
Colour_Male <- rep(seq(1, 9, by = 1), each = 9)
Colour_Female <- rep(seq(1, 9, by = 1), times = 9)
Colour_Count <-  as.vector(table(Pair_Colours$Male, Pair_Colours$Female)) # <- the problem occurs here
Pairs_Colour_Table <- as.data.frame(cbind(cbind(Colour_Male, Colour_Female), Colour_Count))

## plot results to visually look for possible assortative mating by colour
op<-par(mfrow=c(1,1), oma=c(2,4,0,0), mar=c(4,5,1,2), pty = "s")
plot(1,1, xlim = c(1, 9), ylim = c(1, 9), type="n", xaxt = "n", yaxt = "n", las=1, bty="n", cex.lab = 1.75, cex.axis = 1.5, main = NULL, xlab = "Male Colour", ylab = "Female Colour", pty = "s")
axis(1, at = seq(1, 9, by = 1), labels = T, cex.lab = 1.5, cex.axis = 1.5, tick = TRUE, tck = -0.015, lwd = 1.25, lwd.ticks = 1.25)
axis(2, at = seq(1, 9, by = 1), labels = T, cex.lab = 1.5, cex.axis = 1.5, tick = TRUE, tck = -0.015, lwd = 1.25, lwd.ticks = 1.25, las =2)
points(Pair_Colours$Male, Pair_Colours$Female, pch = 21, cex = Pairs_Colour_Table$Colour_Count, bg = "darkgray", col = "black", lwd = 1)
Keith W. Larson asked Oct 05 '22 05:10


1 Answer

You just have to convert each column of Pair_Colours to a factor with all the required levels (1 to 9) before calling table. Levels that never occur in the data are then kept and counted as zero:

# Convert each column to factor with levels 1 to 9
Pair_Colours[] <- lapply(Pair_Colours, factor, levels=1:9)
table(Pair_Colours$Male, Pair_Colours$Female)
#     1 2 3 4 5 6 7 8 9
#   1 0 0 0 0 0 0 0 0 0
#   2 0 0 0 0 0 0 0 0 0
#   3 0 0 0 0 1 1 0 0 0
#   4 0 1 0 3 0 0 0 3 1
#   5 0 0 0 2 0 2 0 1 1
#   6 0 0 1 1 1 0 3 3 2
#   7 0 0 1 0 1 0 0 3 1
#   8 0 0 1 0 0 1 1 0 1
#   9 0 0 0 0 1 1 0 1 0
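
To see why this works, here is a minimal sketch on made-up scores (the `scores` vector and the 1 to 5 scale are assumptions for illustration, not part of the bird data). table() only counts the levels a factor carries, so declaring the full set of levels up front is what produces the zero entries:

```r
# Toy scores: only 3 and 4 occur, but the scale is assumed to run from 1 to 5
scores <- c(3, 4, 4)

# Without explicit levels, the factor only knows the observed values
observed_only <- table(factor(scores))

# With explicit levels, unobserved values are kept and counted as zero
all_levels <- table(factor(scores, levels = 1:5))

print(observed_only)
print(all_levels)
```

The same applies when the column is already a factor, as in the question: factor(x, levels = 1:9) matches the existing labels against the new levels and adds the missing ones.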

You can convert the result with as.data.frame() if you want the long format "combn1, combn2, frequency", with one row per combination.
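
A minimal sketch of that conversion, using a small made-up data frame (`toy`, on an assumed 1 to 3 scale) rather than the bird data. Naming the arguments passed to table() carries those names through to the columns of the resulting data frame; the count column is called Freq:

```r
# Made-up example (not the bird data): colour scores on a 1-3 scale
toy <- data.frame(Male = c(1, 3, 3), Female = c(2, 2, 3))

# Give every column the full set of levels so no combination is dropped
toy[] <- lapply(toy, factor, levels = 1:3)

# Named table() arguments become the column names of the long data frame
counts <- as.data.frame(table(Male = toy$Male, Female = toy$Female))
print(counts)
```

This gives one row per Male/Female combination, including those with a count of zero. The Male and Female columns come back as factors, so if numeric values are needed (for plotting, say), something like as.numeric(as.character(counts$Male)) should recover them.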

Arun answered Oct 13 '22 11:10