I am aware that a little programming allows converting fixed-dimension frequency tables, as returned e.g. by table(), back into observation data. So the aim is to convert a frequency table such as this one...
> (flower.freqs <- with(iris,table(Petal=cut(Petal.Width,2),Species)))
              Species
Petal          setosa versicolor virginica
  (0.0976,1.3]     50         28         0
  (1.3,2.5]         0         22        50
...back into a data.frame() whose number of rows equals the sum of the counts in the input table, and whose cell values are taken from the labels of the input dimensions:
Petal Species
1 (0.0976,1.3] setosa
2 (0.0976,1.3] setosa
3 (0.0976,1.3] setosa
# ... (150 rows) ...
With some tinkering I built a rough prototype that should also digest higher-dimensional inputs:
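tableinv <- untable <- function(x) {
  stopifnot(is.table(x))
  # as.data.frame() gives one row per cell plus a trailing Freq column;
  # repeat each row by its count, then drop Freq (drop = FALSE keeps 1-D input a data frame)
  obs <- as.data.frame(x)[rep(seq_along(x), c(x)), -(length(dim(x)) + 1), drop = FALSE]
  rownames(obs) <- NULL
  obs
}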
> head(tableinv(flower.freqs)); dim(tableinv(flower.freqs))
Petal Species
1 (0.0976,1.3] setosa
2 (0.0976,1.3] setosa
3 (0.0976,1.3] setosa
4 (0.0976,1.3] setosa
5 (0.0976,1.3] setosa
6 (0.0976,1.3] setosa
[1] 150 2
> head(tableinv(Titanic)); nrow(tableinv(Titanic))==sum(Titanic)
Class Sex Age Survived
1 3rd Male Child No
2 3rd Male Child No
3 3rd Male Child No
4 3rd Male Child No
5 3rd Male Child No
6 3rd Male Child No
[1] TRUE
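To make the mechanics of the prototype explicit, its one-liner can be unrolled into three steps. This is only an illustrative sketch on the flower example; the intermediate names long, idx and obs are introduced here for illustration and are not part of the prototype.

```r
# Rebuild the example table from above (iris ships with base R).
flower.freqs <- with(iris, table(Petal = cut(Petal.Width, 2), Species))

# Step 1: long format, one row per table cell, counts in the trailing Freq column.
long <- as.data.frame(flower.freqs)

# Step 2: repeat each row index as often as its count; zero-count cells drop out.
idx <- rep(seq_len(nrow(long)), long$Freq)

# Step 3: pick the repeated rows and discard the Freq column.
obs <- long[idx, names(long) != "Freq", drop = FALSE]
rownames(obs) <- NULL

# The number of reconstructed observations matches the table total.
stopifnot(nrow(obs) == sum(flower.freqs))
```

Selecting the columns by name rather than by position is a design choice of this sketch; it assumes the count column carries the default name Freq.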
I am obviously proud that this bricolage reconstructs multi-attribute data.frame()s from higher-dimensional frequency tables such as Titanic. But is there an established (built-in, battle-tested) general inverse to table(), ideally one that does not depend on a specific library, that knows how to handle unlabeled dimensions, that is optimized so that it will not choke on bulky inputs, and that deals reasonably with table inputs that correspond to factor as well as non-factor observation inputs?
I believe that your solution is pretty good. In any case, the way I would address this question is quite similar:
tableinv <- function(x) {
  # x: data frame of cell counts whose last column is Freq
  y <- x[rep(rownames(x), x$Freq), 1:(ncol(x) - 1)]
  rownames(y) <- 1:nrow(y)
  return(y)
}
survivors <- as.data.frame(Titanic)
surv.invtab <- tableinv(survivors)
which yields
> head(surv.invtab)
Class Sex Age Survived
1 3rd Male Child No
2 3rd Male Child No
3 3rd Male Child No
4 3rd Male Child No
5 3rd Male Child No
6 3rd Male Child No
Concerning the example with the flowers, using the function tableinv() as defined above, it would first be necessary to convert the data into a data frame:
flower.freqs <- with(iris,table(Petal=cut(Petal.Width,2),Species))
flower.freqs <- as.data.frame(flower.freqs)
flower.invtab <- tableinv(flower.freqs)
The result in this case is
> head(flower.invtab)
Petal Species
1 (0.0976,1.3] setosa
2 (0.0976,1.3] setosa
3 (0.0976,1.3] setosa
4 (0.0976,1.3] setosa
5 (0.0976,1.3] setosa
6 (0.0976,1.3] setosa
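One way to gain confidence in such an inverse is a round-trip check: re-tabulating the expanded observations should reproduce the original counts. The sketch below is self-contained and uses a helper named expand_counts(), which is a hypothetical name chosen here and follows the same idea as tableinv(). It also shows the stringsAsFactors argument of as.data.frame() for tables, which returns the labels as character columns instead of factors.

```r
# Hypothetical helper (name chosen for this sketch): expand a data frame of
# cell counts into one row per observation, using the count column `freq`.
expand_counts <- function(df, freq = "Freq") {
  out <- df[rep(seq_len(nrow(df)), df[[freq]]), names(df) != freq, drop = FALSE]
  rownames(out) <- NULL
  out
}

obs <- expand_counts(as.data.frame(Titanic))

# Round trip: tabulating the observations gives back the original cell counts.
roundtrip <- table(obs)
stopifnot(nrow(obs) == sum(Titanic), all(roundtrip == Titanic))

# With stringsAsFactors = FALSE the labels come back as character columns.
# Caveat: character columns no longer remember levels that had zero counts.
obs_chr <- expand_counts(as.data.frame(Titanic, stringsAsFactors = FALSE))
stopifnot(is.character(obs_chr$Class))
```

The round trip works for factor columns because row subsetting keeps all factor levels, so empty cells reappear as zeros when the data are tabulated again.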
Hope this helps.