Is there a general inverse of the table() function?

Tags:

r

I know that, with a little programming, a frequency table of fixed dimension, such as one returned by table(), can be converted back into observation data. The aim is to convert a frequency table such as this one...

(flower.freqs <- with(iris,table(Petal=cut(Petal.Width,2),Species)))
          Species
Petal          setosa versicolor virginica
  (0.0976,1.3]     50         28         0
  (1.3,2.5]         0         22        50

...back into a data.frame() whose number of rows equals the sum of the counts in the input table, and whose cell values are taken from the labels of the input's dimensions:

          Petal Species
1 (0.0976,1.3]  setosa
2 (0.0976,1.3]  setosa
3 (0.0976,1.3]  setosa
# ... (150 rows) ...

With some tinkering I built a rough prototype that should also handle higher-dimensional inputs:

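tableinv <- untable <- function(x) {
    stopifnot(is.table(x))
    # One row per table cell, repeated by its count; the last column (Freq) is dropped.
    # drop = FALSE keeps a data frame even for one-dimensional tables.
    obs <- as.data.frame(x)[rep(seq_along(x), c(x)), -(length(dim(x)) + 1), drop = FALSE]
    rownames(obs) <- NULL
    obs
}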

> head(tableinv(flower.freqs)); dim(tableinv(flower.freqs))
          Petal Species
1 (0.0976,1.3]  setosa
2 (0.0976,1.3]  setosa
3 (0.0976,1.3]  setosa
4 (0.0976,1.3]  setosa
5 (0.0976,1.3]  setosa
6 (0.0976,1.3]  setosa
[1] 150   2
> head(tableinv(Titanic)); nrow(tableinv(Titanic))==sum(Titanic)
  Class  Sex   Age Survived
1   3rd Male Child       No
2   3rd Male Child       No
3   3rd Male Child       No
4   3rd Male Child       No
5   3rd Male Child       No
6   3rd Male Child       No
[1] TRUE

I am obviously proud that this bricolage reconstructs multi-attribute data.frame()s from higher-dimensional frequency tables such as Titanic. But is there an established (built-in, battle-tested) general inverse of table()? Ideally it would not depend on a specific library, would handle unlabeled dimensions, would be optimized so that it does not choke on bulky inputs, and would deal sensibly with tables that correspond to factor as well as non-factor observation data.

texb asked May 27 '15 11:05



1 Answer

I believe your solution is pretty good. My own approach is quite similar, except that it takes a data frame rather than a table as input:

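tableinv <- function(x) {
    # x: data frame as returned by as.data.frame(<table>), with the counts in the last column, Freq
    # Repeat each row by its count and keep every column except Freq
    y <- x[rep(rownames(x), x$Freq), 1:(ncol(x) - 1)]
    rownames(y) <- seq_len(nrow(y))
    y
}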
survivors <- as.data.frame(Titanic)
surv.invtab <- tableinv(survivors)

which yields

> head(surv.invtab)
  Class  Sex   Age Survived
1   3rd Male Child       No
2   3rd Male Child       No
3   3rd Male Child       No
4   3rd Male Child       No
5   3rd Male Child       No
6   3rd Male Child       No
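
A quick way to check the reconstruction is a round trip: tabulating the expanded rows with table() should give back the original counts. The sketch below uses only base R and repeats the tableinv() definition so that it runs on its own.

```r
# tableinv() from this answer, repeated so the snippet is self-contained
tableinv <- function(x) {
  y <- x[rep(rownames(x), x$Freq), 1:(ncol(x) - 1)]
  rownames(y) <- seq_len(nrow(y))
  y
}

surv.invtab <- tableinv(as.data.frame(Titanic))

# There should be one row per counted person
nrow(surv.invtab) == sum(Titanic)

# Re-tabulating the observations should reproduce every cell count
all(table(surv.invtab) == Titanic)
```

Both comparisons should evaluate to TRUE, because subsetting the rows of a data frame keeps all factor levels, including those of cells with a count of zero.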

For the flower example, the table must first be converted into a data frame before tableinv() as defined above can be applied:

flower.freqs <- with(iris,table(Petal=cut(Petal.Width,2),Species))
flower.freqs <- as.data.frame(flower.freqs)
flower.invtab <- tableinv(flower.freqs)

The result in this case is

> head(flower.invtab)
         Petal Species
1 (0.0976,1.3]  setosa
2 (0.0976,1.3]  setosa
3 (0.0976,1.3]  setosa
4 (0.0976,1.3]  setosa
5 (0.0976,1.3]  setosa
6 (0.0976,1.3]  setosa
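
Some of the other points in the question (unlabeled dimensions, non-factor output, table input without a manual conversion) can be covered with base R alone. The following is only a sketch under assumptions: the name untable_sketch and its stringsAsFactors argument are made up for illustration, and the counts are assumed to be non-negative integers in a column called Freq.

```r
# Hypothetical helper; the name and arguments are illustrative, not part of base R
untable_sketch <- function(x, stringsAsFactors = TRUE) {
  # Accept a table directly; as.data.frame() labels unnamed dimensions Var1, Var2, ...
  if (is.table(x)) {
    x <- as.data.frame(x, responseName = "Freq",
                       stringsAsFactors = stringsAsFactors)
  }
  stopifnot(is.data.frame(x), "Freq" %in% names(x))
  # Repeat each row index by its count (integer indices instead of row names)
  idx <- rep(seq_len(nrow(x)), x$Freq)
  # drop = FALSE keeps a data frame even when only one variable remains
  obs <- x[idx, names(x) != "Freq", drop = FALSE]
  rownames(obs) <- NULL
  obs
}

# One-dimensional table with an unlabeled dimension
one.dim <- untable_sketch(table(c("a", "b", "b")))
str(one.dim)

# Character columns instead of factors
titanic.chr <- untable_sketch(Titanic, stringsAsFactors = FALSE)
str(titanic.chr)
```

Indexing by integer position rather than by row name is a design choice made here to avoid character lookups on large inputs; whether it is fast enough for a given table would still need to be measured.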

Hope this helps.

RHertel answered Sep 22 '22 07:09