I sometimes use the ftable
function purely for its presentation of hierarchical categories. However, sometimes, when the table is large, I would like to further subset the table before using it.
Let's say we're starting with:
mytable <- ftable(Titanic, row.vars = 1:3)
mytable
## Survived No Yes
## Class Sex Age
## 1st Male Child 0 5
## Adult 118 57
## Female Child 0 1
## Adult 4 140
## 2nd Male Child 0 11
## Adult 154 14
## Female Child 0 13
## Adult 13 80
## 3rd Male Child 35 13
## Adult 387 75
## Female Child 17 14
## Adult 89 76
## Crew Male Child 0 0
## Adult 670 192
## Female Child 0 0
## Adult 3 20
str(mytable)
## ftable [1:16, 1:2] 0 118 0 4 0 154 0 13 35 387 ...
## - attr(*, "row.vars")=List of 3
## ..$ Class: chr [1:4] "1st" "2nd" "3rd" "Crew"
## ..$ Sex : chr [1:2] "Male" "Female"
## ..$ Age : chr [1:2] "Child" "Adult"
## - attr(*, "col.vars")=List of 1
## ..$ Survived: chr [1:2] "No" "Yes"
## NULL
Because there are no dimnames
, I can't extract data in the same way that I would with an object that has dimnames
. For instance, there's no way for me to directly extract all the "Child" values from the "1st" and "3rd" classes.
My current approach is to convert it to a table
, do the extraction, and then convert it back to an ftable
.
Example:
mytable[c("1st", "3rd"), , "Child", ]
## Error: incorrect number of dimensions
## Only the underlying data are seen as having dims
dim(mytable)
## [1] 16 2
## I'm OK with the "Age" column being dropped in this case....
ftable(as.table(mytable)[c("1st", "3rd"), , "Child", ])
## Survived No Yes
## Class Sex
## 1st Male 0 5
## Female 0 1
## 3rd Male 35 13
## Female 17 14
However, I don't like this approach because the overall layout sometimes changes if you're not careful. Compare it with the following, which removes the requirement of subsetting only children and adds the requirement of subsetting only those who did not survive:
ftable(as.table(mytable)[c("1st", "3rd"), , , "No"])
## Age Child Adult
## Class Sex
## 1st Male 0 118
## Female 0 4
## 3rd Male 35 387
## Female 17 89
I don't like that the overall layout of rows and columns has changed. That's a classic case of having to remember to use drop = FALSE
to maintain dimensions when a single column is extracted:
ftable(as.table(mytable)[c("1st", "3rd"), , , "No", drop = FALSE])
## Survived No
## Class Sex Age
## 1st Male Child 0
## Adult 118
## Female Child 0
## Adult 4
## 3rd Male Child 35
## Adult 387
## Female Child 17
## Adult 89
I know there are many ways to get the data that I want, starting with subsetting from the raw data and then making my ftable
, but for this question, let's assume that's not possible.
The end goal is to have an approach that lets me extract from an ftable
preserving the display format of the nested "row" hierarchy.
Are there other solutions to this? Can we make use of the row.vars
and col.vars
attributes to extract data from an ftable
and retain its formatting?
My current approach also doesn't work for hierarchical columns, so I'm hoping that the proposed solution can also handle those cases.
Example:
tab2 <- ftable(Titanic, row.vars = 1:2, col.vars = 3:4)
tab2
## Age Child Adult
## Survived No Yes No Yes
## Class Sex
## 1st Male 0 5 118 57
## Female 0 1 4 140
## 2nd Male 0 11 154 14
## Female 0 13 13 80
## 3rd Male 35 13 387 75
## Female 17 14 89 76
## Crew Male 0 0 670 192
## Female 0 0 3 20
Note the nesting of "Age" and "Survived".
Try my current approach:
ftable(as.table(tab2)[c("1st", "3rd"), , , , drop = FALSE])
## Survived No Yes
## Class Sex Age
## 1st Male Child 0 5
## Adult 118 57
## Female Child 0 1
## Adult 4 140
## 3rd Male Child 35 13
## Adult 387 75
## Female Child 17 14
## Adult 89 76
I can get back to what I want with:
ftable(as.table(tab2)[c("1st", "3rd"), , , , drop = FALSE], row.vars = 1:2, col.vars = 3:4)
But I'm hoping for something more direct.
The attribute table displays information on features of a selected layer. Each row in the table represents a feature (with or without geometry), and each column contains a particular piece of information about the feature. Features in the table can be searched, selected, moved or even edited.
The most efficient method for extracting data is a process called ETL. Short for “extract, transform, load,” ETL tools pull data from the various platforms you use and prepare it for analysis. The only alternative to ETL is manual data entry — which can take literal months, even with an enterprise amount of manpower.
Hi @Simran, if you want to extract data from a table on a webpage and display it, you can use Extract Structured Data activity in UiPath studio.
Here's what I was able to sort of hack together, with some help from Axeman:
replace_empty_arguments <- function(a) {
empty_symbols <- vapply(a, function(x) {
is.symbol(x) && identical("", as.character(x)), 0)
}
a[!!empty_symbols] <- 0
lapply(a, eval)
}
`[.ftable` <- function (inftable, ...) {
if (!class(inftable) %in% "ftable") stop("input is not an ftable")
tblatr <- attributes(inftable)[c("row.vars", "col.vars")]
valslist <- replace_empty_arguments(as.list(match.call()[-(1:2)]))
x <- sapply(valslist, function(x) identical(x, 0))
TAB <- as.table(inftable)
valslist[x] <- dimnames(TAB)[x]
temp <- as.matrix(expand.grid(valslist))
out <- ftable(
`dimnames<-`(`dim<-`(TAB[temp], lengths(valslist)), valslist),
row.vars = seq_along(tblatr[["row.vars"]]),
col.vars = seq_along(tblatr[["col.vars"]]) + length(tblatr[["row.vars"]]))
names(attributes(out)[["row.vars"]]) <- names(tblatr[["row.vars"]])
names(attributes(out)[["col.vars"]]) <- names(tblatr[["col.vars"]])
out
}
Try it out with the examples from the question:
mytable[c("1st", "3rd"), , "Child", ]
## Survived No Yes
## Class Sex Age
## 1st Male Child 0 5
## Female Child 0 1
## 3rd Male Child 35 13
## Female Child 17 14
mytable[c("1st", "3rd"), , , "No"]
## Survived No
## Class Sex Age
## 1st Male Child 0
## Adult 118
## Female Child 0
## Adult 4
## 3rd Male Child 35
## Adult 387
## Female Child 17
## Adult 89
tab2[c("1st", "3rd"), , , ]
## Age Child Adult
## Survived No Yes No Yes
## Class Sex
## 1st Male 0 5 118 57
## Female 0 1 4 140
## 3rd Male 35 13 387 75
## Female 17 14 89 76
Once the data is aggregated to frequencies by combination of factors as is the case with the Titanic
data set, it is arguably easier to subset the raw data and tabulate it for display rather than manipulating the output object.
I recognize that the OP asks for solutions using ftable
, but with the back and forth in the comments section soliciting other ideas, I thought I'd post a different take on this question because it illustrates a way to simultaneously subset the data and generate the hierarchical structure of the contingency tables without custom functions.
Here is an approach using the tables
package that preserves the hierarchical structure of the Titanic
data, as well as eliminating cells that are empty when we subset the data frame.
First we cast the incoming table as a data frame so we can subset it during the tabular()
function.
library(titanic)
df <- as.data.frame(Titanic)
Then we use tables::tabular()
while subsetting the data in the data=
argument with the extract operator [
, and use DropEmpty()
to avoid printing rows and columns where Freq == 0
. We also use Heading()
to suppress unwanted headings for Freq
and sum
.
tabular((Class * Sex) ~ (Age)*Survived*Heading()*Freq*Heading()*sum*DropEmpty(empty=0),
data=df[df$Class %in% c("1st","3rd") & df$Age=="Child",])
...and the output:
> tabular((Class * Sex) ~ (Age)*Survived*Heading()*Freq*Heading()*sum*DropEmpty(empty=0),
+ data=df[df$Class %in% c("1st","3rd") & df$Age=="Child",])
Age
Child
Survived
Class Sex No Yes
1st Male 0 5
Female 0 1
3rd Male 35 13
Female 17 14
If we remove DropEmpty()
, we replicate the entire tabular structure based on the factor variables in the table.
> # remove DropEmpty() to replicate entire factor structure
> tabular((Class * Sex) ~ (Age)*Survived*Heading()*Freq*Heading()*sum,
+ data=df[df$Class %in% c("1st","3rd") & df$Age=="Child",])
Age
Child Adult
Survived Survived
Class Sex No Yes No Yes
1st Male 0 5 0 0
Female 0 1 0 0
2nd Male 0 0 0 0
Female 0 0 0 0
3rd Male 35 13 0 0
Female 17 14 0 0
Crew Male 0 0 0 0
Female 0 0 0 0
>
Replicating the second and third examples from the OP is also straightforward.
> # second example from question
> tabular((Class * Sex * Age) ~ Survived*Heading()*Freq*Heading()*sum*DropEmpty(empty=0),
+ data=df[df$Class %in% c("1st","3rd") & df$Survived=="No",])
Survived
Class Sex Age No
1st Male Child 0
Adult 118
Female Child 0
Adult 4
3rd Male Child 35
Adult 387
Female Child 17
Adult 89
> # third example from question
> tabular((Class * Sex) ~ (Age)*Survived*Heading()*Freq*Heading()*sum*DropEmpty(empty=0),
+ data=df[df$Class %in% c("1st","3rd"),])
Age
Child Adult
Survived Survived
Class Sex No Yes No Yes
1st Male 0 5 118 57
Female 0 1 4 140
3rd Male 35 13 387 75
Female 17 14 89 76
>
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With