How to aggregate categorical data in R?

Tags: r, aggregate
I have a data frame with two columns of categorical variables (Better, Similar, Worse). I would like to build a table that counts how many times each category appears in each column. The data frame is as follows:

       Category.x  Category.y
1      Better      Better
2      Better      Better
3      Similar     Similar
4      Worse       Similar

I would like to come up with a table like this:

           Category.x    Category.y
Better     2             2
Similar    1             2
Worse      1             0

How would you go about it?

asked Apr 02 '19 16:04

Daniel

2 Answers

As mentioned in the comments, table() is the standard tool for this. For example:

table(stack(DT))

         ind
values    Category.x Category.y
  Better           2          2
  Similar          1          2
  Worse            1          0

or

table(value = unlist(DT), cat = names(DT)[col(DT)])

         cat
value     Category.x Category.y
  Better           2          2
  Similar          1          2
  Worse            1          0

or

with(reshape(DT, direction = "long", varying = 1:2),
  table(value = Category, cat = time)
)

         cat
value     x y
  Better  2 2
  Similar 1 2
  Worse   1 0
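
For anyone who wants to run the snippets above, here is a minimal reproducible sketch. It assumes DT is a plain data frame with character columns, rebuilt here from the data in the question; the name counts is only illustrative.

```r
# Rebuild the question's data (assumption: character columns, not factors)
DT <- data.frame(
  Category.x = c("Better", "Better", "Similar", "Worse"),
  Category.y = c("Better", "Better", "Similar", "Similar"),
  stringsAsFactors = FALSE
)

# stack() puts every value in one column ("values") and the name of the
# source column in another ("ind"); table() then cross-tabulates the two
counts <- table(stack(DT))
counts

# Optional: convert the 2-d table to a plain data frame,
# one column per original column
as.data.frame.matrix(counts)
```

Because the result is an ordinary two-way table, a single cell can be read by name, e.g. counts["Worse", "Category.y"].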

answered Nov 08 '22 08:11

Frank

# For each column, count how often each distinct value (taken across all columns) occurs
sapply(df1, function(x) sapply(unique(unlist(df1)), function(y) sum(y == x)))
#        Category.x Category.y
#Better           2          2
#Similar          1          2
#Worse            1          0
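
A possible variant of the same idea, sketched under the assumption that df1 holds the question's data as character columns: tabulating each column against one shared set of levels guarantees a 0 for any category a column lacks. The names lv and res are illustrative only.

```r
# Rebuild the question's data (assumption: character columns, not factors)
df1 <- data.frame(
  Category.x = c("Better", "Better", "Similar", "Worse"),
  Category.y = c("Better", "Better", "Similar", "Similar"),
  stringsAsFactors = FALSE
)

# Shared set of categories across all columns
lv <- sort(unique(unlist(df1)))

# Tabulate each column against the same levels, so a category that is
# missing from a column is counted as 0 instead of being dropped
res <- sapply(df1, function(x) table(factor(x, levels = lv)))
res
```

The result is an integer matrix with one row per category and one column per original column.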

answered Nov 08 '22 08:11

d.b
