
How to get the table counts for unique values in column

Tags:

r

I have this matrix mymat. I know I can do table(mymat[,"col1"]) to get the number of times each item appears in col1. However, I only want to count each distinct col1/col2 combination once, i.e. for each item in col1 I want the number of unique values in col2. For mymat below, I want this result:

app  gg  chh
 1    2   1

mymat

col1   col2
app    d
app    d
gg     e
gg     f 
gg     e
chh    f
chh    f
chh    f
MAPK asked Mar 15 '16 15:03



3 Answers

You can use unique to subset the data (works for matrix and data.frame) and then call table:

table(unique(mymat)[,1])

This returns

# app chh  gg 
#   1   1   2 
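For a self-contained check, here is a minimal sketch that rebuilds the example data as a character matrix and applies the same idea. The cbind() construction and the names uniq_rows and counts are assumptions for illustration; the question only shows the printed matrix.

```r
# Assumed construction of the example matrix (the question only shows its contents)
mymat <- cbind(
  col1 = c("app", "app", "gg", "gg", "gg", "chh", "chh", "chh"),
  col2 = c("d", "d", "e", "f", "e", "f", "f", "f")
)

# unique() on a matrix drops duplicated rows, so every col1/col2 pair is kept once
uniq_rows <- unique(mymat)

# Tabulating col1 of the de-duplicated rows gives the number of distinct col2 values per item
counts <- table(uniq_rows[, "col1"])
print(counts)
```

Note that table() sorts the names alphabetically, so the order differs from the one shown in the question.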
symbolrush answered Nov 09 '22 06:11


With the data in a data.frame df, you can use duplicated to drop the repeated col1/col2 pairs and then call table:

table(subset(df, !duplicated(paste(col1, col2)), select = col1))
#app chh  gg 
#  1   1   2 
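A possible variant, sketched here under the assumption that df holds the example data as a data.frame: duplicated() also works directly on a data.frame and compares whole rows, which avoids building a pasted key (two different pairs could in principle paste to the same string if the values contain spaces). The names keep and pair_counts are hypothetical.

```r
# Assumed example data, matching the question
df <- data.frame(
  col1 = c("app", "app", "gg", "gg", "gg", "chh", "chh", "chh"),
  col2 = c("d", "d", "e", "f", "e", "f", "f", "f")
)

# TRUE for the first occurrence of each col1/col2 pair, FALSE for repeats
keep <- !duplicated(df[, c("col1", "col2")])

# Count the remaining rows per col1 value
pair_counts <- table(df$col1[keep])
print(pair_counts)
```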

As a second option, here's a dplyr approach:

library(dplyr)
distinct(df) %>% count(col1)   # or distinct(df, col1, col2) if you have other columns
#Source: local data frame [3 x 2]
#
#    col1     n
#  (fctr) (int)
#1    app     1
#2    chh     1
#3     gg     2
talat answered Nov 09 '22 07:11


This counts the nonzero cells in each row of the table() result, i.e. for each col1 value the number of col2 values it occurs with:

rowSums(table(df$col1, df$col2)!=0)

result:

app chh  gg 
  1   1   2 

data used:

df <- read.table(header=TRUE, text=
"col1   col2
app    d
app    d
gg     e
gg     f 
gg     e
chh    f
chh    f
chh    f")
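To make the intermediate steps visible, the one-liner can be unpacked as in the sketch below. It rebuilds the same data with data.frame() so that it runs on its own; the names tab, present and n_unique are hypothetical.

```r
# Same example data, built directly so the block is self-contained
df <- data.frame(
  col1 = c("app", "app", "gg", "gg", "gg", "chh", "chh", "chh"),
  col2 = c("d", "d", "e", "f", "e", "f", "f", "f")
)

# Cross-tabulation: rows are col1 values, columns are col2 values,
# each cell holds how often that pair occurs
tab <- table(df$col1, df$col2)

# Logical matrix: TRUE where a pair occurs at least once
present <- tab != 0

# Summing TRUEs per row counts the distinct col2 values for each col1 value
n_unique <- rowSums(present)
print(n_unique)
```

One thing to keep in mind with this approach: the cross-tabulation has one cell per col1/col2 combination, so it can grow large when both columns have many distinct values.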
jogo answered Nov 09 '22 06:11