
Tabulate top n most repeated values including others

Tags: r

I have created a table that lists 1000 songs organized by genre, theme, etc. I want to know how many times the most frequent years occur and how many songs fall into an "Others" category. I have tried:

sort(summary(as.factor(canciones$YEAR)), decreasing = TRUE)[1:3]

And the output is:

1968 1966 1979 
  39   37   34 

But I want it to be:

1968 1966 1979 Others
  39   37   34    950
Alvaro Blanco asked Dec 15 '15 18:12


1 Answer

Here are some sample data.

set.seed(1)
x <- sample(10, 500, TRUE)

We can tabulate every value, sort the counts in decreasing order, take the first three, then sum the remaining counts as "Others" and tack that on the end. Additionally, I think you can just use table() instead of summary(factor()), since summary.factor() does this under the hood anyway.

xx <- sort(table(x), decreasing = TRUE)
c(xx[1:3], Others = sum(xx[-(1:3)]))
#     5      2      4 Others 
#    64     61     57    318 
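
The same idea can be wrapped in a small helper so the number of top values becomes a parameter. This is only a minimal sketch: the name top_n_with_others and its arguments are hypothetical (not part of base R), and it assumes missing values should be ignored, which is what table() does by default.

```r
# Hypothetical helper: counts of the n most frequent values plus an "Others" total.
top_n_with_others <- function(v, n = 3) {
  counts <- sort(table(v), decreasing = TRUE)  # table() drops NA by default
  n <- min(n, length(counts))                  # guard against n > number of distinct values
  top <- counts[seq_len(n)]
  rest <- sum(counts) - sum(top)               # everything not in the top n
  c(top, Others = rest)
}

set.seed(1)
x <- sample(10, 500, TRUE)
res <- top_n_with_others(x, 3)
res
```

For the data in the question, the call would presumably be top_n_with_others(canciones$YEAR, 3).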

Note: It may or may not be faster to use Others = length(x) - sum(xx[1:3]).
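
One caveat with the length(x) variant: the two ways of computing Others agree only when x has no missing values, because table() drops NA by default while length() counts every element. A small sketch with made-up data (the vector y is hypothetical and uses only the single most frequent value as the "top" group):

```r
y <- c(1968, 1968, 1966, 1979, NA, 1950)  # made-up data with one missing year
yy <- sort(table(y), decreasing = TRUE)   # 1968 comes first with a count of 2

others_from_table  <- sum(yy[-1])             # counts only non-missing values
others_from_length <- length(y) - sum(yy[1])  # also counts the NA

others_from_table   # 3
others_from_length  # 4
```

If the data may contain NA, replacing length(y) with sum(!is.na(y)) should make the two versions agree again.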

Rich Scriven answered Nov 15 '22 01:11
