Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Group values that occur only once into an OTHER field

Tags:

r

ggplot2

I've got a collection of Roman Coins that I'm trying to represent using ggplot in R.

There are 25 different denominations in the data, but I'd like to merge together all denominations that only have one occurrence into an 'OTHER' field, so that the graphs are easier to read.

  Medium Method Denom            Date                  Era
1 Silver Struck Denarius         112 B.C.E.:111 B.C.E. Period V – c. 119-91 B.C.E.
2 Bronze   Cast Χαλκα μεγεθους   181 B.C.E.:174 B.C.E. Period III – c. 187-155 B.C.E.
3 Bronze Struck Litra:Half-litra            269 B.C.E. Period I – 269 - c. 222 B.C.E.
4 Bronze Struck Litra:Half-litra            269 B.C.E. Period I – 269 - c. 222 B.C.E.
5 Silver Struck Didrachm         275 B.C.E.:270 B.C.E. Period I – 269 - c. 222 B.C.E.
6 Bronze Struck Double-litra     275 B.C.E.:270 B.C.E. Period I – 269 - c. 222 B.C.E.

Using the data.frame sample above, the "Denom" column needs to have every value that occurs only one time grouped together and displayed as "other." I think I should do this in the data before starting the plot. Please point me in the right direction.

Here's the code I'm using for ggplot if that helps.

ggplot(data=longbadian, aes(x=Era, fill=Denom)) 
+    geom_bar(aes(x=Era2), data = longbadian, stat="bin") 
+ theme(axis.text.x =  element_text(angle=75, hjust=1), 
                       legend.title=element_blank()) 
+ xlab("Sydenham Periods") 
+ ylab("Coins by Denomination")

Here's an example graph:

enter image description here

like image 525
David Beales Avatar asked Jun 01 '15 15:06

David Beales


People also ask

How do I list only once in SQL?

You can use distinct keyword to select all values from a table only once if they are repeated. select distinct yourColumnName from yourTableName; To understand the above syntax, let us create a table. The query to create a table is as follows.

Can you GROUP BY only one column in SQL?

Group By One Column We generally use the GROUP BY clause with the SELECT statement, WHERE clause, and ORDER BY clauses. The group by single column places all the records (rows) having the same value of only a particular column into one group.

What does GROUP BY 1 do in SQL?

Group by is one of the most frequently used SQL clauses. It allows you to collapse a field into its distinct values. This clause is most often used with aggregations to show one value per grouped field or combination of fields.

How do I GROUP BY a particular column in SQL?

Syntax: SELECT column1, function_name(column2) FROM table_name WHERE condition GROUP BY column1, column2 HAVING condition ORDER BY column1, column2; function_name: Name of the function used for example, SUM() , AVG(). table_name: Name of the table. condition: Condition used.


1 Answers

Something like this:

## example data
dd <- data.frame(DENOM=rep(LETTERS[1:7],c(10,5,4,rep(1,4))))
tt <- table(dd$DENOM)                  ## count occurrences
singletons <- names(tt)[tt==1]         ## find singletons
tmpc <- as.character(dd$DENOM)         ## convert from factor to char
tmpc[tmpc %in% singletons] <- "OTHER"  ## replace values
dd$DENOM <- factor(tmpc)               ## convert back to factor

The only problem with this solution is that it will mess up any pre-existing non-default ordering in your DENOM factor.

like image 80
Ben Bolker Avatar answered Oct 15 '22 01:10

Ben Bolker