Conditional labeling in ggplot2 using geom_text and subsetting

Tags: r, ggplot2

I'm new to the community. I'm posting this after several attempts of my own and after trying the solutions I found online, none of which resolved the problem.

The following code

dat<-read.csv("Harvard tutorial/Rgraphics/dataSets/EconomistData.csv")
g <- ggplot(dat, aes(dat$CPI, dat$HDI))

g1 <- g + theme_bw() + geom_smooth(method = "lm", formula = y ~log(x), se = FALSE, color = "Red", linetype = 1, weight = 3) +
  geom_point(aes(color = Region), size = 4, fill = 4, alpha = 1/2, shape = 1) +
  scale_x_continuous(name = "Corruption Perception Index", breaks = NULL) +
  scale_y_continuous(name = "Human Development Index") +
  scale_color_manual(name = "Region of the world", values = c("#24576D", "#099DD7", "#28AADC", "#248E84", "#F2583F", "#96503F")) + 
  theme(axis.text.x = element_text(angle = 90, size = 15))

This gives me the following result:

[image: scatter plot of HDI against CPI, colored by region, with a fitted log curve]

However, when I add the following lines to the code

pointsToLabel <- c("Russia", "Venezuela", "Iraq", "Myanmar", "Sudan",
                   "Afghanistan", "Congo", "Greece", "Argentina", "Brazil",
                   "India", "Italy", "China", "South Africa", "Spane",
                   "Botswana", "Cape Verde", "Bhutan", "Rwanda", "France",
                   "United States", "Germany", "Britain", "Barbados", "Norway", "Japan",
                   "New Zealand", "Singapore")

g2 <- g1 + geom_text(aes(dat$CPI, dat$HDI, label = dat$Country), data = subset(x = dat,subset =  Country %in% pointsToLabel))

I get the following error

Error: Aesthetics must be either length 1 or the same as the data (27): x, y, label

Can someone help me with this?

The data comes from the Harvard tutorial on ggplot2.

For reference, the structure of the dataset is as follows:

'data.frame':   173 obs. of  6 variables:
 $ X       : int  1 2 3 4 5 6 7 8 9 10 ...
 $ Country : Factor w/ 173 levels "Afghanistan",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ HDI.Rank: int  172 70 96 148 45 86 2 19 91 53 ...
 $ HDI     : num  0.398 0.739 0.698 0.486 0.797 0.716 0.929 0.885 0.7 0.771 ...
 $ CPI     : num  1.5 3.1 2.9 2 3 2.6 8.8 7.8 2.4 7.3 ...
 $ Region  : Factor w/ 6 levels "Americas","Asia Pacific",..: 2 3 5 6 1 3 2 4 3 1 ...

asked Apr 11 '16 18:04 by mvikred


1 Answer

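First of all, don't use $ in aes(). dat$CPI, dat$HDI and dat$Country always refer to the full 173-row columns, whereas the subset you pass to geom_text() has only 27 rows, and that length mismatch is what the error reports. With bare column names, each layer looks the variables up in its own data. The same applies to the initial call, which is better written as ggplot(dat, aes(CPI, HDI)). Then, to fix the subsetting part, try: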

g2 <- g1 + 
  geom_text(aes(CPI, HDI, label = Country), data = dat[dat$Country %in% pointsToLabel,])

This should give your desired plot, with only the selected countries labeled:

[image: the plot with the selected countries labeled]
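To see the mechanism in isolation, here is a minimal sketch on a made-up data frame. Everything in it (the `toy` data, its values, and the `wanted` vector) is hypothetical and only stands in for EconomistData.csv. It also shows that `%in%` silently drops names that match no row, which may be why the question lists 28 names but the error reports 27 rows (the list contains "Spane").

```r
library(ggplot2)

# Hypothetical toy data standing in for EconomistData.csv (values invented)
toy <- data.frame(
  Country = c("Norway", "Spain", "India", "Sudan", "Brazil"),
  CPI     = c(9.0, 6.2, 3.1, 1.6, 3.8),
  HDI     = c(0.94, 0.88, 0.55, 0.41, 0.72)
)

# Deliberate misspelling, mirroring "Spane" in the question
wanted <- c("Norway", "Spane", "Sudan")

# Names that match no row are dropped silently by %in%; list them explicitly
unmatched <- setdiff(wanted, toy$Country)

# Rows to label: only the names that actually occur in the data
labelled <- toy[toy$Country %in% wanted, ]

# Bare column names are resolved in each layer's own data, so the
# full point layer and the smaller text layer can coexist.
p <- ggplot(toy, aes(CPI, HDI)) +
  geom_point() +
  geom_text(aes(label = Country), data = labelled, vjust = -1)

# toy$CPI, by contrast, always has one value per row of toy,
# regardless of which data a layer was given.
length(toy$CPI)
nrow(labelled)
```

Printing `p` draws the plot. Checking `unmatched` before plotting is a cheap way to catch misspelled labels.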

answered Sep 18 '22 22:09 by erc