 

r - ggplot2 - highlighting selected points and strange behavior

Tags:

r

ggplot2

I want to highlight selected points and ran into some strange behaviour. First, some dummy data:

library(ggplot2)
a <- 1:50
b <- rnorm(50)
mydata <- data.frame(a = a, b = b)
ggplot(mydata, aes(x = a, y = b)) + geom_point()

This works correctly. Now, to highlight some points, I add another geom_point layer:

ggplot(mydata[20:40,],aes(x=a,y=b)) + 
    geom_point() + 
    geom_point(aes(x=a[c(10,12,13)],y=b[c(10,12,13)]),colour="red")

Note that I am displaying only a limited range of the data ([20:40]). Now comes the strange behavior:

ggplot(mydata[10:40,],aes(x=a,y=b)) + 
    geom_point() + 
    geom_point(aes(x=a[c(10,12,13)],y=b[c(10,12,13)]),colour="red")

After changing the size of the selected range, I get an error (roughly translated from German): Error...: Arguments implying different number of rows. Strangely, whether it occurs depends on the selected range: [23:40] works, [22:40] does not.


The error in English is:

Error in data.frame(x = c(19L, 21L, 22L), y = c(0.28198, -0.6215,  : 
  arguments imply differing number of rows: 3, 31
lambu0815 asked Jul 13 '12 09:07




3 Answers

If the data differs between layers, you need to supply the new data to each layer that uses it.

You do this with the data=... argument for each geom that needs different data:

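library(ggplot2)
set.seed(1)
mydata <- data.frame(a = 1:50, b = rnorm(50))
ggplot(mydata, aes(x = a, y = b)) +
  geom_point(colour = "blue") +
  # Second layer gets its own data: only the rows to highlight
  geom_point(data = mydata[10:13, ], aes(x = a, y = b), colour = "red", size = 5)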
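As for why the error in the question comes and goes with the range: judging by the message, the three highlighted values and the layer's rows end up in one data.frame() call, and data.frame() recycles a shorter column only a whole number of times. 20:40 has 21 rows and 23:40 has 18, both multiples of 3, while 22:40 has 19 and 10:40 has 31. The sketch below reproduces that rule in base R and then applies the data= approach to the subset from the question. The names sub and hl are made up for illustration, and newer ggplot2 releases will probably report the length mismatch with a different message.

```r
library(ggplot2)

set.seed(1)
mydata <- data.frame(a = 1:50, b = rnorm(50))

# data.frame() recycles a shorter column only a whole number of times.
ok <- data.frame(x = 1:3, y = seq_len(21))  # 21 = 3 * 7, so x is recycled
nrow(ok)

# 31 is not a multiple of 3, so the same kind of call fails.
msg <- tryCatch(
  {
    data.frame(x = 1:3, y = seq_len(31))
    "no error"
  },
  error = function(e) conditionMessage(e)
)
msg

# The data= approach applied to the subset from the question:
# take the subset first, then pick rows 10, 12 and 13 of that subset.
sub <- mydata[10:40, ]        # hypothetical name for the displayed range
hl  <- sub[c(10, 12, 13), ]   # hypothetical name for the highlighted rows

p <- ggplot(sub, aes(x = a, y = b)) +
  geom_point() +
  geom_point(data = hl, colour = "red")
print(p)
```

Because hl is a real data frame with exactly three rows, the second layer no longer depends on how many rows the first one has. Note that even when the original code "worked", the three red points were silently recycled to fill the layer rather than drawn once.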

Andrie answered Nov 11 '22 14:11



Another option is to map both attributes, colour and size, to the highlight condition inside geom_point, and then set their values manually with scale_colour_manual and scale_size_manual respectively.

library(ggplot2)
set.seed(1)
mydata <- data.frame(a = 1:50, b = rnorm(50))
ggplot(mydata) +
  # The condition yields FALSE/TRUE; the manual scales assign values in that order
  geom_point(aes(x = a, y = b, colour = a > 9 & a < 14, size = a > 9 & a < 14)) +
  scale_colour_manual(values = c("blue", "red")) +
  scale_size_manual(values = c(1, 4)) +
  theme(legend.position = "none")
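The condition a > 9 & a < 14 picks a contiguous range. For scattered points such as rows 10, 12 and 13 from the question, the same idea might be sketched with a flag column built using %in%. The column name highlight is an assumption made for this example, and naming the scale values by "FALSE" and "TRUE" is just one way to make the mapping explicit.

```r
library(ggplot2)

set.seed(1)
mydata <- data.frame(a = 1:50, b = rnorm(50))

# Flag the rows to highlight; 'highlight' is a hypothetical column name.
mydata$highlight <- mydata$a %in% c(10, 12, 13)

p <- ggplot(mydata, aes(x = a, y = b, colour = highlight, size = highlight)) +
  geom_point() +
  # Named values tie each colour and size to a specific flag value
  scale_colour_manual(values = c("FALSE" = "blue", "TRUE" = "red")) +
  scale_size_manual(values = c("FALSE" = 1, "TRUE" = 4)) +
  theme(legend.position = "none")
print(p)
```

ggplot2 may warn that mapping size to a discrete variable is not advised; the plot is still drawn.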

mpalanco answered Nov 11 '22 14:11



Another solution with gghighlight:

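a <- 1:50
b <- rnorm(50)
mydata <- data.frame(a = a, b = b, type = sample(letters, 50, replace = TRUE))

library(ggplot2)
library(gghighlight)
# This answer originally called gghighlight_point(), which is deprecated in
# newer gghighlight releases; they add gghighlight() as a layer instead:
ggplot(mydata, aes(x = a, y = b)) +
  geom_point(colour = "red") +
  gghighlight(a <= 14 & a >= 10 & b >= 0, label_key = type)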
maxatSOflow answered Nov 11 '22 15:11
