ggplot: plotting layers only if certain criteria are met

Tags: r, ggplot2

Is there a method of filtering within ggplot itself? That is, say I want to do this:

p <- ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length)) +
     geom_point(size = 4, shape = 4) +
     # do this only for data that meets some condition, e.g. Species == "setosa"
     geom_point(size = 1, shape = 5)

I know there are hacks I can use, like setting size = 0 if Species != "setosa" or resetting the data as shown below, but those are all hacks.

library(dplyr)  # provides %>% and filter()

p <- ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length)) +
     geom_point(size = 4, shape = 4) +
     geom_point(data = iris %>% filter(Species == "setosa"), colour = "red") +
     geom_point(data = iris %>% filter(Species == "versicolor"), shape = 5)

Basically, I have a chart where certain things should be displayed only if certain criteria are met. Right now I'm using the hack above to accomplish this, and it's keeping me up at night, my soul slowly dying from the mess I've created. Needless to say, any help would be very much appreciated!

Edit

I'm afraid my example may have been too simplistic. Basically, given ggplot(data = ...), how do I add these layers, all using the data bound to the ggplot obj:

  1. Plot curves
  2. Plot dots on points that meet criteria #1. These dots would be in red. Points that don't meet the criteria don't get a point drawn (Not a hack like point size set to zero, or alpha set to 0)
  3. Add labels to points that meet criteria #2.

Criteria #1 and #2 could be anything, e.g. label only outlier points, or draw in red only those points which are outside a specific range.

I don't want to

  1. bind a new dataset à la ggplot(data=subset(iris, Species=="setosa"),...) or ggplot(data=filter(iris,Species=="setosa"),...).
  2. use a scaling hack (like setting scale=manual so that whatever doesn't meet the criteria gets a NULL/NA, etc.). For example, if I had 1000 points and only 1 point met a given criterion, I want it to apply its plotting logic only to that one point instead of looking at, and styling, all 1000 points.
adilapapaya asked Mar 04 '16 21:03


2 Answers

Apparently layers now accept a function as the data argument, so you could use that:

pick <- function(condition){
  function(d) d %>% filter_(condition)
}

ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length)) +
  geom_point(size = 4, shape = 4) +
  geom_point(data = pick(~Species == "setosa"), colour = "red") +
  geom_point(data = pick(~Species == "versicolor"), shape = 5)
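
Note that filter_() has since been deprecated in dplyr. A minimal sketch of the same helper without it, assuming the rlang package is installed (it comes with dplyr). This variant of pick() is hypothetical and takes a bare expression rather than a formula, so the calls differ slightly from the version above:

```r
library(ggplot2)
library(dplyr)

# Hypothetical variant of pick(): capture the condition once, then return a
# function that ggplot2 calls with the plot's data when building the layer.
pick <- function(condition) {
  condition <- rlang::enquo(condition)
  function(d) filter(d, !!condition)
}

p <- ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length)) +
  geom_point(size = 4, shape = 4) +
  geom_point(data = pick(Species == "setosa"), colour = "red") +
  geom_point(data = pick(Species == "versicolor"), shape = 5)
```

Each layer still inherits the data bound to the plot; only the rows returned by the function are drawn for that layer.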
baptiste answered Oct 09 '22 08:10


You can filter data with an anonymous function using the ~ formula notation:

library(ggplot2)
library(dplyr)

ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length)) +
    geom_point(size = 4, shape = 4) +
    geom_point(data = ~filter(.x, Species == "setosa"), colour = "red") +
    geom_point(data = ~filter(.x, Species == "versicolor"), shape = 5)

Created on 2021-11-15 by the reprex package (v2.0.0)
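
For the three-layer case in the question's edit (curves, red dots for criterion #1, labels for criterion #2), the same idea should carry over: each layer can take a plain function as data, and ggplot2 calls it with the data bound to the plot. A rough sketch in base R, where the helper names meets_1 and meets_2 and both thresholds are made up purely for illustration:

```r
library(ggplot2)

# Hypothetical criteria; substitute whatever test is actually needed.
meets_1 <- function(d) d[d$Sepal.Width > 3.8, ]    # criterion #1
meets_2 <- function(d) d[d$Sepal.Length > 7.5, ]   # criterion #2

p <- ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length)) +
  # 1. curves, drawn from all rows of the plot data
  geom_line(aes(group = Species)) +
  # 2. red dots only on rows that meet criterion #1
  geom_point(data = meets_1, colour = "red") +
  # 3. labels only on rows that meet criterion #2
  geom_text(data = meets_2, aes(label = Species), vjust = -1)
```

Rows that fail a criterion are never passed to that layer, so nothing is drawn or styled for them and no size, alpha, or scale trick is involved.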

JohannesNE answered Oct 09 '22 06:10