NA's are being plotted in boxplot ggplot2

I'm trying to plot a very simple boxplot in ggplot2: species richness vs. land-use class. However, I have 2 NAs in my data. For some strange reason they're being plotted, even though R recognizes them as NA. Any suggestions for removing them?

The code I'm using is:

ggplot(data, aes(x = luse, y = rich)) +
  geom_boxplot(mapping = NULL, data = NULL, stat = "boxplot", position = "dodge",
               outlier.colour = "red", outlier.shape = 16, outlier.size = 2,
               notch = FALSE, notchwidth = 0.5) +
  scale_x_discrete("luse", drop = TRUE) +
  geom_smooth(method = "loess", aes(group = 1))

However, the graph includes the 2 NAs for luse. Unfortunately I cannot post images, but imagine that an extra box labelled NA is being added to my graph.
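Since no data or image was posted, here is a minimal sketch that should reproduce the symptom. The data frame below is made up for illustration (only the column names luse and rich come from the question), and the plot is stripped down to geom_boxplot() alone.

```r
library(ggplot2)

# Hypothetical stand-in for the asker's data: two rows have no land-use class.
data <- data.frame(
  luse = c("forest", "forest", "crop", "crop", "urban", "urban", NA, NA),
  rich = c(12, 15, 7, 9, 4, 5, 8, 10)
)

# R does see the two missing values as NA ...
sum(is.na(data$luse))

# ... but the rows are still passed to the plot, and the discrete x axis
# gives them a category of their own, labelled NA.
p <- ggplot(data, aes(x = luse, y = rich)) +
  geom_boxplot()
print(p)
```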

R. Solar asked Jun 17 '13 11:06




2 Answers

You can use the subset() function in the first line of your code to drop the rows where luse is NA before they reach the plot:

ggplot(data=subset(data, !is.na(luse)), aes(x=luse, y=rich))+

as suggested in: Eliminating NAs from a ggplot
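As a rough end-to-end sketch of this approach: the data frame here is invented for illustration (only the column names luse and rich come from the question), and the name clean is just a placeholder for the subsetted data.

```r
library(ggplot2)

# Hypothetical example data with two missing land-use classes.
data <- data.frame(
  luse = c("forest", "forest", "crop", "crop", "urban", "urban", NA, NA),
  rich = c(12, 15, 7, 9, 4, 5, 8, 10)
)

# Keep only the rows where luse is not NA.
clean <- subset(data, !is.na(luse))

# Plot the cleaned data; the x axis now has only the real land-use classes.
p <- ggplot(data = clean, aes(x = luse, y = rich)) +
  geom_boxplot(outlier.colour = "red", outlier.shape = 16, outlier.size = 2)
print(p)
```

Note that is.na() only catches true missing values. If the file was read so that the column holds the literal string "NA", those rows would pass the test and would need to be converted to real NAs first.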

בנימן הגלילי answered Sep 23 '22 12:09



Here is a formal answer, based on the comments above, that combines !is.na() with filter() from dplyr (part of the tidyverse). For a basic tidyverse operation such as filtering out NAs, you can do it right in the ggplot() call, as suggested, to avoid creating a new data frame:

ggplot(data %>% filter(!is.na(luse)), aes(x = luse, y = rich)) + geom_boxplot()
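The one-liner needs both dplyr (for %>% and filter()) and ggplot2 loaded. A self-contained sketch follows, using made-up data (only the column names luse and rich come from the question). It also shows the equivalent form where the filtered data is piped into ggplot(), which some people find easier to read.

```r
library(dplyr)
library(ggplot2)

# Hypothetical example data with two missing land-use classes.
data <- data.frame(
  luse = c("forest", "forest", "crop", "crop", "urban", "urban", NA, NA),
  rich = c(12, 15, 7, 9, 4, 5, 8, 10)
)

# Filter inside the ggplot() call, as in the answer.
p1 <- ggplot(data %>% filter(!is.na(luse)), aes(x = luse, y = rich)) +
  geom_boxplot()

# Equivalent: pipe the filtered rows into ggplot().
# %>% binds more tightly than +, so the layers are added to the finished plot object.
p2 <- data %>%
  filter(!is.na(luse)) %>%
  ggplot(aes(x = luse, y = rich)) +
  geom_boxplot()

print(p1)
```

Either way, the original data object is left unchanged, so the NA rows remain available for other analyses.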

user29609 answered Sep 22 '22 12:09
