Remove unused factor levels from a ggplot bar plot

Question

I want to do the opposite of this question, and more or less the opposite of this other question, though that one is about legends rather than the plot itself.

The other SO questions seem to be asking about how to keep unused factor levels. I'd actually like mine removed. I have several name variables and several columns (wide format) of variable attributes that I'm using to create numerous bar plots. Here's a reproducible example:

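library(ggplot2)
df <- data.frame(name = c("A", "B", "C"), var1 = c(1, NA, 2), var2 = c(3, 4, 5))
# stat = "identity" plots the y values as given; current ggplot2 rejects geom_bar() with a y aesthetic otherwise
ggplot(df, aes(x = name, y = var1)) + geom_bar(stat = "identity")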

I get this:

[Image: bar plot of var1 by name, with an empty space where B would be]

I'd like only the names that have a corresponding var_n value to show up in my bar plot (that is, there would be no empty space for B).

Reusing the base plot code will be quite easy if I can simply change my output file name and the y=var bit. I'd rather not have to subset my data frame and call droplevels on the result for each plot, if possible!

Update based on the na.omit() suggestion

Consider a revised data set:

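library(ggplot2)
df <- data.frame(name = c("A", "B", "C"), var1 = c(1, NA, 2), var2 = c(3, 4, 5), var3 = c(NA, 6, 7))
# stat = "identity" plots the y values as given; current ggplot2 rejects geom_bar() with a y aesthetic otherwise
ggplot(df, aes(x = name, y = var1)) + geom_bar(stat = "identity")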

I need to use na.omit() for plotting var1 because there's an NA present. But since na.omit() keeps only rows that have values in all columns, the plot removes A as well, since it has an NA in var3. This is closer to my real data: I have 15 total responses with NAs peppered about. I only want to remove factor levels that lack a value for the currently plotted y vector, not those that have an NA in any column of the data frame.
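To make the difference concrete, here is a small base R check on the revised data set. It is only a sketch; the names all_cols and one_col are made up for illustration.

```r
df <- data.frame(name = c("A", "B", "C"),
                 var1 = c(1, NA, 2),
                 var2 = c(3, 4, 5),
                 var3 = c(NA, 6, 7))

# na.omit() on the whole data frame drops every row with an NA in ANY column:
# A goes because of var3, B goes because of var1
all_cols <- na.omit(df)
as.character(all_cols$name)

# Filtering on the plotted column alone drops only B
one_col <- df[!is.na(df$var1), ]
as.character(one_col$name)
```

Only C survives the first call, while the second keeps A and C, which is the behaviour wanted for a plot of var1.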

Gavin Simpson · Accepted Answer

One easy option is to use na.omit() on your data frame df to remove the rows that contain NA:

ggplot(na.omit(df), aes(x = name, y = var1)) + geom_bar(stat = "identity")

Given your update, the following

ggplot(df[!is.na(df$var1), ], aes(x = name, y = var1)) + geom_bar(stat = "identity")

works OK and only considers NA in var1. Given that you are only plotting name and var1, you could instead apply na.omit() to a data frame containing only those two variables:

ggplot(na.omit(df[, c("name", "var1")]), aes(x = name, y = var1)) + geom_bar(stat = "identity")
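Since the question asks to reuse the same plot code for each variable, the per-column filtering could be wrapped in a small function. This is only a sketch: plot_var() is a hypothetical helper name, not part of ggplot2, and it assumes the x variable is always called name. It uses geom_col(), which plots the y values as given.

```r
library(ggplot2)

# plot_var() is a hypothetical helper for illustration, not a ggplot2 function
plot_var <- function(data, var) {
  # keep only the rows where the plotted column has a value
  keep <- data[!is.na(data[[var]]), , drop = FALSE]
  ggplot(keep, aes(x = name, y = .data[[var]])) +
    geom_col() +
    labs(y = var)
}

df <- data.frame(name = c("A", "B", "C"),
                 var1 = c(1, NA, 2),
                 var2 = c(3, 4, 5),
                 var3 = c(NA, 6, 7))

p1 <- plot_var(df, "var1")  # bars for A and C
p3 <- plot_var(df, "var3")  # bars for B and C
```

Rows are dropped only when the plotted column is NA, so the var3 plot still shows B even though B is missing from the var1 plot.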

Tilo Wiklund · Answer

Notice that, when plotting, you're using only two columns of your data frame. Rather than passing the whole data frame, you could take the relevant columns, df[, c("name", "var1")], apply na.omit() to remove the unwanted rows (as Gavin Simpson suggests), na.omit(df[, c("name", "var1")]), and then plot the result.

My R/ggplot is quite rusty, and I realise that there are probably cleaner ways to achieve this.
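The two steps described in this answer (select the two columns, then na.omit()) could be repeated for every variable in a loop, changing only the column and the output file name. The sketch below is an assumption-laden illustration: the variable list, the use of tempdir() as output directory, and the image size are all made up.

```r
library(ggplot2)

df <- data.frame(name = c("A", "B", "C"),
                 var1 = c(1, NA, 2),
                 var2 = c(3, 4, 5),
                 var3 = c(NA, 6, 7))

vars <- c("var1", "var2", "var3")
out_dir <- tempdir()  # assumption: replace with a real output directory

plots <- lapply(vars, function(v) {
  # select the two relevant columns, then drop rows with an NA in either
  sub <- na.omit(df[, c("name", v)])
  p <- ggplot(sub, aes(x = name, y = .data[[v]])) + geom_col()
  # one file per variable, named after the column
  ggsave(file.path(out_dir, paste0(v, ".png")), plot = p, width = 4, height = 3)
  p
})
names(plots) <- vars
```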

John-Henry · Answer

A lot of time has passed since this question was originally asked. In 2021 if I was handling this I would use something like:

library(ggplot2)
library(tidyr)
df <- data.frame(name=c("A","B","C"), var1=c(1,NA,2),var2=c(3,4,5))

df %>% 
  drop_na(var1) %>% 
  ggplot(aes(name, var1)) +
  geom_col()

Created on 2021-12-03 by the reprex package (v2.0.1)
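If all the variables are wanted at once, a related approach is to reshape the wide data to long format and facet. This is a sketch of an alternative, not what the answer above does; the column names variable and value are arbitrary choices. pivot_longer() can drop the NA values while reshaping, and scales = "free_x" lets each panel show only the names that have a value for that variable.

```r
library(ggplot2)
library(tidyr)

df <- data.frame(name = c("A", "B", "C"),
                 var1 = c(1, NA, 2),
                 var2 = c(3, 4, 5),
                 var3 = c(NA, 6, 7))

# one row per (name, variable) pair, with NA values dropped during the reshape
long <- pivot_longer(df, cols = -name,
                     names_to = "variable", values_to = "value",
                     values_drop_na = TRUE)

p <- ggplot(long, aes(name, value)) +
  geom_col() +
  facet_wrap(~ variable, scales = "free_x")  # each panel gets its own x axis
```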

Tags: plot, r, ggplot2, factors

Hendy
