
Coloring ggplot2 axis tick labels based on data displayed at axis tick positions

Tags: r, ggplot2

I have this dataset:

top_bot_5_both <- structure(list(
  name = structure(c(20L, 19L, 18L, 17L, 16L, 15L, 14L, 13L, 12L, 
                     11L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L), 
                   .Label = c("Michele Bachmann", "Donald Trump", "Ted Cruz", 
                              "Newt Gingrich", "Rick Santorum", "Terry McAuliffe", 
                              "Nancy Pelosi", "Debbie Wasserman Schultz", 
                              "Tammy Baldwin", "Joe Biden", "Rand Paul", "Jeb Bush", 
                              "John Kasich", "Barack Obama", "Bill Clinton", 
                              "Hillary Clinton", "Nathan Deal", "Tim Kaine", 
                              "Rob Portman", "Sherrod Brown"), 
                   class = "factor"),
  Party = c("Democratic", "Republican", "Democratic", "Republican", "Democratic", 
            "Democratic", "Democratic", "Republican", "Republican", "Republican", 
            "Republican", "Republican", "Republican", "Republican", "Republican", 
            "Democratic", "Democratic", "Democratic", "Democratic", "Democratic"), 
  total_ratings = c(35L, 48L, 51L, 49L, 296L, 41L, 599L, 64L, 80L, 55L, 
                    61L, 472L, 123L, 82L, 61L, 31L, 35L, 48L, 33L, 75L), 
  sum = c(22, 29, 21, 18, 96, 12, 172, 16, 18, 2, -86, -525, -94, -57, 
          -42, -19, -14, -7, -4, -1), 
  score = c(0.628571428571429, 0.604166666666667, 0.411764705882353, 
            0.36734693877551, 0.324324324324324, 0.292682926829268, 
            0.287145242070117, 0.25, 0.225, 0.0363636363636364, -1.40983606557377, 
            -1.11228813559322, -0.764227642276423, -0.695121951219512, 
            -0.688524590163934, -0.612903225806452, -0.4, -0.145833333333333, 
            -0.121212121212121, -0.0133333333333333)), 
  class = c("tbl_df", "tbl", "data.frame"), 
  row.names = c(NA, -20L), 
  .Names = c("name", "Party", "total_ratings", "sum", "score"))

I want to make a column graph in which the y-axis text mirrors the fill color of the bars. Following the suggestion from this thread, I added a color column to my tibble:

library(dplyr)  # provides %>% and mutate()

top_bot_5_both <- top_bot_5_both %>%
  mutate(color = ifelse(Party == "Democratic", "#1A80C4", "#CC3D3D"))

And then graphed it as such:

library(ggplot2)

ggplot(top_bot_5_both, aes(x = name, y = score, fill = Party)) +
  geom_col(size = 1, color = "black") +
  coord_flip() +
  geom_vline(xintercept = 10.5, size = 1.5, linetype = "twodash") +
  scale_fill_manual(values = c("Democratic" = "#1A80C4", "Republican" = "#CC3D3D")) +
  theme(axis.text = element_text(size = 12),
    axis.text.y = element_text(color = top_bot_5_both$color),
    axis.title = element_text(size = 14),
    legend.position = "bottom", 
    legend.title = element_blank())

In the graph, the fill correctly identifies each politician's party, but the axis.text.y color does not. For instance, Michele Bachmann is a Republican, so her column should have red fill (which it does) and red axis text (which it does not).

I then wrapped the color vector in rev(), using axis.text.y = element_text(color = rev(top_bot_5_both$color)), which produced a graph that is closer to correct:

ggplot(top_bot_5_both, aes(x = name, y = score, fill = Party)) +
  geom_col(size = 1, color = "black") +
  coord_flip() +
  geom_vline(xintercept = 10.5, size = 1.5, linetype = "twodash") +
  scale_fill_manual(values = c("Democratic" = "#1A80C4", "Republican" = "#CC3D3D")) +
  theme(axis.text = element_text(size = 12),
    axis.text.y = element_text(color = rev(top_bot_5_both$color)),
    axis.title = element_text(size = 14),
    legend.position = "bottom", 
    legend.title = element_blank())

The top_bot_5_both$color vector is in the correct order in the tibble, but it somehow gets scrambled during the actual graphing. Maybe it has to do with top_bot_5_both$name being a factor, but I believe that is necessary in order to graph the scores in descending order.

Other things I tried that did not work:

  • Getting rid of coord_flip() and changing y to x, etc.
  • Not generating the column top_bot_5_both$color and instead putting the ifelse(...) command inside element_text(color = ...)
  • And I can't do this manually because I plan to repeat this analysis in the future and the politicians' order on the y-axis will likely change

Am I doing something wrong or is there a bug in here somewhere?

jryan14ify asked Dec 21 '17 23:12


1 Answer

It's important to realize that when you write axis.text.y = element_text(color = ...), you're not mapping data to colors using the regular ggplot2 mapping mechanism; you're manually assigning colors to specific axis labels by position. Therefore, the colors you supply need to be in exactly the order in which the labels appear on the axis.

When you inspect your data frame, you can see that the rows are not in the order in which they are plotted; the plotting order is set by the levels of the name factor. Therefore, you have to use that order for your colors as well. Because each name appears exactly once, you can do that by indexing the color column with order() applied to the factor, which sorts by level:

colors <- top_bot_5_both$color[order(top_bot_5_both$name)]
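
To see why this reordering is needed, here is a minimal sketch on a toy data frame. The data frame `toy` and its labels and colors are made up for illustration and are not part of the original data; only base R is used.

```r
# Toy data (hypothetical): rows are stored in a different order
# than the factor levels, just like top_bot_5_both.
toy <- data.frame(
  name  = factor(c("B", "C", "A"), levels = c("A", "B", "C")),
  color = c("blue", "red", "green"),
  stringsAsFactors = FALSE
)

# The axis follows the factor levels, not the row order.
levels(toy$name)

# order() on a factor sorts by the underlying level codes, so it
# returns the row indices that put the rows into level order.
idx <- order(toy$name)

# Colors rearranged so that position i matches level i.
colors_by_level <- toy$color[idx]
colors_by_level
```

With one row per name, the i-th element of `colors_by_level` belongs to the i-th factor level, which is the position ggplot2 uses for the i-th axis label.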

Now, with this color vector, we can make the plot as intended:

ggplot(top_bot_5_both, aes(x = name, y = score, fill = Party)) +
  geom_col(size = 1, color = "black") +
  coord_flip() +
  geom_vline(xintercept = 10.5, size = 1.5, linetype = "twodash") +
  scale_fill_manual(values = c("Democratic" = "#1A80C4", "Republican" = "#CC3D3D")) +
  theme(axis.text = element_text(size = 12),
        axis.text.y = element_text(color = colors),
        axis.title = element_text(size = 14),
        legend.position = "bottom", 
        legend.title = element_blank())
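
As a possible variation, the color vector could also be built by looking up each factor level by name instead of relying on order(). The sketch below is an illustration only: the helper `axis_label_colors()` and the toy data are hypothetical, and it assumes each label appears exactly once and the factor has no unused levels.

```r
# Toy data (hypothetical), stored out of level order.
toy <- data.frame(
  name  = factor(c("B", "C", "A"), levels = c("A", "B", "C")),
  Party = c("Democratic", "Republican", "Republican"),
  stringsAsFactors = FALSE
)

# Same palette as in scale_fill_manual() above.
party_colors <- c("Democratic" = "#1A80C4", "Republican" = "#CC3D3D")

# Hypothetical helper: returns one color per factor level, in level order.
# Assumes one row per label and no unused levels.
axis_label_colors <- function(labels, groups, palette) {
  # Named lookup: label -> group
  group_by_label <- setNames(as.character(groups), as.character(labels))
  # Walk the levels in axis order and translate group -> color
  unname(palette[group_by_label[levels(labels)]])
}

colors <- axis_label_colors(toy$name, toy$Party, party_colors)
colors
```

The result can be passed to element_text(color = colors) in the same way as the order()-based vector. Note that recent ggplot2 versions warn that vectorized input to element_text() is not officially supported, so either approach may behave differently in future releases.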

Claus Wilke answered Sep 29 '22 13:09