
How to count and add total number of observations (n) to ggplot (geom_point)

Tags: r, ggplot2

How do I display the total number of observations (n) in a geom_point plot? I know how to include the number by manually adding (e.g.) "n = 1000", but I want to be able to have the number of observations counted automatically for each figure and then displayed somewhere on the figure.

Most of the code I've seen online is for adding n to boxplots (see the example below). Those approaches don't seem to work for scatter plots (geom_point):

geom_text(aes(label=paste0("N = ", length(disabled)), 
x=length(unique(disabled)), y=max(table(disabled)))) +

This is the code for my figure:

ggplot(scs, aes(x=year, y=disabled, color=unemployed, size=pop)) + 
geom_point(aes(size=pop), alpha = 0.3) +
labs(x = "Year",
    y = "Disabled",
    color = "Unemployed") +
scale_size_continuous("Population size") +
theme(
    axis.title.x = element_text(margin=margin(t=10)),
    panel.background = element_rect(fill=NA),
    legend.title = element_text(size=10),
    legend.key = element_blank())

When I add the geom_text code above to my plot, it oddly changes the labeling of my size legend.

EDITED:

Thanks for the replies so far. Just to be clear, I don't want n broken down by groups. I want the total number of observations used in the figure.

I don't know how to share my data but this is the output of dput(head(scs, 20)):

> dput(head(scs, 20))
structure(list(
year = c(2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 
    2016, 2017, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013), 
county_name = c("autauga", "autauga", "autauga", "autauga", "autauga", 
    "autauga", "autauga", "autauga", "autauga", "autauga", "autauga", 
    "autauga", "barbour", "barbour", "barbour", "barbour", "barbour", 
    "barbour", "barbour", "barbour"), 
disabled = c(3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 5, 5, 5, 5, 5, 5, 6, 
    6), 
unemployed = c(4, 3, 3, 5, 10, 9, 8, 7, 6, 6, 5, 5, 6, 6, 6, 9, 
    14, 12, 12, 12), 
pop = c(55036, 55036, 55036, 55036, 55036, 55036, 55036, 55036, 55036, 
    55036, 55036, 55036, 26201, 26201, 26201, 26201, 26201, 26201, 
    26201, 26201)), 
row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", 
    "12", "25", "26", "27", "28", "29", "30", "31", "32"), 
class = "data.frame")
trinitysara asked Oct 29 '25 16:10



1 Answer

Well, assuming you mean what you say and all you want is an overall count of the number of rows in scs, then that is nrow(scs). You can use paste to add context and turn it into a string.
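One caveat worth sketching: nrow(scs) counts every row, whereas geom_point typically drops rows whose x or y value is missing (with a warning). If your data may contain NAs, something like the following could give a count closer to what is actually drawn. The helper name count_plotted and the toy data are made up for illustration, and which columns to check is an assumption you would adapt to your own plot.

```r
# Hypothetical helper: count rows with no missing values in the given columns.
count_plotted <- function(data, cols) {
  sum(complete.cases(data[, cols, drop = FALSE]))
}

# Toy data standing in for scs (values invented for illustration only)
toy <- data.frame(
  year     = c(2006, 2007, 2008, 2009),
  disabled = c(3, NA, 5, 6)
)

n_all     <- nrow(toy)                                  # every row, NAs included
n_plotted <- count_plotted(toy, c("year", "disabled"))  # rows with both x and y present
n_label   <- paste("n =", n_plotted)                    # paste() adds the space itself
```

If the data have no missing values, the two counts agree and nrow(scs) alone is enough.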

I would personally put it in the title, the subtitle, or the caption, since scatterplots don't have a natural place for it the way boxplots do. But if you want it on the plot itself, figure out the x and y coordinates and add it using annotate.

An example using your data and all of those options...

library(tidyverse)
scs <- structure(list(
  year = c(2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015,
           2016, 2017, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013),
  county_name = c("autauga", "autauga", "autauga", "autauga", "autauga",
                  "autauga", "autauga", "autauga", "autauga", "autauga", "autauga",
                  "autauga", "barbour", "barbour", "barbour", "barbour", "barbour",
                  "barbour", "barbour", "barbour"),
  disabled = c(3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 5, 5, 5, 5, 5, 5, 6,
               6),
  unemployed = c(4, 3, 3, 5, 10, 9, 8, 7, 6, 6, 5, 5, 6, 6, 6, 9,
                 14, 12, 12, 12),
  pop = c(55036, 55036, 55036, 55036, 55036, 55036, 55036, 55036, 55036,
          55036, 55036, 55036, 26201, 26201, 26201, 26201, 26201, 26201,
          26201, 26201)),
  row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11",
                "12", "25", "26", "27", "28", "29", "30", "31", "32"),
  class = "data.frame")

# Build the label once; paste() already inserts a space between its arguments
n_label <- paste("Number of observations:", nrow(scs))

ggplot(scs, aes(x = year, y = disabled, color = unemployed, size = pop)) +
  geom_point(alpha = 0.3) +
  labs(title = n_label,      # option 1: title
       subtitle = n_label,   # option 2: subtitle
       caption = n_label,    # option 3: caption
       x = "Year",
       y = "Disabled",
       color = "Unemployed") +
  scale_size_continuous("Population size") +
  theme(
    axis.title.x = element_text(margin = margin(t = 10)),
    panel.background = element_rect(fill = NA),
    legend.title = element_text(size = 10),
    legend.key = element_blank()) +
  # option 4: on the panel itself, at hand-picked data coordinates
  annotate("text",
           x = 2012.25,
           y = 4.5,
           label = n_label)

Created on 2019-07-22 by the reprex package (v0.3.0)
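If you'd rather not work out the x and y coordinates by hand for every figure, one common trick is to pass Inf as the position so the label is pinned to a corner of the panel whatever the data range is. This is only a sketch on made-up toy data; the hjust and vjust values are guesses you would tune by eye.

```r
library(ggplot2)

# Toy data standing in for scs (values invented for illustration only)
toy <- data.frame(year = 2006:2011, disabled = c(3, 3, 3, 5, 5, 6))
n_label <- paste("n =", nrow(toy))

p <- ggplot(toy, aes(x = year, y = disabled)) +
  geom_point(alpha = 0.3) +
  # x = Inf, y = Inf pins the text to the top-right corner of the panel;
  # hjust/vjust slightly above 1 nudge it inward so it isn't clipped
  annotate("text", x = Inf, y = Inf, label = n_label,
           hjust = 1.1, vjust = 1.5)
```

Since annotate takes fixed values rather than aesthetic mappings, it doesn't inherit color or size from the main ggplot() call and doesn't add anything to the legends, which is probably why it behaves better here than a geom_text layer.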

Chuck P answered Nov 01 '25 07:11
