This may be a wish-list item, I'm not sure (i.e. a geom_pie might need to be created for this to work). I saw a map today (LINK) with pie graphs on it, as seen here.
I don't want to debate the merits of a pie graph; this was more of an exercise in whether I can do this in ggplot.
I have provided a data set below (loaded from my Dropbox) that has the mapping data to make a New York State map, plus some purely fabricated data on racial percentages by county. I have included this racial makeup both merged into the main data set and as a separate data set called key. I also think Bryan Goodrich's response to me in another post (HERE) on centering county names will be helpful here.
How can we make the map above with ggplot2?
A data set and the map without the pie graphs:
    load(url("http://dl.dropbox.com/u/61803503/nycounty.RData"))
    head(ny); head(key)  # view the data set from my drop box
    library(ggplot2)
    ggplot(ny, aes(long, lat, group=group)) + geom_polygon(colour='black', fill=NA)
    # Now how can we plot a pie chart of race on each county
    # (sizing of the pie would also be controllable via a size
    # parameter like other `geom_` functions).
Thanks in advance for your ideas.
EDIT: I just saw another case at junkcharts that screams for this type of capability:
A pie chart in ggplot is a stacked bar plot drawn in polar coordinates. Use geom_bar or geom_col and set theta = "y" inside coord_polar. The borders of the pie slices can be changed with the color argument of geom_bar or geom_col.
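As a minimal sketch of the bar-plus-polar recipe described above, the following builds a single pie. The `race` data frame and its proportions are made up for illustration and are not taken from the question's data set.

```r
library(ggplot2)

# Hypothetical proportions for one county (made-up values, not from nycounty.RData).
race <- data.frame(
  race = c("white", "black", "hispanic", "asian", "other"),
  prop = c(0.55, 0.15, 0.15, 0.10, 0.05)
)

pie <- ggplot(race, aes(x = 1, y = prop, fill = race)) +
  geom_col(width = 1, colour = "black") +  # one stacked bar; colour sets the slice borders
  coord_polar(theta = "y") +               # map bar height to angle, turning the bar into a pie
  theme_void()                             # drop axes and background

# print(pie) draws the chart on the active graphics device
```

Using `x = 1` puts every slice in the same bar, so the stacked heights become the angles of a single pie.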
Three years later this is solved. I've combined a number of techniques, and thanks to @Guangchuang Yu's excellent ggtree package this can be done fairly easily. Note that as of 9/3/2015 you need to have version 1.0.18 of ggtree installed, but this version will eventually trickle down to the respective repositories.
I've used the following resources to make this (the links will give greater detail):
Here's the code:
    load(url("http://dl.dropbox.com/u/61803503/nycounty.RData"))
    head(ny); head(key)  # view the data set from my drop box

    if (!require("pacman")) install.packages("pacman")
    # pacman:: prefix so this also works right after a fresh install
    pacman::p_load(ggplot2, ggtree, dplyr, tidyr, sp, maps, pipeR, grid, XML, gtable)

    getLabelPoint <- function(county) {Polygon(county[c('long', 'lat')])@labpt}

    df <- map_data('county', 'new york')                 # NY region county data
    centroids <- by(df, df$subregion, getLabelPoint)     # Returns list
    centroids <- do.call("rbind.data.frame", centroids)  # Convert to Data Frame
    names(centroids) <- c('long', 'lat')                 # Appropriate Header

    pops <- "http://data.newsday.com/long-island/data/census/county-population-estimates-2012/" %>%
        readHTMLTable(which=1) %>%
        tbl_df() %>%
        select(1:2) %>%
        setNames(c("region", "population")) %>%
        mutate(
            population = {as.numeric(gsub("\\D", "", population))},
            region = tolower(gsub("\\s+[Cc]ounty|\\.", "", region)),
            #weight = ((1 - (1/(1 + exp(population/sum(population)))))/11)
            weight = exp(population/sum(population)),
            weight = sqrt(weight/sum(weight))/3
        )

    race_data_long <- add_rownames(centroids, "region") %>>%
        left_join({distinct(select(ny, region:other))}) %>>%
        left_join(pops) %>>%
        (~ race_data) %>>%
        gather(race, prop, white:other) %>%
        split(., .$region)

    pies <- setNames(lapply(1:length(race_data_long), function(i){
        ggplot(race_data_long[[i]], aes(x=1, prop, fill=race)) +
            geom_bar(stat="identity", width=1) +
            coord_polar(theta="y") +
            theme_tree() +
            xlab(NULL) +
            ylab(NULL) +
            theme_transparent() +
            theme(plot.margin=unit(c(0,0,0,0),"mm"))
    }), names(race_data_long))

    e1 <- ggplot(race_data_long[[1]], aes(x=1, prop, fill=race)) +
        geom_bar(stat="identity", width=1) +
        coord_polar(theta="y")

    leg1 <- gtable_filter(ggplot_gtable(ggplot_build(e1)), "guide-box")

    p <- ggplot(ny, aes(long, lat, group=group)) +
        geom_polygon(colour='black', fill=NA) +
        theme_bw() +
        annotation_custom(grob = leg1, xmin = -77.5, xmax = -78.5, ymin = 44, ymax = 45)

    n <- length(pies)

    for (i in 1:n) {
        nms <- names(pies)[i]
        dat <- race_data[which(race_data$region == nms)[1], ]
        p <- subview(p, pies[[i]], x=unlist(dat[["long"]])[1], y=unlist(dat[["lat"]])[1], dat[["weight"]], dat[["weight"]])
    }

    print(p)
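If `subview` is not available in the ggtree version you have installed, a similar effect can probably be had with plain ggplot2 by converting each pie to a grob and placing it with `annotation_custom`. This is a swapped-in technique, not what the code above does. The sketch below uses made-up centroids and proportions for three hypothetical regions; `pie_data`, `make_pie`, and `radius` are illustrative names, not part of the original answer.

```r
library(ggplot2)

# Made-up centroids and proportions for three hypothetical regions.
pie_data <- data.frame(
  region = rep(c("a", "b", "c"), each = 3),
  long   = rep(c(-78, -76, -74), each = 3),
  lat    = rep(c(42.5, 43.5, 42), each = 3),
  race   = rep(c("white", "black", "other"), times = 3),
  prop   = c(0.6, 0.3, 0.1, 0.5, 0.2, 0.3, 0.7, 0.1, 0.2)
)

# Build one legend-free pie from the rows of a single region.
make_pie <- function(d) {
  ggplot(d, aes(x = 1, y = prop, fill = race)) +
    geom_col(width = 1) +
    coord_polar(theta = "y") +
    theme_void() +
    theme(legend.position = "none")
}

radius <- 0.4  # half-width of each pie's bounding box, in degrees (assumed value)

# One annotation layer per region, centred on that region's centroid.
pie_layers <- lapply(split(pie_data, pie_data$region), function(d) {
  annotation_custom(
    grob = ggplotGrob(make_pie(d)),
    xmin = d$long[1] - radius, xmax = d$long[1] + radius,
    ymin = d$lat[1] - radius,  ymax = d$lat[1] + radius
  )
})

# Stand-in for the county map; in practice this would be the geom_polygon plot.
base <- ggplot() + xlim(-80, -71.5) + ylim(40.5, 45.1) + theme_bw()
p <- base + unname(pie_layers)
```

Note that `annotation_custom` only works with Cartesian coordinates, so this approach would not carry over directly to a map drawn with `coord_map`.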
This functionality should be in ggplot, and I think it is coming to ggplot soonish, but for now it is available in base plots. I thought I would post this just for comparison's sake.
    load(url("http://dl.dropbox.com/u/61803503/nycounty.RData"))
    library(plotrix)
    e=10^-5
    myglyff=function(gi) {
      floating.pie(mean(gi$long), mean(gi$lat),
                   x=c(gi[1,"white"]+e, gi[1,"black"]+e, gi[1,"hispanic"]+e, gi[1,"asian"]+e, gi[1,"other"]+e),
                   radius=.1) #insert size variable here
    }

    g1=ny[which(ny$group==1),]
    plot(g1$long, g1$lat, type='l', xlim=c(-80,-71.5), ylim=c(40.5,45.1))
    myglyff(g1)

    for(i in 2:62) {
      gi=ny[which(ny$group==i),]
      lines(gi$long,gi$lat)
      myglyff(gi)
    }
Also, there may be (probably are) more elegant ways of doing this in base graphics.
As you can see, there are quite a few problems with this that still need to be solved. The counties have no fill color. The pie charts tend to be too small or to overlap. The lat and long are plotted without a projection, so the sizes of the counties are distorted.
In any event, I am interested in what others can come up with.
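One possible way to fill in the "insert size variable here" slot is to scale each pie's radius so that its area is proportional to a size variable such as population. This is only a sketch: the `pie_radius` helper, the `max_radius` value, and the population figures are made up for illustration.

```r
# Hypothetical helper: radius such that pie AREA is proportional to `size`,
# with the largest pie getting radius `max_radius` (in plot units).
pie_radius <- function(size, max_radius = 0.3) {
  max_radius * sqrt(size / max(size))
}

# Made-up populations for three counties.
pop <- c(1000000, 250000, 40000)
radii <- pie_radius(pop)
```

Each value in `radii` could then be passed as the `radius` argument of `floating.pie`. Capping the largest radius helps limit overlap, although small counties may still end up with pies that are hard to read.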