
Plotting barchart on States map in R

Tags: dictionary, plot, r

The following is example data on US states, taken from the state.x77 dataset in R:

mydf = structure(list(usa_state = structure(1:5, .Label = c("Alabama", 
"Alaska", "Arizona", "Arkansas", "California"), class = "factor"), 
    Life.Exp = c(69.05, 69.31, 70.55, 70.66, 71.71), HS.Grad = c(41.3, 
    66.7, 58.1, 39.9, 62.6)), .Names = c("usa_state", "Life.Exp", 
"HS.Grad"), class = "data.frame", row.names = c(NA, -5L))

> mydf
   usa_state Life.Exp HS.Grad
1    Alabama    69.05    41.3
2     Alaska    69.31    66.7
3    Arizona    70.55    58.1
4   Arkansas    70.66    39.9
5 California    71.71    62.6
> 

I want to plot it on a map of the US states. I can plot the map using the following code:

all_states <- map_data("state")
ggplot() + geom_polygon( data=all_states, aes(x=long, y=lat, group = group),colour="gray", fill="white" )


But I am not able to plot bar charts over the map. Thanks for your help.

asked Nov 11 '22 05:11 by rnso


1 Answer

I drew on two great sources to answer this:

  • http://www.r-bloggers.com/us-state-maps-using-map_data/. See especially the gist https://gist.github.com/4252133 and the comments at the end from @amandamasonsingh (although the gist's demo prison dataset is not available)
  • ggplot/mapping US counties — problems with visualization shapes in R

SOLUTION

mydf <- structure(list(usa_state = 
                         structure(1:5,
                                   .Label = c("Alabama", "Alaska", "Arizona", "Arkansas", "California"), class = "factor"), 
                      Life.Exp = c(69.05, 69.31, 70.55, 70.66, 71.71), 
                      HS.Grad = c(41.3, 66.7, 58.1, 39.9, 62.6)),
                 .Names = c("usa_state", "Life.Exp", "HS.Grad"), 
                 class = "data.frame", row.names = c(NA, -5L))

library(ggplot2)
library(maps)
library(RColorBrewer) # lots of color palettes for this kind of chart

library(data.table) # for sorting by key
library(mapproj) # coord_map() needs this

all_states <- map_data("state")

# Merge the dataset with the map data, which holds long and lat.
# Both need the same key, so add a region column matching the one in all_states.
# The state names are lowercased to get the match.

mydf$region <- tolower(mydf$usa_state)
totaldf <- merge(all_states, mydf, by = "region")

# Switch to data.table to fix the cut-up map issue:
# sort by region, then by order
totaldt <- as.data.table(totaldf)
setkey(totaldt, region, order)

ggplot(data = totaldt, 
       aes(x = long, y = lat, group = group, fill = HS.Grad)) +
  geom_polygon() + coord_map() +
  scale_fill_gradientn("", colours=brewer.pal(9, "GnBu"))

DON'T FORGET TO SORT

If your data is not sorted correctly by region and then by order, then you will get patchy maps like this.

sliced up map

I use the data.table package and key the data, which sorts it by the key columns. data.table is also much faster if you need to merge lots of data; joins use the X[Y] format. See the data.table cheat sheet if you are new to this package.
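If you would rather not depend on data.table, the same sort can be sketched in base R with order(). The toy data frames below are hypothetical stand-ins for all_states and mydf (the coordinates are made up), and the rows are reversed on purpose to mimic a merge result whose vertex order has been lost.

```r
# Toy stand-in for map_data("state"): polygon vertices numbered by `order`.
toy_map <- data.frame(
  region = c("alabama", "alabama", "alabama", "arizona", "arizona", "arizona"),
  order  = 1:6,
  long   = c(-87.5, -85.0, -88.0, -114.8, -109.0, -111.0),
  lat    = c(30.2, 32.0, 35.0, 32.5, 31.3, 37.0)
)
toy_values <- data.frame(region = c("arizona", "alabama"),
                         HS.Grad = c(58.1, 41.3))

# Merge, then reverse the rows to imitate a scrambled vertex order.
scrambled <- merge(toy_map, toy_values, by = "region")[6:1, ]

# Restore the drawing order: by region, then by order within each region.
sorted <- scrambled[order(scrambled$region, scrambled$order), ]
```

Passing sorted rather than scrambled to geom_polygon() is what keeps each outline from being drawn with crossing edges.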

FINAL MAP

This is for HS.Grad in your example. Get your other chart by changing the fill aesthetic to the variable you want, e.g. fill = Life.Exp.

map example for HS.Grad

Note not all the states are shown, because the test data is limited to these states. In a fuller example you will see more states.
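The map above encodes one variable as a fill colour. For the bar charts the question asks about, one possible approach (not part of the solution above, just a sketch) is to compute a small rectangle per state and draw it with geom_rect() on top of the polygons. The helper name bar_rects and its width and max_height parameters are made up for illustration; the anchor points come from state.center in base R's datasets package, which as far as I know places Alaska and Hawaii just off the West Coast rather than at their true positions.

```r
mydf <- data.frame(
  usa_state = c("Alabama", "Alaska", "Arizona", "Arkansas", "California"),
  Life.Exp  = c(69.05, 69.31, 70.55, 70.66, 71.71),
  HS.Grad   = c(41.3, 66.7, 58.1, 39.9, 62.6)
)

# Hypothetical helper: one bar per state, anchored at the state centre.
# width and max_height are in degrees of longitude / latitude.
bar_rects <- function(df, value_col, width = 0.8, max_height = 3) {
  idx  <- match(as.character(df$usa_state), state.name)
  vals <- df[[value_col]]
  data.frame(
    usa_state = df$usa_state,
    xmin = state.center$x[idx] - width / 2,
    xmax = state.center$x[idx] + width / 2,
    ymin = state.center$y[idx],
    # Scale so the largest value gets a bar of max_height degrees.
    ymax = state.center$y[idx] + max_height * vals / max(vals)
  )
}

rects <- bar_rects(mydf, "HS.Grad")

# Drawing step, only attempted when ggplot2 and maps are installed.
if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("maps", quietly = TRUE)) {
  p <- ggplot2::ggplot() +
    ggplot2::geom_polygon(data = ggplot2::map_data("state"),
                          ggplot2::aes(x = long, y = lat, group = group),
                          colour = "gray", fill = "white") +
    ggplot2::geom_rect(data = rects,
                       ggplot2::aes(xmin = xmin, xmax = xmax,
                                    ymin = ymin, ymax = ymax),
                       fill = "steelblue")
}
```

A second variable such as Life.Exp could get its own set of rectangles, shifted sideways by the bar width, to form a two-bar chart per state.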

You will also see that Alaska is missing. It's not in the maps data; see this answer from @jazzurro for practical tests on state names with setdiff.
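A setdiff check of that kind might look like the sketch below. The helper name unmatched_states is made up for illustration, and map_regions is a hand-written subset so the snippet runs without the maps package; in practice you would pass map_data("state")$region instead.

```r
# Hypothetical helper: which state names in the data have no polygon in the map?
unmatched_states <- function(data_states, map_regions) {
  setdiff(tolower(data_states), unique(map_regions))
}

# Stand-in for map_data("state")$region, which repeats each region once per vertex.
map_regions <- c("alabama", "alabama", "arizona", "arkansas", "california")

missing <- unmatched_states(
  c("Alabama", "Alaska", "Arizona", "Arkansas", "California"),
  map_regions
)
missing
```

Any name returned here will be dropped silently by merge(), so it is worth running before plotting.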

answered Dec 08 '22 01:12 by micstr