Question

I am trying to plot data that shows a full population and a subset of that population by location on a map. I've seen data visualizations that use concentric circles or 3-d inverted cones to convey this, but I can't figure out how to do it in ggplot / ggmap.

Here's a freehand version in Paint that shows a rough idea of what I'm looking to do: [image]

Here's a rough piece of data for an example:

> dput(df1)
structure(list(zip = c("00210", "00653", "00952", "02571", "04211", 
"05286", "06478", "07839", "10090", "11559"), city = c("Portsmouth", 
"Guanica", "Sabana Seca", "Wareham", "Auburn", "Craftsbury", 
"Oxford", "Greendell", "New York", "Lawrence"), state = c("NH", 
"PR", "PR", "MA", "ME", "VT", "CT", "NJ", "NY", "NY"), latitude = c(43.005895, 
17.992112, 18.429218, 41.751554, 44.197009, 44.627698, 41.428163, 
41.12831, 40.780751, 40.61579), longitude = c(-71.013202, -66.90097, 
-66.18014, -70.71059, -70.239485, -72.434398, -73.12729, -74.678956, 
-73.977182, -73.73126), timezone = c(-5L, -4L, -4L, -5L, -5L, 
-5L, -5L, -5L, -5L, -5L), dst = c(TRUE, FALSE, FALSE, TRUE, TRUE, 
TRUE, TRUE, TRUE, TRUE, TRUE), totalPop = c(43177, 37224, 37168, 
15492, 1614, 88802, 2587, 80043, 78580, 87461), subPop = c(42705, 
36926, 27556, 10827, 774, 39060, 1542, 21304, 53438, 2896)), .Names = c("zip", 
"city", "state", "latitude", "longitude", "timezone", "dst", 
"totalPop", "subPop"), row.names = c(1L, 50L, 200L, 900L, 1500L, 
2000L, 2500L, 3000L, 3500L, 4000L), class = "data.frame")

Any suggestions?

Solution

The basic idea is to use separate geoms for the two populations, making sure the smaller one is plotted after the larger one, so its layer is on top:

library(ggplot2) # using version 0.9.2.1
library(maps)

# load us map data
all_states <- map_data("state")

# start a ggplot. it won't plot til we type p
p <- ggplot()  

# add U.S. states outlines to ggplot
p <- p + geom_polygon(data=all_states, aes(x=long, y=lat, group = group),
     colour="grey", fill="white" )

# add total Population
p <- p + geom_point(data=df1, aes(x=longitude, y=latitude, size = totalPop), 
     colour="#b5e521")

# add sub Population as separate layer with smaller points at same long,lat
p <- p + geom_point(data=df1, aes(x=longitude, y=latitude, size = subPop), 
     colour="#00a3e8")

# change name of legend to generic word "Population"
p <- p + guides(size=guide_legend(title="Population"))

# display plot
p 

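Because both layers map to the same size aesthetic, they share one size scale, so the inner and outer circles are directly comparable. One optional refinement, sketched here under the assumption of a ggplot2 version newer than the 0.9.2.1 used above, is scale_size_area(), which maps a value of 0 to a point of size 0 so that circle area is proportional to population. The df_demo data frame is a three-row stand-in copied from df1, and max_size = 12 is an arbitrary choice for illustration:

```r
library(ggplot2)

# Three rows copied from the question's df1 (Auburn, Craftsbury, Lawrence)
df_demo <- data.frame(
  longitude = c(-70.239485, -72.434398, -73.73126),
  latitude  = c(44.197009, 44.627698, 40.61579),
  totalPop  = c(1614, 88802, 87461),
  subPop    = c(774, 39060, 2896)
)

p_area <- ggplot(df_demo, aes(x = longitude, y = latitude)) +
  # total population first, so it ends up underneath
  geom_point(aes(size = totalPop), colour = "#b5e521") +
  # subset population second, so it is drawn on top
  geom_point(aes(size = subPop), colour = "#00a3e8") +
  # area-proportional sizing; max_size is an illustrative value only
  scale_size_area(max_size = 12, name = "Population")

p_area
```

With this scale the ratio of the blue area to the green area at each location approximates subPop / totalPop, which is usually what a concentric-circle plot is meant to convey.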

From the map, it is clear that your data include locations outside the contiguous US (the two Puerto Rico ZIP codes), so you may want different underlying map data. get_map() from the ggmap package provides a couple of options:

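library(ggmap)
library(mapproj)

# note: recent ggmap versions require a Google API key, registered
# with register_google(), before requesting Google map tiles
map <- get_map(location = 'united states', zoom = 3, maptype = "terrain", 
       source = "google")
p <- ggmap(map)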

After that, add the total and sub Population geom_point() layers to p and display it as before.
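Since the same two layers are added to either base plot, one way to avoid repeating them is to wrap them in a small function. This is only a sketch: add_population_layers() is a hypothetical helper, not part of ggplot2 or ggmap, and df_demo is a three-row stand-in copied from df1 so that the example runs without a map download.

```r
library(ggplot2)

# Hypothetical helper (not part of ggplot2 or ggmap): adds both
# population layers to any ggplot-compatible base plot.
add_population_layers <- function(base_plot, df) {
  base_plot +
    # larger (total) population first, so it sits underneath
    geom_point(data = df, aes(x = longitude, y = latitude, size = totalPop),
               colour = "#b5e521") +
    # subset population second, so it is drawn on top
    geom_point(data = df, aes(x = longitude, y = latitude, size = subPop),
               colour = "#00a3e8") +
    guides(size = guide_legend(title = "Population"))
}

# Three rows copied from the question's df1 (Portsmouth, Guanica, New York)
df_demo <- data.frame(
  longitude = c(-71.013202, -66.90097, -73.977182),
  latitude  = c(43.005895, 17.992112, 40.780751),
  totalPop  = c(43177, 37224, 78580),
  subPop    = c(42705, 36926, 53438)
)

# Works on an empty ggplot() base
p_demo <- add_population_layers(ggplot(), df_demo)
p_demo

# With the ggmap base map from above, the call would be:
# p <- add_population_layers(ggmap(map), df1)
```

Passing data and aes() explicitly in each geom_point() call matters here, because a ggmap() base plot carries its own default data and aesthetics.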

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow