Question

I'm working on mapping deaths due to road traffic collisions in each country. I pulled the data from the WHO using this code:

    library(XML)
    col <- "http://apps.who.int/gho/athena/data/GHO/RS_196,RS_198.html?profile=ztable&filter=COUNTRY:*" 
    col.doc <- htmlParse(col)
    col.tabs <- readHTMLTable(col.doc)
    colDF <- as.data.frame(col.tabs)
    colDF$Country <- colDF$NULL.COUNTRY

    colDeathTot <- colDF[seq(1, nrow(colDF), 2), ]
    colDeathTot$TotalDeaths <- colDeathTot$NULL.NUMERIC.VALUE

Then I map the data using gvisGeoChart from the googleVis package:

    install.packages("googleVis")
    library(googleVis)

    WorldCollisions <- gvisGeoChart(colDeathTot, 
        locationvar="NULL.COUNTRY", colorvar="TotalDeaths", 
        options=list(displayMode="regions"), 
        chartid="GeoChart_RoadDeaths_World")
    plot(WorldCollisions)

The problem is that the data on the map is incorrect. For example, for Canada I get 126 on the map when the data frame shows 2296. Any thoughts on this? I thought maybe the data was coming from the "row.names" variable, but that's not it. Maybe the countries aren't matching correctly?


Solution

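Your columns all end up as factor variables, so the map is most likely showing each factor's internal integer code rather than its value (i.e. Canada's 126 is a level code that only looks like a count). You can check the column types with:

    str(colDeathTot)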
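To see why a factor column produces misleading numbers, here is a minimal sketch on a toy vector. The name `deaths` and the value "10" are assumptions made for illustration and are not taken from the WHO table.

    # Toy factor standing in for NULL.NUMERIC.VALUE (illustrative values only)
    deaths <- factor(c("2296", "275983", "10"))

    # as.numeric() on a factor returns the internal level codes, not the labels
    codes <- as.numeric(deaths)
    print(codes)    # 2 3 1

    # Going through character first recovers the printed values
    values <- as.numeric(as.character(deaths))
    print(values)   # 2296 275983 10

    # Equivalent idiom: index the numeric levels by the factor itself
    values2 <- as.numeric(levels(deaths))[deaths]

The level codes depend on the sort order of the labels, which is why the number shown on the map bears no relation to the real count.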

To overcome this I changed

    colDeathTot$TotalDeaths <- colDeathTot$NULL.NUMERIC.VALUE

to

    colDeathTot$TotalDeaths <- as.numeric(as.character(colDeathTot$NULL.NUMERIC.VALUE))

and it seems to work. Since these are absolute numbers of road deaths, China comes up with 275983 deaths in 2010, followed by India. Relating the numbers to population size would make the map more meaningful.
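A rough sketch of that normalisation, assuming you have a population figure for each country. The `Population` column, the country names, and all numbers below are hypothetical placeholders; the WHO table pulled above does not include population.

    # Hypothetical toy data; names and figures are placeholders, not WHO values
    toy <- data.frame(
        Country     = c("Country A", "Country B"),
        TotalDeaths = c(2000, 50000),
        Population  = c(40e6, 1.25e9),
        stringsAsFactors = FALSE
    )

    # Deaths per 100,000 inhabitants
    toy$DeathsPer100k <- toy$TotalDeaths / toy$Population * 1e5
    print(toy$DeathsPer100k)   # 5 4

In this toy case the country with far fewer deaths has the higher rate. A column like this could then be passed as colorvar to gvisGeoChart in place of "TotalDeaths".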

Licensed under: CC-BY-SA with attribution