Question

How can I plot a fanchart as displayed on this Wikipedia page?

I have installed the nlme package with its MathAchieve dataset, but I cannot find the commands for plotting this graph.

The nlme pdf file is here.

I also checked this link but it is non-english.

With the fan.plot function from the plotrix package, I could only draw pie charts: https://sites.google.com/site/distantyetneversoclose/excel-charts/the-pie-doughnut-combination-a-fan-plot

Thanks for your help.

Was it helpful?

Solution

Having thought about this a bit more since my previous answer, I've come up with a simpler way of producing multipanel (if appropriate) fanplots, overlaid on a levelplot, as shown in the Wikipedia Fan chart page. This approach works with a data.frame that has two independent variables and zero or more conditioning variables that separate data into panels.

First we define a new panel function, panel.fanplot.

panel.fanplot <- function(x, y, z, zmin, zmax, subscripts, groups, 
                          nmax=max(tapply(z, list(x, y, groups), 
                            function(x) sum(!is.na(x))), na.rm=T), 
                          ...) {

  if(missing(zmin)) zmin <- min(z, na.rm=TRUE)
  if(missing(zmin)) zmax <- max(z, na.rm=TRUE)
  get.coords <- function(a, d, x0, y0) {
    a <- ifelse(a <= 90, 90 - a, 450 - a)
    data.frame(x = x0 + d * cos(a / 180 * pi), 
               y = y0 + d * sin(a / 180 * pi))
  }

  z.scld <- (z - zmin)/(zmax - zmin) * 360
  fan <- aggregate(list(z=z.scld[subscripts]), 
                   list(x=x[subscripts], y=y[subscripts]), 
                   function(x) 
                     c(n=sum(!is.na(x)),
                       quantile(x, c(0.25, 0.5, 0.75), na.rm=TRUE) - 90))

  panel.levelplot(fan$x, fan$y, 
                  (fan$z[, '50%'] + 90) / 360 * (zmax - zmin) + zmin,
                  subscripts=seq_along(fan$x), ...)
  lapply(which(!is.na(fan$z[, '50%'])), function(i) {
    with(fan[i, ], {
      poly <- rbind(c(x, y), 
                    get.coords(seq(z[, '25%'], z[, '75%'], length.out=200), 
                               0.3, x, y))
      lpolygon(poly$x, poly$y, col='gray10', border='gray10', lwd=3)
      llines(get.coords(c(z[, '50%'], 180 + z[, '50%']), 0.3, x, y),
             col='black', lwd=3, lend=1)
      llines(get.coords(z[, '50%'], c(0.3, (1 - z[, 'n']/nmax) * 0.3), x, y), 
             col='white', lwd=3)
    })
  })
}

Now we create some dummy data and call levelplot:

d <- data.frame(z=runif(1000), 
                x=sample(5, 1000, replace=TRUE),
                y=sample(5, 1000, replace=TRUE),
                grp=sample(4, 1000, replace=TRUE))

colramp <- colorRampPalette(c('#fff495', '#bbffaa', '#70ffeb', '#72aaff', 
                              '#bf80ff'))

levelplot(z ~ x*y|as.factor(grp), d, groups=grp, asp=1, col.regions=colramp, 
          panel=panel.fanplot, zmin=min(d$z, na.rm=TRUE), 
          zmax=max(d$z, na.rm=TRUE), at=seq(0, 1, 0.2))

It's important to pass the conditioning variable (that separates plots into panels) to levelplot via the argument group, as shown above with the variable grp, in order for sample sizes to be calculated (shown by white line length).

fanplot1

And here's how we would mimic the Wikipedia plot:

library(nlme)
data(MathAchieve)
MathAchieve$SESfac <- as.numeric(cut(MathAchieve$SES, seq(-2.5, 2, 0.5)))
MathAchieve$MEANSESfac <- 
  as.numeric(cut(MathAchieve$MEANSES, seq(-1.25, 1, 0.25)))
levels(MathAchieve$Minority) <- c('Non-minority', 'Minority')
MathAchieve$group <- 
  as.factor(paste0(MathAchieve$Sex, ', ', MathAchieve$Minority))

colramp <- colorRampPalette(c('#fff495', '#bbffaa', '#70ffeb', '#72aaff', 
                              '#bf80ff'))

levelplot(MathAch ~ SESfac*MEANSESfac|group, MathAchieve, 
          groups=group, asp=1, col.regions=colramp, 
          panel=panel.fanplot, zmin=0, zmax=28, at=seq(0, 25, 5),
          scales=list(alternating=1, 
                      tck=c(1, 0), 
                      x=list(at=seq(1, 11) - 0.5, 
                             labels=seq(-2.5, 2, 0.5)),
                      y=list(at=seq(1, 11) - 0.5, 
                             labels=seq(-1.25, 1, 0.25))),
          between=list(x=1, y=1), strip=strip.custom(bg='gray'),
          xlab='Socio-economic status of students',
          ylab='Mean socio-economic status for school')

fanplot2

OTHER TIPS

I can think of a couple of ways of going about this with lattice. You could either use xyplot and fill panels with panel.fill, or you can use levelplot. The polygons themselves have to be added with a custom panel and lpolygon. Here's how I've done it with levelplot. I'm really a lattice novice, though, and there may very well be some shortcuts that I don't know about.

Because I'm using levelplot, we first create a matrix containing median MathAch scores for each combination of MEANSES and SES. These will be used to plot the cell colours.

library(lattice)
library(nlme)
data(MathAchieve)

Below, I convert SES and MEANSES into factors using cut, with breakpoints as in the Wikipedia example.

MathAchieve$SESfac <- as.numeric(cut(MathAchieve$SES, seq(-2.5, 2, 0.5)))
MathAchieve$MEANSESfac <- as.numeric(cut(MathAchieve$MEANSES, 
                                         seq(-1.25, 1, 0.25)))

I'm not sure how to plot the four panels as on the Wikipedia page, so I'll just subset to non-minority females:

d <- subset(MathAchieve, Sex=='Female' & Minority=='No')

To convert this dataframe to a matrix, I split it to a list and then coerce back to a matrix with the appropriate dimensions. Each cell of the matrix contains the median MathAch for a particular combination of SESfac and MEANSESfac.

l <- split(d$MathAch, list(d$SESfac, d$MEANSESfac))
m.median <- matrix(sapply(l, median), ncol=9)

When we use levelplot we will have access to x and y, being the coordinates of the "current" cell. In order to pass the vector of MathAch to levelplot, so that a polygon can be plotted for each cell, I create a matrix (same dimensions as m.median) of lists, where each cell is a list containing a MathAch vector.

m <- matrix(l, ncol=9)

Below we create a color ramp as used by Wolfram Fischer in the example on Wikipedia.

colramp <- colorRampPalette(c('#fff495', '#bbffaa', '#70ffeb', '#72aaff', 
                              '#bf80ff'))

Now we define the custom panel function. I've commented throughout to explain:

fanplot <- function(x, y, z, subscripts, fans, ymin, ymax, 
                    nmax=max(sapply(fans, length)), ...) {
  # nmax is the maximum sample size across all combinations of conditioning
  # variables. For generality, ymin and ymax are limits of the circle around 
  # around which fancharts are plotted. 
  # fans is our matrix of lists, which are used to plot polygons.
  get.coords <- function(a, d, x0, y0) {
    a <- ifelse(a <= 90, 90 - a, 450 - a)
    data.frame(x = x0 + d * cos(a / 180 * pi), 
               y = y0 + d * sin(a / 180 * pi))
  }
  # getcoords returns coordinates of one or more points, given angle(s), 
  # (i.e., a), distances (i.e., d), and an origin (x0 and y0).

  panel.levelplot(x, y, z, subscripts, ...)

  # Below, we scale the raw vectors of data such that ymin thru ymax map to 
  # 0 thru 360. We then calculate the relevant quantiles (i.e. 25%, 50% and 75%).
  smry <- lapply(fans, function(y) {
    y.scld <- (y - ymin)/(ymax - ymin) * 360
    quantile(y.scld, c(0.25, 0.5, 0.75)) - 90
  })

  # Now we use get.coords to determine relevant coordinates for plotting 
  # polygons and lines. We plot a white line inwards from the circle's edge,
  # with length according to the ratio of the sample size to nmax.
  mapply(function(x, y, smry, n) {
    if(!any(is.na(smry))) {
      lpolygon(rbind(c(x, y), 
                     get.coords(seq(smry['25%'], smry['75%'], length.out=200), 
                                0.3, x, y)), col='gray10', lwd=2)
      llines(get.coords(c(smry['50%'], 180 + smry['50%']), 0.3, 
                        x, y), col=1, lwd=3)
      llines(get.coords(smry['50%'], c(0.3, (1 - n/nmax) * 0.3), 
                        x, y), col='white', lwd=3)
    }
  }, x=x, y=y, smry=smry, n=sapply(fans, length))
}

And finally use this custom panel function within levelplot:

levelplot(m.median, fans=m, ymin=0, ymax=28,
          col.regions=colramp, at=seq(0, 25, 5), panel=fanplot, 
          scales=list(tck=c(1, 0), 
                      x=list(at=seq_len(ncol(m.median) + 1) - 0.5, 
                             labels=seq(-2.5, 2, 0.5)),
                      y=list(at=seq_len(nrow(m.median) + 1) - 0.5, 
                             labels=seq(-1.25, 1, 0.25))), 
          xlab='Socio-economic status of students',
          ylab='Mean socio-economic status for the school')

enter image description here

I haven't coloured cells grey if they have sample size < 7, as was done for the equivalent plot on the Wikipedia page, but this could be done with lrect if needed.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top