Question

I am working on finalizing a NMDS plot that I created in vegan and ggplot2 but cannot figure out how to add envfit species-loading vectors to the plot. When I try to it says "invalid graphics state".

The example below is slightly modified from another question (Plotting ordiellipse function from vegan package onto NMDS plot created in ggplot2) but it expressed exactly the example I wanted to include since I used this question to help me get metaMDS into ggplot2 in the first place:

library(vegan)
library(ggplot2)
data(dune)

# calculate distance for NMDS
NMDS.log<-log(dune+1)
sol <- metaMDS(NMDS.log)

# Create meta data for grouping
MyMeta = data.frame(
  sites = c(2,13,4,16,6,1,8,5,17,15,10,11,9,18,3,20,14,19,12,7),
  amt = c("hi", "hi", "hi", "md", "lo", "hi", "hi", "lo", "md", "md", "lo", 
      "lo", "hi", "lo", "hi", "md", "md", "lo", "hi", "lo"),
row.names = "sites")

# plot NMDS using basic plot function and color points by "amt" from MyMeta
plot(sol$points, col = MyMeta$amt)

# same in ggplot2
NMDS = data.frame(MDS1 = sol$points[,1], MDS2 = sol$points[,2])
ggplot(data = NMDS, aes(MDS1, MDS2)) + 
  geom_point(aes(data = MyMeta, color = MyMeta$amt))

#Add species loadings
vec.sp<-envfit(sol$points, NMDS.log, perm=1000)
plot(vec.sp, p.max=0.1, col="blue")
Was it helpful?

Solution 2

Start with adding libraries. Additionally library grid is necessary.

library(ggplot2)
library(vegan)
library(grid)
data(dune)

Do metaMDS analysis and save results in data frame.

NMDS.log<-log(dune+1)
sol <- metaMDS(NMDS.log)

NMDS = data.frame(MDS1 = sol$points[,1], MDS2 = sol$points[,2])

Add species loadings and save them as data frame. Directions of arrows cosines are stored in list vectors and matrix arrows. To get coordinates of the arrows those direction values should be multiplied by square root of r2 values that are stored in vectors$r. More straight forward way is to use function scores() as provided in answer of @Gavin Simpson. Then add new column containing species names.

vec.sp<-envfit(sol$points, NMDS.log, perm=1000)
vec.sp.df<-as.data.frame(vec.sp$vectors$arrows*sqrt(vec.sp$vectors$r))
vec.sp.df$species<-rownames(vec.sp.df)

Arrows are added with geom_segment() and species names with geom_text(). For both tasks data frame vec.sp.df is used.

ggplot(data = NMDS, aes(MDS1, MDS2)) + 
  geom_point(aes(data = MyMeta, color = MyMeta$amt))+
  geom_segment(data=vec.sp.df,aes(x=0,xend=MDS1,y=0,yend=MDS2),
      arrow = arrow(length = unit(0.5, "cm")),colour="grey",inherit_aes=FALSE) + 
  geom_text(data=vec.sp.df,aes(x=MDS1,y=MDS2,label=species),size=5)+
  coord_fixed()

enter image description here

OTHER TIPS

The problem with the (otherwise excellent) accepted answer, and which explains why the vectors are all of the same length in the included figure [Note that the accepted Answer has now been edited to scale the arrows in the manner I describe below, to avoid confusion for users coming across the Q&A], is that what is stored in the $vectors$arrows component of the object returned by envfit() are the direction cosines of the fitted vectors. These are all of unit length, and hence the arrows in @Didzis Elferts' plot are all the same length. This is different to the output from plot(envfit(sol, NMDS.log)), and arises because we scale the vector arrow coordinates by the correlation with the ordination configuration ("axes"). That way, species that show a weak relationship with the ordination configuration get shorter arrows. The scaling is done by multiplying the direction cosines by sqrt(r2) where r2 are the values shown in the table of printed output. When adding the vectors to an existing plot, vegan also tries to scale the set of vectors such that they fill the available plot space whilst maintaining the relative lengths of the arrows. How this is done is discussed in the Details section of ?envfit and requires the use of the un-exported function vegan:::ordiArrowMul(result_of_envfit).

Here is a full working example that replicates the behaviour of plot.envfit using ggplot2:

library(vegan)
library(ggplot2)
library(grid)
data(dune)

# calculate distance for NMDS
NMDS.log<-log1p(dune)
set.seed(42)
sol <- metaMDS(NMDS.log)

scrs <- as.data.frame(scores(sol, display = "sites"))
scrs <- cbind(scrs, Group = c("hi","hi","hi","md","lo","hi","hi","lo","md","md",
                              "lo","lo","hi","lo","hi","md","md","lo","hi","lo"))

set.seed(123)
vf <- envfit(sol, NMDS.log, perm = 999)

If we stop at this point and look at vf:

> vf

***VECTORS

             NMDS1       NMDS2     r2 Pr(>r)    
Belper -0.78061195 -0.62501598 0.1942  0.174    
Empnig -0.01315693  0.99991344 0.2501  0.054 .  
Junbuf  0.22941001 -0.97332987 0.1397  0.293    
Junart  0.99999981 -0.00062172 0.3647  0.022 *  
Airpra -0.20995196  0.97771170 0.5376  0.002 ** 
Elepal  0.98959723  0.14386566 0.6634  0.001 ***
Rumace -0.87985767 -0.47523728 0.0948  0.429
.... <truncated>

So the r2 data is used to scale the values in columns NMDS1 and NMDS2. The final plot is produced with:

spp.scrs <- as.data.frame(scores(vf, display = "vectors"))
spp.scrs <- cbind(spp.scrs, Species = rownames(spp.scrs))

p <- ggplot(scrs) +
  geom_point(mapping = aes(x = NMDS1, y = NMDS2, colour = Group)) +
  coord_fixed() + ## need aspect ratio of 1!
  geom_segment(data = spp.scrs,
               aes(x = 0, xend = NMDS1, y = 0, yend = NMDS2),
               arrow = arrow(length = unit(0.25, "cm")), colour = "grey") +
  geom_text(data = spp.scrs, aes(x = NMDS1, y = NMDS2, label = Species),
            size = 3)

This produces:

enter image description here

May i add something late?

Envfit provides pvalues, and sometimes you want to just plot the significant parameters (something vegan can do for you with p.=0.05 in the plot command). I struggled to do that with ggplot2. Here is my solution, maybe you find a more elegant one?

Starting from Didzis' answer from above:

ef<-envfit(sol$points, NMDS.log, perm=1000)
ef.df<-as.data.frame(ef$vectors$arrows*sqrt(ef$vectors$r))
ef.df$species<-rownames(ef.df)

#only significant pvalues
#shortcutting ef$vectors
A <- as.list(ef$vectors)
#creating the dataframe
pvals<-as.data.frame(A$pvals)
arrows<-as.data.frame(A$arrows*sqrt(A$r))
C<-cbind(arrows, pvals)
#subset
Cred<-subset(C,pvals<0.05)
Cred <- cbind(Cred, Species = rownames(Cred))

"Cred "can now be implemented in the geom_segment-argument as discussed above.

Short addition: To get a full representation of the plot.envfit functionality within ggplot2 aka "arrow lengths make full use of plot area" a factor needs to be applied. I don't know if it was intentionally left out in the answers above, as it was even specifically mentioned by Gavin? Just extract the required scaling factor using arrow_factor <- ordiArrowMul(vf) and then you can either apply it to both NMDS columns in spp.scrs or you can do this manually like

arrow_factor <- ordiArrowMul(vf)
spp.scrs <- as.data.frame(scores(vf, display = "vectors")) * arrow_factor
spp.scrs <- cbind(spp.scrs, Species = rownames(spp.scrs), Pvalues = vf$vectors$pvals, R_squared = vf$vectors$r)

# select significance similarly to `plot(vf, p.max = 0.01)`
spp.scrs <- subset(spp.scrs, Pvalues < 0.01)

# you can also add the arrow factor in here (don't do both!)
ggplot(scrs) +
  geom_point(mapping = aes(x = NMDS1, y = NMDS2, colour = Group)) +
  coord_fixed() + ## need aspect ratio of 1!
  geom_segment(data = spp.scrs,
               aes(x = 0, xend = NMDS1 * arrow_factor, y = 0, yend = NMDS2 * arrow_factor),
               arrow = arrow(length = unit(0.25, "cm")), colour = "grey") +
  geom_text(data = spp.scrs, aes(x = NMDS1 * arrow_factor, y = NMDS2 * arrow_factor, label = Species),
            size = 3)
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top