Question

I am trying to model some data that follows a sigmoid curve relationship. In my field of work (psychophysics), a Weibull function is usually used to model such relationships, rather than probit.

I am trying to create a model using R and am struggling with syntax. I know that I need to use the vglm() function from the VGAM package, but I am unable to get a sensible model out. Here's my data:

# Data frame example data
dframe1 <- structure(list(independent_variable = c(0.3, 0.24, 0.23, 0.16, 
0.14, 0.05, 0.01, -0.1, -0.2), dependent_variable = c(1, 1, 
1, 0.95, 0.93, 0.65, 0.55, 0.5, 0.5)), .Names = c("independent_variable", 
"dependent_variable"), class = "data.frame", row.names = c(NA, 
-9L))

Here is a plot of the data in dframe1:

library(ggplot2)

# Plot my original data
ggplot(dframe1, aes(independent_variable, dependent_variable)) + geom_point()

enter image description here

This should be able to be modelled by a Weibull function, since the data fit a sigmoid curve relationship. Here is my attempt to model the data and generate a representative plot:

library(VGAM)

# Generate model
my_model <- vglm(formula = dependent_variable ~ independent_variable, family = weibull, data = dframe1)

# Create a new dataframe based on the model, so that it can be plotted
model_dframe <- data.frame(dframe1$independent_variable, fitted(my_model))

# Plot my model fitted data
ggplot(model_dframe, aes(dframe1.independent_variable, fitted.my_model.)) + geom_point()

enter image description here

As you can see, this doesn't represent my original data at all. I'm either generating my model incorrectly, or I'm generating my plot of the model incorrectly. What am I doing wrong?

Note: I have edited this question to make it more understandable; previously I had been using the wrong function entirely (weibreg()). Hence, some of the comments below may not make sense. .....

Was it helpful?

Solution

Here's my solution, with bbmle.

Data:

dframe1 <- structure(list(independent_variable = c(0.3, 0.24, 0.23, 0.16, 
0.14, 0.05, 0.01, -0.1, -0.2), dependent_variable = c(1, 1, 
1, 0.95, 0.93, 0.65, 0.55, 0.5, 0.5)), .Names = c("independent_variable", 
"dependent_variable"), class = "data.frame", row.names = c(NA, 
-9L))

Construct a cumulative Weibull that goes from 0.5 to 1.0 by definition:

wfun <- function(x,shape,scale) {
    (1+pweibull(x,shape,scale))/2.0
}

dframe2 <- transform(dframe1,y=round(40*dependent_variable),x=independent_variable)

Fit a Weibull (log-scale relevant parameters), with binomial variation:

library(bbmle)
m1 <- mle2(y~dbinom(prob=wfun(exp(a+b*x),shape=exp(logshape),scale=1),size=40),
     data=dframe2,start=list(a=0,b=0,logshape=0))

Generate predictions:

pframe <- data.frame(x=seq(-0.2,0.3,length=101))
pframe$y <- predict(m1,pframe)

png("wplot.png")
with(dframe2,plot(y/40~x))
with(pframe,lines(y/40~x,col=2))
dev.off()

enter image description here

OTHER TIPS

OK, I just came across this several months late, but you could also use the mafc.cloglog link from the psyphy package with glm. If the x follows the cloglog then log(x) will follow a weibull psychometric function. The catch as with the above responses is that you need the number of trials for the proportion correct. I just set it to 100 so it would give an integer number of trials but you should fix this to correspond to the numbers that you actually used. Here is the code to do it.

dframe1 <- structure(list(independent_variable = c(0.3, 0.24, 0.23, 0.16, 
0.14, 0.05, 0.01, -0.1, -0.2), dependent_variable = c(1, 1, 
1, 0.95, 0.93, 0.65, 0.55, 0.5, 0.5)), .Names = c("independent_variable", 
"dependent_variable"), class = "data.frame", row.names = c(NA, 
-9L))

library(psyphy)

plot(dependent_variable ~ independent_variable, dframe1)
fit <- glm(dependent_variable ~ exp(independent_variable), 
    binomial(mafc.cloglog(2)), 
    data = dframe1, 
    weights = rep(100, nrow(dframe1)))  # assuming 100 observations per point
xx <- seq(-0.2, 0.3, len = 100)
pred <- predict(fit, newdata = data.frame(independent_variable = xx), type = "response")
lines(xx, pred)

Fit to data

You could also use the drc-package (dose-response-modelling).

I am actually a noob for this kind of models, but perhabs it helps somehow...

Here I fitted a four parameter Weibull, with fixed parameters for the asymptotes (otherwise the upper asymptote would be slightly greater 1, don't know if this is an issue for you). I also had to transform the independent variable (+0.2) so that its is >= 0, because of convergence problems.

require(drc)
# four-parameter Weibull with fixed parameters for the asymptote, added +0.2 to IV to overcome convergence problems
mod <- drm(dependent_variable ~ I(independent_variable+0.2), 
           data = dframe1, 
           fct = W1.4(fixed = c(NA, 0.5, 1, NA)))

# predicts
df2 <- data.frame(pred = predict(mod, newdata = data.frame(idenpendent_variable = seq(0, 0.5, length.out=100))), 
                  x = seq(0, 0.5, length.out=100))

ggplot() + 
  geom_point(data = dframe1, aes(x = independent_variable + 0.2, y = dependent_variable)) +
  geom_line(data = df2, aes(x = x, y = pred))

However I agree with Ben Bolker that other models may be better suited.

I only know these kind of models from ecotoxicology (dose-response-models, where one is interested in the concentration where we have 50% mortality [=EC50]).

enter image description here

Update A four-parameter log-logistic model fits also quite good (smaller AIC and RSE then weibull): Again I fixed here the asymptote parameter and transformed the IV.

# four-parameter log-logistic with fixed parameters for the asymptote, added +0.2 to IV to overcome convergence problems
mod1 <- drm(dependent_variable ~ I(independent_variable+0.2), 
           data = dframe1, 
           fct = LL2.4(fixed=c(NA, 0.5, 1, NA)))
summary(mod1)

# predicts
df2 <- data.frame(pred = predict(mod1, newdata = data.frame(idenpendent_variable = seq(0, 0.5, length.out=100))), 
                  x = seq(0, 0.5, length.out=100))

ggplot() + 
  geom_point(data = dframe1, aes(x = independent_variable + 0.2, y = dependent_variable)) +
  geom_line(data = df2, aes(x = x, y = pred))

enter image description here

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top