Modelling data with a Weibull link function in R

Question 1

Here's my solution, with bbmle.

Data:

dframe1 <- structure(list(independent_variable = c(0.3, 0.24, 0.23, 0.16, 
0.14, 0.05, 0.01, -0.1, -0.2), dependent_variable = c(1, 1, 
1, 0.95, 0.93, 0.65, 0.55, 0.5, 0.5)), .Names = c("independent_variable", 
"dependent_variable"), class = "data.frame", row.names = c(NA, 
-9L))

Construct a cumulative Weibull that goes from 0.5 to 1.0 by definition:

wfun <- function(x,shape,scale) {
    (1+pweibull(x,shape,scale))/2.0
}

dframe2 <- transform(dframe1,y=round(40*dependent_variable),x=independent_variable)

Fit a Weibull (log-scale relevant parameters), with binomial variation:

library(bbmle)
m1 <- mle2(y~dbinom(prob=wfun(exp(a+b*x),shape=exp(logshape),scale=1),size=40),
     data=dframe2,start=list(a=0,b=0,logshape=0))

Generate predictions:

pframe <- data.frame(x=seq(-0.2,0.3,length=101))
pframe$y <- predict(m1,pframe)

png("wplot.png")
with(dframe2,plot(y/40~x))
with(pframe,lines(y/40~x,col=2))
dev.off()

enter image description here

Question 2

OK, I just came across this several months late, but you could also use the mafc.cloglog link from the psyphy package with glm. If the x follows the cloglog then log(x) will follow a weibull psychometric function. The catch as with the above responses is that you need the number of trials for the proportion correct. I just set it to 100 so it would give an integer number of trials but you should fix this to correspond to the numbers that you actually used. Here is the code to do it.

dframe1 <- structure(list(independent_variable = c(0.3, 0.24, 0.23, 0.16, 
0.14, 0.05, 0.01, -0.1, -0.2), dependent_variable = c(1, 1, 
1, 0.95, 0.93, 0.65, 0.55, 0.5, 0.5)), .Names = c("independent_variable", 
"dependent_variable"), class = "data.frame", row.names = c(NA, 
-9L))

library(psyphy)

plot(dependent_variable ~ independent_variable, dframe1)
fit <- glm(dependent_variable ~ exp(independent_variable), 
    binomial(mafc.cloglog(2)), 
    data = dframe1, 
    weights = rep(100, nrow(dframe1)))  # assuming 100 observations per point
xx <- seq(-0.2, 0.3, len = 100)
pred <- predict(fit, newdata = data.frame(independent_variable = xx), type = "response")
lines(xx, pred)

Fit to data

Question 3

You could also use the drc-package (dose-response-modelling).

I am actually a noob for this kind of models, but perhabs it helps somehow...

Here I fitted a four parameter Weibull, with fixed parameters for the asymptotes (otherwise the upper asymptote would be slightly greater 1, don't know if this is an issue for you). I also had to transform the independent variable (+0.2) so that its is >= 0, because of convergence problems.

require(drc)
# four-parameter Weibull with fixed parameters for the asymptote, added +0.2 to IV to overcome convergence problems
mod <- drm(dependent_variable ~ I(independent_variable+0.2), 
           data = dframe1, 
           fct = W1.4(fixed = c(NA, 0.5, 1, NA)))

# predicts
df2 <- data.frame(pred = predict(mod, newdata = data.frame(idenpendent_variable = seq(0, 0.5, length.out=100))), 
                  x = seq(0, 0.5, length.out=100))

ggplot() + 
  geom_point(data = dframe1, aes(x = independent_variable + 0.2, y = dependent_variable)) +
  geom_line(data = df2, aes(x = x, y = pred))

However I agree with Ben Bolker that other models may be better suited.

I only know these kind of models from ecotoxicology (dose-response-models, where one is interested in the concentration where we have 50% mortality [=EC50]).

enter image description here

Update A four-parameter log-logistic model fits also quite good (smaller AIC and RSE then weibull): Again I fixed here the asymptote parameter and transformed the IV.

# four-parameter log-logistic with fixed parameters for the asymptote, added +0.2 to IV to overcome convergence problems
mod1 <- drm(dependent_variable ~ I(independent_variable+0.2), 
           data = dframe1, 
           fct = LL2.4(fixed=c(NA, 0.5, 1, NA)))
summary(mod1)

# predicts
df2 <- data.frame(pred = predict(mod1, newdata = data.frame(idenpendent_variable = seq(0, 0.5, length.out=100))), 
                  x = seq(0, 0.5, length.out=100))

ggplot() + 
  geom_point(data = dframe1, aes(x = independent_variable + 0.2, y = dependent_variable)) +
  geom_line(data = df2, aes(x = x, y = pred))

enter image description here