Question

To fit a neural network to a dataset using the R function nnet, I learned that when the cases are unevenly distributed across classes, I should weight each case appropriately (http://cowlet.org/2014/01/12/understanding-data-science-classification-with-neural-networks-in-r.html).

The R function nnet has a "weights" argument, and I would like to know exactly what it does. The help file only says "(case) weights for each example – if missing defaults to 1", which is not very clear to me. I originally thought that the weights affect only the determination of the threshold and not the back-propagation algorithm. However, my naive guess seems to be incorrect. To see this, I generated a very simple dataset with two unevenly distributed classes:

 library(nnet)

 p1 <- 0.05
 p2 <- 1 - p1
 Ntot <- 2000
 class <- sample(1:2,Ntot,prob=c(p1,p2),replace=TRUE)
 dat <- scale(cbind(f1=rnorm(Ntot,mean=class), f2=rnorm(Ntot,mean=class,sd=0.01)))

I then fitted two nnet models: one with each case weighted by the proportion of its class, and another with all weights equal to 1.

 myWeight <- rep(NA,length(class))
 myWeight[class==1] <- p1
 myWeight[class==2] <- p2
 set.seed(1)
 fitw <- nnet(class~.,data=dat,weights=myWeight,size=3,decay=0.1)
 set.seed(1)
 fit0 <- nnet(class~.,data=dat,size=3,decay=0.1)

Now I estimate the response values (ranging between 0 and 1).

 pred.raw.w <- predict(fitw,type="raw")
 pred.raw0 <- predict(fit0,type="raw")

 head(pred.raw.w)
 head(pred.raw0)

If my naive guess were true, I would have seen the same raw response estimates. Instead, the two sets of response values are different! This means that the weights must affect the back-propagation computation itself (and not just the threshold). Can anyone tell me what exactly the weights argument does, or direct me to a reference?

Solution

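'Case weights' are importance weights for individual observations: each case's contribution to the fitting criterion is multiplied by its weight, so heavily weighted cases pull the fitted parameters towards themselves more strongly. That is why the raw predictions change, and not just the classification threshold. Weights can therefore be used to make the ML algorithm focus on certain parts of the data.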

Take, for example, the problem of forecasting sales for a store. It might be more important to forecast sales accurately around weekends and holidays, as the majority of a store's sales volume occurs during those times. You can then add a column of weights that is '1' for weekdays and '2' for weekends/holidays.
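
To make the idea concrete, here is a minimal base-R sketch of a case-weighted loss for a single logistic unit. It is not nnet's internal code: the helper names `weighted_loss` and `weighted_grad`, the simulated data and the parameter value `theta` are all assumptions made for illustration. It only shows that a weight of 2 behaves like repeating a case twice, and that the weights change the gradient used during fitting.

```r
# Hypothetical helper: case-weighted cross-entropy for one logistic unit.
# Illustration only; this is not how nnet is implemented internally.
weighted_loss <- function(theta, X, y, w = rep(1, length(y))) {
  p <- plogis(drop(X %*% theta))            # predicted probabilities
  sum(w * -(y * log(p) + (1 - y) * log(1 - p)))
}

# Gradient of the weighted loss: each case's term is scaled by its weight.
weighted_grad <- function(theta, X, y, w = rep(1, length(y))) {
  p <- plogis(drop(X %*% theta))
  drop(crossprod(X, w * (p - y)))
}

set.seed(1)
n <- 200
X <- cbind(1, rnorm(n))                     # intercept plus one feature
y <- rbinom(n, 1, 0.1)                      # unbalanced 0/1 labels
theta <- c(0.3, -0.5)                       # arbitrary parameter value

w   <- ifelse(y == 1, 2, 1)                 # count each minority case twice
dup <- c(seq_len(n), which(y == 1))         # same data, minority rows duplicated

loss_weighted   <- weighted_loss(theta, X, y, w)
loss_duplicated <- weighted_loss(theta, X[dup, , drop = FALSE], y[dup])
grad_weighted   <- weighted_grad(theta, X, y, w)
grad_unweighted <- weighted_grad(theta, X, y)

# Weighted loss vs. loss on the duplicated data
same_loss <- isTRUE(all.equal(loss_weighted, loss_duplicated))
# Weighted gradient vs. unweighted gradient
same_grad <- isTRUE(all.equal(grad_weighted, grad_unweighted))
print(c(same_loss = same_loss, same_grad = same_grad))
```

Because the gradient itself depends on the weights, an optimiser started from the same initial values will generally end up at different parameters, which is consistent with the different raw predictions seen in the question.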

Licensed under: CC-BY-SA with attribution