Вопрос

I am working with small sample size data:

>dput(dat.demand2050.unique)  
c(79, 56, 69, 61, 53, 73, 72, 86, 75, 68, 74.2, 80, 65.6, 60, 54)    

for which the density distribution looks like this:
pdf of data

I know that the values are from two regimes - low and high - and assuming that the underlying process is normal, I used the mixtools package to fit a bimodal distribution:

set.seed(99)  
dat.demand2050.mixmdl <- normalmixEM(dat.demand2050.unique, lambda=c(0.3,0.7), mu=c(60,70), k=2)

which gives me the following result:
enter image description here
(the solid lines are fitted curves and the dashed line is the original density).

# get the parameters of the mixture
dat.demand2050.mixmdl.prop <- dat.demand2050.mixmdl$lambda    #mix proportions
dat.demand2050.mixmdl.means <- dat.demand2050.mixmdl$mu    #modal means
dat.demand2050.mixmdl.dev <- dat.demand2050.mixmdl$sigma   #modal std dev  

The mixture parameters are:

>dat.demand2050.mixmdl.prop  #mix proportions  
[1] 0.2783939 0.7216061  
>dat.demand2050.mixmdl.means  #modal means  
[1] 56.21150 73.08389  
>dat.demand2050.mixmdl.dev  #modal std dev  
[1] 3.098292 6.413906 

I have the following questions:

  1. To generate a new set of values that approximates the underlying distribution, is my approach correct or is there a better workflow?
  2. If my approach is correct, how can I use this result to generate a set of random values from this mixed distribution?
Это было полезно?

Решение

Your sample size is a bit dubious to be fitting mixtures, but never mind that. You can sample from the fitted mixture as follows:

probs <- dat.demand2050.mixmdl$lambda
m <- dat.demand2050.mixmdl$mu
s <- at.demand2050.mixmdl$sigma

N <- 1e5
grp <- sample(length(probs), N, replace=TRUE, prob=probs)
x <- rnorm(N, m[grp], s[grp])

Другие советы

Your approach is correct.

For each sample from your mixed distribution you just need to choose which of the two component Gaussian distributions the sample should come from and then draw the sample from that distribution.

You can choose between the two distributions using the mixture proportions you have found: simulate a random number between 0 and 1 and sample from the first distribution if it the random number is less than the first proportion, otherwise sample from the second distribution.

Finally, sample from the relevant Gaussian distribution using the rnorm function.

dat.demand2050.mixmdl.prop=c(0.2783939,0.7216061)
dat.demand2050.mixmdl.means=c(56.21150,73.08389)
dat.demand2050.mixmdl.dev=c(3.098292,6.413906)

sampleMixture=function(prop,means,dev){
    # Generate a uniformly distributed random number between 0 and 1
    # in order to choose between the two component distributions
    distTest=runif(1)
    if(distTest<prop[1]){
        # Then sample from the first component of the mixture
        sample=rnorm(1,mean=means[1],sd=dev[1])
    }else{
        # Sample from the second component of the mixture
        sample=rnorm(1,mean=means[2],sd=dev[2])
    }
    return(sample)
}

# Generate a single sample
sampleMixture(dat.demand2050.mixmdl.prop,dat.demand2050.mixmdl.means,dat.demand2050.mixmdl.dev)

# Generate 100 samples and plot resulting distribution
samples=replicate(100,sampleMixture(dat.demand2050.mixmdl.prop,dat.demand2050.mixmdl.means,dat.demand2050.mixmdl.dev))
plot(density(samples))
Лицензировано под: CC-BY-SA с атрибуция
Не связан с StackOverflow
scroll top