Question

I have a IEnumerable<double> data sample. I want to compute the 90% confidence interval for the signal/data. I have MathNET library at my disposal, but I am confused as to how to correctly work with the library. Given my data, the idea is to return two additional data arrays that contain the original signal's confidence intervals

using MathNet.Numerics.Statistics;
using MathNet.Numerics.Distributions;

public static List<double[]> ConfidenceIntervals(IEnumerable<double> sample, double interval)
{
    Contract.Requires(interval > 0 && interval < 1.0);
    int sampleSize = sample.Count();
    double alpha = 1.0 - interval;
    double mean = sample.Mean();
    double sd = sample.StandardDeviation();

    double t, mu;
    double[] upper = new double[sampleSize];
    double[] lower = new double[sampleSize];
    StudentT studentT = new StudentT(mean, alpha, sampleSize - 1);
    int index = 0;
    foreach (double d in sample)
    {
        t = studentT.CumulativeDistribution(d);
        double tmp = t * (sd / Math.Sqrt(sampleSize));
        mu = mean - tmp;
        upper[index] = d + mu;
        lower[index] = d - mu;
    }
    return new List<double[]>() { upper, lower };
}

This really is not complex in terms of mathematics, I am just confused as to how to correctly use the functions/methods available to me in the MathNET library.

Was it helpful?

Solution

I'm not entirely sure I understand how the confidence interval of the signal is supposed to be applied to each sample of the signal, but we can compute the confidence interval of the sample set as follows:

public static Tuple<double, double> A(double[] samples, double interval)
{
    double theta = (interval + 1.0)/2;
    double mean = samples.Mean();
    double sd = samples.StandardDeviation();
    double T = StudentT.InvCDF(0,1,samples.Length-1,theta);
    double t = T * (sd / Math.Sqrt(samples.Length));
    return Tuple.Create(mean-t, mean+t);
}

Except that the line where we compute T does not compile because unfortunately there is no StudentT.InvCDF in current Math.NET Numerics yet. But we can still evaluate it numerically as a workaround in the meantime:

var student = new StudentT(0,1,samples.Length-1);
double T = FindRoots.OfFunction(x => student.CumulativeDistribution(x)-theta,-800,800);

For example, with 16 samples and alpha 0.05 we get 2.131 as expected. If there are more than ~60-100 samples, this can also be approximated with the normal distribution:

double T = Nomal.InvCDF(0,1,theta);

So all in all:

public static Tuple<double, double> B(double[] samples, double interval)
{
    double theta = (interval + 1.0)/2;
    double T = FindRoots.OfFunction(x => StudentT.CDF(0,1,samples.Length-1,x)-theta,-800,800);

    double mean = samples.Mean();
    double sd = samples.StandardDeviation();
    double t = T * (sd / Math.Sqrt(samples.Length));
    return Tuple.Create(mean-t, mean+t);
}

This is not the full answer yet as I understand you wanted to somehow apply the confidence interval to each sample, but hopefully it helps on the way to get there.

PS: Using Math.NET Numerics v3.0.0-alpha7

OTHER TIPS

I noticed that you didn't increase the index value in foreach loop. This will make the value at index 0 is replaced by the next calculation (When you try to set upper[index] and lower[index] values).

So I guess this is a reason why you got the incorrect results.

If so, your code should be

using MathNet.Numerics.Statistics;
using MathNet.Numerics.Distributions;

public static List<double[]> ConfidenceIntervals(IEnumerable<double> sample, double interval)
{
    Contract.Requires(interval > 0 && interval < 1.0);
    int sampleSize = sample.Count();
    double alpha = 1.0 - interval;
    double mean = sample.Mean();
    double sd = sample.StandardDeviation();

    double t, mu;
    double[] upper = new double[sampleSize];
    double[] lower = new double[sampleSize];
    StudentT studentT = new StudentT(mean, alpha, sampleSize - 1);
    int index = 0;
    foreach (double d in sample)
    {
        t = studentT.CumulativeDistribution(d);
        double tmp = t * (sd / Math.Sqrt(sampleSize));
        mu = mean - tmp;
        upper[index] = d + mu;
        lower[index] = d - mu;
        index++;
    }
    return new List<double[]>() { upper, lower };
}
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top