Question

I have two arrays of equal length. The following function attempts to calculate the slope from these arrays: it returns the average of the slopes between each pair of consecutive points. For the following data set, I seem to be getting a different value than Excel and Google Docs.

double[] x_values = { 1932, 1936, 1948, 1952, 1956, 1960, 1964, 1968,
        1972, 1976, 1980 };
double[] y_values = { 197, 203, 198, 204, 212, 216, 218, 224, 223, 225,
        236 };



public static double getSlope(double[] x_values, double[] y_values)
        throws Exception {

    if (x_values.length != y_values.length)
        throw new Exception();

    double slope = 0;

    for (int i = 0; i < (x_values.length - 1); i++) {
        double y_2 = y_values[i + 1];
        double y_1 = y_values[i];

        double delta_y = y_2 - y_1;

        double x_2 = x_values[i + 1];
        double x_1 = x_values[i];

        double delta_x = x_2 - x_1;

        slope += delta_y / delta_x;
    }

    System.out.println(x_values.length);
    return slope / (x_values.length);
}

Output

Google: 0.755

getSlope(): 0.962121212121212

Excel: 0.7501


Solution

I bet the other two methods are computing the least-squares fit, whereas you are not.

When I check this conjecture using R, I too get a slope of about 0.755:

> summary(lm(y~x))

Call:
lm(formula = y ~ x)

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept) -1.265e+03  1.793e+02  -7.053 5.97e-05 ***
x            7.551e-01  9.155e-02   8.247 1.73e-05 ***

The relevant number is the 7.551e-01. It is also worth noting that the line has an intercept of about -1265.

Here is a picture of the least-squares fit:

[Figure: lm fit]

As for implementing this in your code, see the question "Compute least squares using java".
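
As a minimal sketch of what that could look like with the arrays from the question: the least-squares slope is the sum of (x - mean of x) * (y - mean of y) divided by the sum of (x - mean of x)^2. The class and method names below (LeastSquaresSketch, slope) are made up for illustration and do not come from any library.

```java
class LeastSquaresSketch {

    // Least-squares slope: sum((x - xMean) * (y - yMean)) / sum((x - xMean)^2).
    static double slope(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("x and y must have the same length");
        }
        if (x.length < 2) {
            throw new IllegalArgumentException("need at least two points");
        }

        // First pass: the means of x and y.
        double xMean = 0, yMean = 0;
        for (int i = 0; i < x.length; i++) {
            xMean += x[i];
            yMean += y[i];
        }
        xMean /= x.length;
        yMean /= y.length;

        // Second pass: the two sums that make up the slope.
        double sxy = 0, sxx = 0;
        for (int i = 0; i < x.length; i++) {
            double dx = x[i] - xMean;
            sxy += dx * (y[i] - yMean);
            sxx += dx * dx;
        }
        return sxy / sxx; // NaN if all x values are identical (0 / 0)
    }

    public static void main(String[] args) {
        // Data from the question.
        double[] x = { 1932, 1936, 1948, 1952, 1956, 1960, 1964, 1968,
                1972, 1976, 1980 };
        double[] y = { 197, 203, 198, 204, 212, 216, 218, 224, 223, 225,
                236 };
        System.out.println(slope(x, y));
    }
}
```

For the data in the question this prints a value of about 0.755, in line with the R output above.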

OTHER TIPS

This function will not help you much, as it does not take into account the widths of the various line segments. Consider the difference between applying it to the points (0,0), (1000,1000), and (1001,2000) versus (0,0), (1,1), and (2,1001). Both cases have successive slopes of 1 and 1000, yet they look very different.
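
To make that concrete, here is a small sketch that runs both point sets through an unweighted average of segment slopes and through a least-squares fit. The class and method names (SegmentVsFit, averageSegmentSlope, leastSquaresSlope) are hypothetical, chosen only for this illustration.

```java
class SegmentVsFit {

    // Unweighted mean of the slopes between consecutive points.
    static double averageSegmentSlope(double[] x, double[] y) {
        double sum = 0;
        for (int i = 0; i < x.length - 1; i++) {
            sum += (y[i + 1] - y[i]) / (x[i + 1] - x[i]);
        }
        return sum / (x.length - 1);
    }

    // Slope of the least-squares line through the points.
    static double leastSquaresSlope(double[] x, double[] y) {
        double xMean = 0, yMean = 0;
        for (int i = 0; i < x.length; i++) {
            xMean += x[i];
            yMean += y[i];
        }
        xMean /= x.length;
        yMean /= y.length;

        double sxy = 0, sxx = 0;
        for (int i = 0; i < x.length; i++) {
            sxy += (x[i] - xMean) * (y[i] - yMean);
            sxx += (x[i] - xMean) * (x[i] - xMean);
        }
        return sxy / sxx;
    }

    public static void main(String[] args) {
        double[] xa = { 0, 1000, 1001 }, ya = { 0, 1000, 2000 };
        double[] xb = { 0, 1, 2 },       yb = { 0, 1, 1001 };

        System.out.println("A: average = " + averageSegmentSlope(xa, ya)
                + ", least squares = " + leastSquaresSlope(xa, ya));
        System.out.println("B: average = " + averageSegmentSlope(xb, yb)
                + ", least squares = " + leastSquaresSlope(xb, yb));
    }
}
```

The averaged segment slope is 500.5 for both sets, while the least-squares slope is roughly 1.5 for the first set and 500.5 for the second, so only the fit distinguishes the two shapes.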

You need to implement the method of least squares (http://en.wikipedia.org/wiki/Least_squares) to find the line that best approximates your data set.
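
For a straight-line fit y ≈ α + βx, the least-squares estimates have a closed form. The symbols here are a choice of notation for this note, with x̄ and ȳ standing for the sample means:

```latex
\hat{\beta} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
\hat{\alpha} = \bar{y} - \hat{\beta}\,\bar{x}
```

Here β̂ is the slope and α̂ is the intercept.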

One more piece of advice: never throw a bare java.lang.Exception. Always choose a more specific exception, even if you must write the class yourself. Otherwise, callers are forced to catch java.lang.Exception, which also catches unrelated exceptions from their own code.

Edit: use the Apache Commons Math class SimpleRegression if that's an option. Otherwise, here's a method that computes the least-squares slope and returns the intercept (return slope instead if that is what you need). It should yield the same results as Excel and Apache Commons Math:

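// Needs java.util.Collection, java.util.List and java.util.stream.Collectors.
// Computes the least-squares slope internally and returns the intercept.
private static double intercept(List<Double> yList, List<Double> xList) {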
    if (yList.size() != xList.size())
        throw new IllegalArgumentException("Number of y and x must be the same");
    if (yList.size() < 2)
        throw new IllegalArgumentException("Need at least 2 y, x");

    double yAvg = average(yList);
    double xAvg = average(xList);

    double sumNumerator = 0d;
    double sumDenominator = 0d;
    for (int i = 0; i < yList.size(); i++) {
        double y = yList.get(i);
        double x = xList.get(i);
        double yDiff = y - yAvg;
        double xDiff = x - xAvg;
        double numerator = xDiff * yDiff;
        double denominator = xDiff * xDiff;
        sumNumerator += numerator;
        sumDenominator += denominator;
    }

    double slope = sumNumerator / sumDenominator;
    double intercept = yAvg - (slope * xAvg);
    return intercept;
}

private static double average(Collection<Double> doubles) {
    return doubles.stream().collect(Collectors.averagingDouble(d -> d));
}

Sources: Excel documentation for SLOPE and INTERCEPT.

You should be dividing by x_values.length - 1, because n points give only n - 1 pairwise slopes.
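
A minimal sketch of that divisor fix, applied to the data from the question. The class and method names (AverageSlopeFix, averageSlope) are made up for this example.

```java
class AverageSlopeFix {

    // Mean of the slopes between consecutive points: n points give n - 1 slopes.
    static double averageSlope(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("x and y must have the same length");
        }
        if (x.length < 2) {
            throw new IllegalArgumentException("need at least two points");
        }
        double sum = 0;
        for (int i = 0; i < x.length - 1; i++) {
            sum += (y[i + 1] - y[i]) / (x[i + 1] - x[i]);
        }
        return sum / (x.length - 1); // divide by the number of segments, not points
    }

    public static void main(String[] args) {
        // Data from the question.
        double[] x = { 1932, 1936, 1948, 1952, 1956, 1960, 1964, 1968,
                1972, 1976, 1980 };
        double[] y = { 197, 203, 198, 204, 212, 216, 218, 224, 223, 225,
                236 };
        System.out.println(averageSlope(x, y));
    }
}
```

With the corrected divisor the result for this data is about 1.058 rather than 0.962. That is still an average of segment slopes, so it does not match the regression slope of about 0.755 that Excel and Google Docs report.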

Edit: the Wikipedia example in my comments shows how to calculate the alpha (intercept) and beta (slope) of the linear regression line.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow