Second order gradient in numpy

Question 1

There's no universal right answer for numerical gradient calculation. Before you can calculate the gradient about sample data, you have to make some assumption about the underlying function that generated that data. You can technically use np.diff for gradient calculation. Using np.gradient is a reasonable approach. I don't see anything fundamentally wrong with what you are doing---it's one particular approximation of the 2nd derivative of a 1-D function.

Question 2

I'll second @jrennie's first sentence - it can all depend. The numpy.gradient function requires that the data be evenly spaced (although allows for different distances in each direction if multi-dimensional). If your data does not adhere to this, than numpy.gradient isn't going to be much use. Experimental data may have (OK, will have) noise on it, in addition to not necessarily being all evenly spaced. In this case it might be better to use one of the scipy.interpolate spline functions (or objects). These can take unevenly spaced data, allow for smoothing, and can return derivatives up to k-1 where k is the order of the spline fit requested. The default value for k is 3, so a second derivative is just fine. Example:

spl = scipy.interpolate.splrep(x,y,k=3) # no smoothing, 3rd order spline
ddy = scipy.interpolate.splev(x,spl,der=2) # use those knots to get second derivative

The object oriented splines like scipy.interpolate.UnivariateSpline have methods for the derivatives. Note that the derivative methods are implemented in Scipy 0.13 and are not present in 0.12.

Note that, as pointed out by @JosephCottham in comments in 2018, this answer (good for Numpy 1.08 at least), is no longer applicable since (at least) Numpy 1.14. Check your version number and the available options for the call.

Question 3

The double gradient approach fails for discontinuities in the first derivative. As the gradient function takes one data point to the left and to the right into account, this continues/spreads when applying it multiple times.

On the other hand side, the second derivative can be calculated by the formula

d^2 f(x[i]) / dx^2 = (f(x[i-1]) - 2*f(x[i]) + f(x[i+1])) / h^2

compare here. This has the advantage to just take the two neighboring pixels into account.

In the picture the double np.gradient approach (left) and the above mentioned formula (right), as implemented by np.diff are compared. As f(x) has only one kink at zero, the second derivative (green) should only there have a peak. As the double gradient solution takes 2 neighboring points in each direction into account, this leads to finite second derivative values at +/- 1.

In some cases, however, you may want to prefer the double gradient solution, as this is more robust to noise.

I am not sure why there is np.gradient and np.diff, but a reason might be, that the second argument of np.gradient defines the pixel distance (for each dimension) and for images it can be applied for both dimensions simultaneously gy, gx = np.gradient(a).

Code

import numpy as np
import matplotlib.pyplot as plt

xs = np.arange(-5,6,1)
f = np.abs(xs)
f_x = np.gradient(f)
f_xx_bad = np.gradient(f_x)
f_xx_good = np.diff(f, 2)
test = f[:-2] - 2* f[1:-1] + f[2:]





# lets plot all this

fig, axs = plt.subplots(1, 2, figsize=(9, 3), sharey=True)

ax = axs[0]
ax.set_title('bad: double gradient')
ax.plot(xs, f, marker='o', label='f(x)')
ax.plot(xs, f_x, marker='o', label='d f(x) / dx')
ax.plot(xs, f_xx_bad, marker='o', label='d^2 f(x) / dx^2')
ax.legend()

ax = axs[1]
ax.set_title('good: diff with n=2')
ax.plot(xs, f, marker='o', label='f(x)')
ax.plot(xs, f_x, marker='o', label='d f(x) / dx')
ax.plot(xs[1:-1], f_xx_good, marker='o', label='d^2 f(x) / dx^2')
ax.plot(xs[1:-1], test, marker='o', label='test', markersize=1)
ax.legend()

Question 4

As I keep stepping over this problem in one form or the other again and again, I decided to write a function gradient_n, which adds an differentiation oder functionality to np.gradient. Not all functionalities of np.gradient are supported, like differentiation of mutiple axis.

Like np.gradient, gradient_n returns the differentiated result in the same shape as the input. Also a pixel distance argument (d) is supported.

import numpy as np

def gradient_n(arr, n, d=1, axis=0):
    """Differentiate np.ndarray n times.

    Similar to np.diff, but additional support of pixel distance d
    and padding of the result to the same shape as arr.

    If n is even: np.diff is applied and the result is zero-padded
    If n is odd: 
        np.diff is applied n-1 times and zero-padded.
        Then gradient is applied. This ensures the right output shape.
    """
    n2 = int((n // 2) * 2)
    diff = arr

    if n2 > 0:
        a0 = max(0, axis)
        a1 = max(0, arr.ndim-axis-1)
        diff = np.diff(arr, n2, axis=axis) / d**n2
        diff = np.pad(diff, tuple([(0,0)]*a0 + [(1,1)] +[(0,0)]*a1),
                    'constant', constant_values=0)

    if n > n2:
        assert n-n2 == 1, 'n={:f}, n2={:f}'.format(n, n2)
        diff = np.gradient(diff, d, axis=axis)

    return diff

def test_gradient_n():
    import matplotlib.pyplot as plt

    x = np.linspace(-4, 4, 17)
    y = np.linspace(-2, 2, 9)
    X, Y = np.meshgrid(x, y)
    arr = np.abs(X)
    arr_x = np.gradient(arr, .5, axis=1)
    arr_x2 = gradient_n(arr, 1, .5, axis=1)
    arr_xx = np.diff(arr, 2, axis=1) / .5**2
    arr_xx = np.pad(arr_xx, ((0, 0), (1, 1)), 'constant', constant_values=0)
    arr_xx2 = gradient_n(arr, 2, .5, axis=1)

    assert np.sum(arr_x - arr_x2) == 0
    assert np.sum(arr_xx - arr_xx2) == 0

    fig, axs = plt.subplots(2, 2, figsize=(29, 21))
    axs = np.array(axs).flatten()

    ax = axs[0]
    ax.set_title('x-cut')
    ax.plot(x, arr[0, :], marker='o', label='arr')
    ax.plot(x, arr_x[0, :], marker='o', label='arr_x')
    ax.plot(x, arr_x2[0, :], marker='x', label='arr_x2', ls='--')
    ax.plot(x, arr_xx[0, :], marker='o', label='arr_xx')
    ax.plot(x, arr_xx2[0, :], marker='x', label='arr_xx2', ls='--')
    ax.legend()

    ax = axs[1]
    ax.set_title('arr')
    im = ax.imshow(arr, cmap='bwr')
    cbar = ax.figure.colorbar(im, ax=ax, pad=.05)

    ax = axs[2]
    ax.set_title('arr_x')
    im = ax.imshow(arr_x, cmap='bwr')
    cbar = ax.figure.colorbar(im, ax=ax, pad=.05)

    ax = axs[3]
    ax.set_title('arr_xx')
    im = ax.imshow(arr_xx, cmap='bwr')
    cbar = ax.figure.colorbar(im, ax=ax, pad=.05)

test_gradient_n()

Question 5

This is an excerpt from the original documentation (at the time of writing found at http://docs.scipy.org/doc/numpy/reference/generated/numpy.gradient.html). It states that unless the sampling distance is 1 you need to include a list containing the distances as an argument.

numpy.gradient(f, *varargs, **kwargs)
Return the gradient of an N-dimensional array.

The gradient is computed using second order accurate central differences in the interior and either first differences or second order accurate one-sides (forward or backwards) differences at the boundaries. The returned gradient hence has the same shape as the input array.

Parameters:
f : array_like An N-dimensional array containing samples of a scalar function.

varargs : list of scalar, optional N scalars specifying the sample distances for each dimension, i.e. dx, dy, dz, ... Default distance: 1.

edge_order : {1, 2}, optional Gradient is calculated using Nth order accurate differences at the boundaries. Default: 1. New in version 1.9.1.

Returns:
gradient : ndarray N arrays of the same shape as f giving the derivative of f with respect to each dimension.

Question 6

My solution is to create a function similar to np.gradient that calculates the 2nd derivatives numerically from the array data.

import numpy as np

def gradient2_even(y, h=None, edge_order=1):
    """
    Return the 2nd-order gradient i.e. 
    2nd derivatives of y with n samples and k components.
    
    The 2nd-order gradient is computed using second-order-accurate central differences
    in the interior points and either first or second order accurate one-sided
    (forward or backwards) differences at the boundaries.
    The returned gradient hence has the same shape as the input array.
    
    Parameters
    ----------
    y : 1d or 2d array_like
        The array containing the samples. If 2d with shape (n,k),
        n is the number of samples at least 2 while k is the number of
        y series/components. 1d input is equivalent to 2d input with shape (n,1).
    h : constant or 1d, optional
        spacing between the y samples. Default unitary spacing for
        all y components. Spacing can be specified using:
            
            1. Single scalar spacing value for all y components.
            2. 1d array_like of length k specifying the spacing for each y component
    
    edge_order : {1, 2}, optional
                 Order 1 means 3-point forward/backward finite differences
                 are used to calculate the 2nd derivatves at the edge points while
                 order 2 uses 4-point forward/backward finite differences.
    
    Returns
    ----------
    d2y : 1d or 2d array
        Array containing the 2nd derivatives. The output shape is the same as y.
    """
    if edge_order!=1 and edge_order!=2:
        raise ValueError('edge_order must be 1 or 2.')
    else:
        pass
    
    y = np.asfarray(y)
    origshape = y.shape
    if y.ndim!=1 and y.ndim!=2:
        raise ValueError('y can only be 1d or 2d.')
    elif y.ndim==1:
        y = np.atleast_2d(y).T
    elif y.ndim==2:
        if y.shape[0]<2:
            raise ValueError('The number of y samples must be atleast 2.')
        else:
            pass
    else:
        pass
    n,k = y.shape
    
    if h is None:
        h = 1.0
    else:
        h = np.asfarray(h)
        if h.ndim!=0 and h.ndim!=1:
            raise ValueError('h can only be 0d or 1d.')
        elif h.ndim==0:
            pass
        elif h.ndim==1 and h.size!=n:
            raise ValueError('If h is 1d, it must have the same number as the components of y.')
        else:
            pass
    d2y = np.zeros_like(y)
    if n==2:
        pass
    elif n==3:
        d2y[:] = ( 1/h**2 * (y[0] - 2*y[1] + y[2]) )
    else:
        d2y = np.zeros_like(y)
        d2y[1:-1]=1/h**2 * ( y[:-2] - 2*y[1:-1] + y[2:] )
        if edge_order==1:
            d2y[0]=1/h**2 * ( y[0] - 2*y[1] + y[2] )
            d2y[-1]=1/h**2 * ( y[-1] - 2*y[-2] + y[-3] )
        else:
            d2y[0]=1/h**2 * ( 2*y[0] - 5*y[1] + 4*y[2] - y[3] )
            d2y[-1]=1/h**2 * ( 2*y[-1] - 5*y[-2] + 4*y[-3] - y[-4] )
    return d2y.reshape(origshape)

Using your example,

# After importing the function from the script file or running it
from numpy import *
from matplotlib.pyplot import *

x, h = linspace(0, 10, 17) # use a fairly coarse grid to see the discrepancies better
y = sin(x)
ypp = -sin(x) # analytical 2nd derivatives

# Compute numerically the 2nd derivatives using 2nd-order finite differences at the edge points

d2y = gradient2_even(y, h, 2)

# Compute numerically the 2nd derivatives using nested gradient function
d2y2 = gradient(gradient(y, h, edge_order=2), h, edge_order=2)

# Compute numerically the 2nd derivatives using 1st-order finite differences at the edge points
d2y3 = gradient2_even(y, h, 1)

fig,ax=subplots(1,1)
ax.plot(x, ypp, x, d2y, 'o', x, d2y2, 'o', x, d2y3, 'o'), ax.grid()
ax.legend(['Analytical', 'edge_order=2', 'nested gradient', 'edge_order=1'])
fig.tight_layout()