Question

I have a set of points in 2-dimensional space and need to calculate the distance from each point to each other point.

I have a relatively small number of points, maybe at most 100. But since I need to do this often and rapidly in order to track the relationships between these moving points, and since a naive double loop over all pairs is O(n^2) in Python, I'm looking for ways to take advantage of numpy's matrix magic (or scipy's).

As it stands in my code, the coordinates of each object are stored in its class. However, I could also update them in a numpy array when I update the class coordinate.

class Cell(object):
    """Represents one object in the field."""
    def __init__(self, id, x=0, y=0):
        self.m_id = id
        self.m_x = x
        self.m_y = y
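
For example, I could keep a parallel numpy array in sync with the objects, along these lines (the positions array and update_position helper are just a sketch, assuming m_id is a valid row index):

import numpy as np

n_cells = 100  # illustrative size

# One row of (x, y) per cell, kept in sync with the objects.
positions = np.zeros((n_cells, 2))

def update_position(cell, x, y):
    """Update both the object and its row in the shared array."""
    cell.m_x, cell.m_y = x, y
    positions[cell.m_id] = (x, y)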

It occurs to me to create a Euclidean distance matrix to prevent duplication, but perhaps you have a cleverer data structure.

I'm open to pointers to nifty algorithms as well.

Also, I note that there are similar questions dealing with Euclidean distance and numpy, but I didn't find any that directly address this question of efficiently populating a full distance matrix.

Solution

You can take advantage of the complex type:

# build a complex array of your cells
z = np.array([complex(c.m_x, c.m_y) for c in cells])

First solution

# mesh this array so that you will have all combinations
m, n = np.meshgrid(z, z)
# get the distance via the norm
out = abs(m-n)

Second solution

Meshing is the main idea. But numpy is clever, so you don't have to generate m and n explicitly. Just compute the difference between z and a version of z with an extra axis; the meshing is done automatically by broadcasting:

out = abs(z[..., np.newaxis] - z)
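
Here z[..., np.newaxis] has shape (n, 1) while z has shape (n,), so the subtraction broadcasts to an (n, n) array of pairwise differences. A quick shape check on the three points from the example below:

>>> z = np.array([0.+0.j, 2.+1.j, -1.+4.j])
>>> (z[..., np.newaxis] - z).shape
(3, 3)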

Third solution

And if z is directly set up as a 2-dimensional array, you can use z.T instead of the somewhat awkward z[..., np.newaxis]. So finally, your code will look like this:

z = np.array([[complex(c.m_x, c.m_y) for c in cells]]) # notice the [[ ... ]]
out = abs(z.T-z)

Example

>>> z = np.array([[0.+0.j, 2.+1.j, -1.+4.j]])
>>> abs(z.T-z)
array([[ 0.        ,  2.23606798,  4.12310563],
       [ 2.23606798,  0.        ,  4.24264069],
       [ 4.12310563,  4.24264069,  0.        ]])

As a complement, you may want to remove duplicates afterwards, taking the upper triangle:

>>> np.triu(out)
array([[ 0.        ,  2.23606798,  4.12310563],
       [ 0.        ,  0.        ,  4.24264069],
       [ 0.        ,  0.        ,  0.        ]])
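
If you would rather have the unique pairwise distances as a flat vector than as a matrix padded with zeros, one option (not part of the original answer) is np.triu_indices:

>>> i, j = np.triu_indices(out.shape[0], k=1)  # indices strictly above the diagonal
>>> out[i, j]
array([ 2.23606798,  4.12310563,  4.24264069])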

Some benchmarks

>>> timeit.timeit('abs(z.T-z)', setup='import numpy as np;z = np.array([[0.+0.j, 2.+1.j, -1.+4.j]])')
4.645645342274779
>>> timeit.timeit('abs(z[..., np.newaxis] - z)', setup='import numpy as np;z = np.array([0.+0.j, 2.+1.j, -1.+4.j])')
5.049334864854522
>>> timeit.timeit('m, n = np.meshgrid(z, z); abs(m-n)', setup='import numpy as np;z = np.array([0.+0.j, 2.+1.j, -1.+4.j])')
22.489568296184686

OTHER TIPS

If you don't need the full distance matrix, you will be better off using a k-d tree. Consider scipy.spatial.cKDTree or sklearn.neighbors.KDTree. A k-d tree can find the k nearest neighbours of every point in O(n log n) time overall, so you avoid the O(n**2) cost of computing all n-by-n distances.
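
As a rough sketch of what the k-d-tree route can look like with scipy.spatial.cKDTree (the points and the query parameters here are made up for illustration):

import numpy as np
from scipy.spatial import cKDTree

points = np.random.rand(100, 2)  # 100 random points in 2D
tree = cKDTree(points)

# Distances and indices of the 5 nearest neighbours of every point
# (each point's nearest "neighbour" is itself, at distance 0).
dist, idx = tree.query(points, k=5)

# Or: all index pairs closer than a given radius, without an n x n matrix.
pairs = tree.query_pairs(r=0.1)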

Jake VanderPlas gives this example using broadcasting in the Python Data Science Handbook, which is very similar to what @shx2 proposed.

import numpy as np

rand = np.random.RandomState(42)  # reproducible random points
X = rand.rand(3, 2)               # 3 points in 2D
dist_sq = np.sum((X[:, np.newaxis, :] - X[np.newaxis, :, :]) ** 2, axis=-1)

dist_sq
array([[0.        , 0.18543317, 0.81602495],
       [0.18543317, 0.        , 0.22819282],
       [0.81602495, 0.22819282, 0.        ]])
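
Note that these are squared distances; if you need the actual Euclidean distances, take the square root afterwards:

dist = np.sqrt(dist_sq)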

Here is how you can do it using numpy:

import numpy as np

x = np.array([0,1,2])
y = np.array([2,4,6])

# take advantage of broadcasting, to make a 2dim array of diffs
dx = x[..., np.newaxis] - x[np.newaxis, ...]
dy = y[..., np.newaxis] - y[np.newaxis, ...]
dx
=> array([[ 0, -1, -2],
          [ 1,  0, -1],
          [ 2,  1,  0]])

# stack in one array, to speed up calculations
d = np.array([dx,dy])
d.shape
=> (2, 3, 3)

Now all that is left is computing the L2 norm along axis 0 (as discussed here):

(d**2).sum(axis=0)**0.5
=> array([[ 0.        ,  2.23606798,  4.47213595],
          [ 2.23606798,  0.        ,  2.23606798],
          [ 4.47213595,  2.23606798,  0.        ]])
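
Equivalently (not part of the original answer), NumPy's built-ins can handle the norm step:

np.hypot(dx, dy)           # element-wise sqrt(dx**2 + dy**2), same result
np.linalg.norm(d, axis=0)  # L2 norm along the stacked axis, same result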

If you are looking for the most efficient way of computing this, use SciPy's cdist() (or pdist() if you just need the vector of pairwise distances instead of the full distance matrix), as suggested in Tweakimp's comment. As he said, it's a lot faster than the vectorization-and-broadcasting methods proposed by RichPauloo and shx2, because under the hood cdist() and pdist() compute the metrics in plain C loops, which beat even vectorized NumPy code.
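
A minimal sketch of that route (the array contents are just an illustration):

import numpy as np
from scipy.spatial.distance import cdist, pdist, squareform

pts = np.array([[0.0, 0.0], [2.0, 1.0], [-1.0, 4.0]])

full = cdist(pts, pts)          # full n x n distance matrix
condensed = pdist(pts)          # only the n*(n-1)/2 unique pairwise distances
square = squareform(condensed)  # expand back to a square matrix if needed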

By the way, if you can use SciPy and still prefer the broadcasting method, you don't have to implement it yourself, as the distance_matrix() function is a pure-Python implementation that leverages broadcasting and vectorization (source code, docs).

It's worth mentioning that cdist()/pdist() is also more memory-efficient than broadcasting, as it computes the distances one by one and avoids creating an intermediate array of n*n*d elements, where n is the number of points and d is their dimensionality.
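
To put a number on that: for the 10000 2-D points used in the experiments below, the broadcasting approach materializes a 10000 x 10000 x 2 array of float64 differences, roughly 1.6 GB, before reducing it, whereas cdist() only ever allocates the 10000 x 10000 result (about 0.8 GB).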

Experiments

I've conducted some simple experiments to compare the performance of SciPy's cdist(), distance_matrix(), and the broadcasting implementation in NumPy. I used perf_counter_ns() from Python's time module to measure time, and all results are averaged over 10 runs on 10000 points in 2D space using the np.float64 datatype (tested on Python 3.8.10, Windows 10, Ryzen 2700, 16 GB RAM):

  • cdist() - 0.6724s
  • distance_matrix() - 3.0128s
  • my NumPy implementation - 3.6931s

Code, in case someone wants to reproduce the experiments:

from scipy.spatial import distance, distance_matrix
import numpy as np
from time import perf_counter_ns


def dist_mat_custom(a, b):
    # Broadcasting-based distance matrix (the NumPy approach shown above).
    return np.sqrt(np.sum(np.square(a[:, np.newaxis, :] - b[np.newaxis, :, :]), axis=-1))


results = []
size = 10000
it_num = 10
for i in range(it_num):
    a = np.random.normal(size=(size, 2))
    b = np.random.normal(size=(size, 2))
    start = perf_counter_ns()
    c = distance_matrix(a, b)
    #c = dist_mat_custom(a, b)
    #c = distance.cdist(a, b)
    results.append(perf_counter_ns() - start)
print(np.mean(results) / 1e9)  # mean time in seconds
Licensed under: CC-BY-SA with attribution