You are using a quadratic algorithm:
project [] = error "Empty list of points"
project [_] = error "Single point is given"
project ps = go 10000 ps
where
go a [_] = a
go a (p:ps) = let a2 = min a $ minimum [distance p q | q<-ps]
in a2 `seq` go a2 ps
You should use a better algorithm. Search computational-geometry tag on SO for a better algorithm.
See also http://en.wikipedia.org/wiki/Closest_pair_of_points_problem .
@maxtaldykin proposes a nice, simple and effective change to the algorithm, which should make a real difference for random data -- pre-sort the points by X coordinate, and never try points more than d
units away from the current point, in X coordinate (where d
is the currently known minimal distance):
import Data.Ord (comparing)
import Data.List (sortBy)
project2 ps@(_:_:_) = go 10000 p1 t
where
(p1:t) = sortBy (comparing fst) ps
go d _ [] = d
go d p1@(x1,_) t = g2 d t
where
g2 d [] = go d (head t) (tail t)
g2 d (p2@(x2,_):r)
| x2-x1 >= d = go d (head t) (tail t)
| d2 >= d = g2 d r
| otherwise = g2 d2 r -- change it "mid-flight"
where
d2 = distance p1 p2
On random data, g2
will work in O(1)
time so that go
will take O(n)
and the whole thing will be bounded by a sort, ~ n log n
.
Empirical orders of growth show ~ n^2.1
for the first code (on 1k/2k range) and ~n^1.1
for the second, on 10k/20k range, testing it quick'n'dirty compiled-loaded into GHCi (with second code running 50 times faster than first for 2,000 points, and 80 times faster for 3,000 points).