Negating the truthiness of a boolean evaluation causing 5x slowdown?

https://stackoverflow.com/questions/3731481

03-10-2019
|

Question

I'm trying to implement an octree, and for that, I need a fast AABB-ray intersection algorithm. After some searching, I came across this paper that seemed to offer that. From the source code, available here, I translated the pluecker_cls_cff function to C# as this:

public bool Intersect_2(ref RayPluecker r)
{
  switch (r.Classification)
  {

    // 7 same-ish cases snipped

    case Classification.PPP:

      return !((r.Position.X > this.Max.X) || (r.Position.Y > this.Max.Y) || (r.Position.Z > this.Max.Z) ||
        (r.PlueckerCoefficient.X + r.Direction.X * this.Max.Y - r.Direction.Y * this.Min.X < 0) ||
        (r.PlueckerCoefficient.X + r.Direction.X * this.Min.Y - r.Direction.Y * this.Max.X > 0) ||
        (r.PlueckerCoefficient.Y + r.Direction.X * this.Min.Z - r.Direction.Z * this.Max.X > 0) ||
        (r.PlueckerCoefficient.Y + r.Direction.X * this.Max.Z - r.Direction.Z * this.Min.X < 0) ||
        (r.PlueckerCoefficient.Z - r.Direction.Z * this.Min.Y + r.Direction.Y * this.Max.Z < 0) ||
        (r.PlueckerCoefficient.Z - r.Direction.Z * this.Max.Y + r.Direction.Y * this.Min.Z > 0));
  }

  return false;
}

This seems to work fine, but it seemed fairly slow to me (250ms to do 10 million intersects) so I tried some micro-benchmarking with different varieties. In one, I removed the negation that is right after the return statement and reversed all comparisons (> to < and visa versa).

It's now:

case Classification.PPP:

      return ((r.Position.X < this.Max.X) || (r.Position.Y < this.Max.Y) || (r.Position.Z < this.Max.Z) ||
        (r.PlueckerCoefficient.X + r.Direction.X * this.Max.Y - r.Direction.Y * this.Min.X > 0) ||
        (r.PlueckerCoefficient.X + r.Direction.X * this.Min.Y - r.Direction.Y * this.Max.X < 0) ||
        (r.PlueckerCoefficient.Y + r.Direction.X * this.Min.Z - r.Direction.Z * this.Max.X < 0) ||
        (r.PlueckerCoefficient.Y + r.Direction.X * this.Max.Z - r.Direction.Z * this.Min.X > 0) ||
        (r.PlueckerCoefficient.Z - r.Direction.Z * this.Min.Y + r.Direction.Y * this.Max.Z > 0) ||
        (r.PlueckerCoefficient.Z - r.Direction.Z * this.Max.Y + r.Direction.Y * this.Min.Z < 0));

This should give the same result, right? It seemed so, as it returns the same results as the negated version with a couple of test cases. However, in the benchmark, it was 5x faster (50ms to do 10 million intersects)! I'm sure it wasn't being optimized out, my benchmark looks like this:

for (int i = 0; i < 10000000; i++)
{
  if (!box.Intersect_3(ref ray))
  {
    throw new Exception();
  }
}

What can explain this huge difference? I'm running .NET 4.0 on x86.

Solution

Your second code doesn't do the same thing as your first.

In addition to the changes you already made, you need to turn all your ORs into ANDs. (See De Morgan's Laws.)

I'll bet that after you make the fix, your two versions will run at the same speed.

OTHER TIPS

Specifically related to performance, I bet the return statement is short-circuiting sooner in the second case than the first. It may be worth trying to change the order of the comparisons if some are more likely than others to be true. If you change the calculations to be equivalent using && instead of || in the second case, then you would want the ones that are most likely to be false to come first.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow