Question

In the paper Comparison Based Sorting for Systems with Multiple GPUs, the authors describe the selection of a pivot element with respect to the partition on the first GPU (and its mirrored counterpart on the other GPU-partition). That pivot element is crucial for being able to merge the two partitions, given that we have already sorted them on each GPU locally.

However, the pseudo-code for that pivot-selection, as shown in the paper, doesn't seem to reflect the whole truth since when implementing it 1:1, the selected pivot element is off by some elements in some cases, depending on the input - the amount of elements to sort and therefore the amount of elements per partition (the chunk of data that each GPU gets).

To get more specific, the problem is - to my understanding - that the while loop is exited too early due to the stride being reduced down to zero before the correct pivot element has been found. In general, the approach is binary search-like, where the range of where the pivot can fall, is halved each iteration.

Can anyone spot what needs to be done here?

Here is a C++ implementation of the pivot selection:

size_t SelectPivot(const std::vector<int> &a, const std::vector<int> &b)
{
  size_t pivot = a.size() / 2;
  size_t stride = pivot / 2;
  while (stride > 0)
  {
    if (a[a.size() - pivot - 1] < b[pivot])
    {
      if (a[a.size() - pivot - 2] < b[pivot + 1] &&
          a[a.size() - pivot] > b[pivot - 1])
      {
        return pivot;
      }
      else
      {
        pivot = pivot - stride;
      }
    }
    else
    {
      pivot = pivot + stride;
    }
    stride = stride / 2;
  }
  return pivot;
}

P.S.: I tried ceiling the stride in order to not skip iterations when the stride is odd, but this introduced the issue of moving out of bounds of the array and even after handling those cases by clipping to the array bounds, the pivot was not always correct.

Was it helpful?

Solution

The "Algorithm 1 Pivot selection" of that paper is rather sloppy. The critical mistake is, as you have noted, "the while loop in the algorithm is exited too early due to the stride being reduced down to zero before the correct pivot element has been found".

There are two other drawbacks.

  • The condition, a[len(a) − pivot − 2] < b[pivot + 1] is superfluous since it is implied by a[len(a) − pivot − 1] < b[pivot].
  • It does not handle the situation where there are equal elements in the array.

To address that critical mistake, make sure that the stride is always greater than 0. To prevent out-of-bounds exceptions, extreme boundary cases should be specially handled beforehand. To handle equal elements, equal signs can be included at some comparisons.

Here is a correct algorithm in C++ that selects the desired pivot, following the pseudocode in the paper with the aforementioned modifications.

// Return a pivot so that a[size-pivot:size] and b[0:pivot] should
// be swapped. If 0 is returned, no elements should be swapped.
// It is assumed that vector a and b are sorted of the same size.
size_t SelectPivot(const std::vector<int> &a, const std::vector<int> &b) {
  size_t size = a.size();

  if (b[0] >= a[size - 1]) return 0;
  if (a[0] >= b[size - 1]) return size;

  size_t pivot = size / 2;
  size_t stride = std::max(1, pivot / 2);
  while (true) {
      if (a[size - pivot - 1] <= b[pivot]) {
          if (a[size - pivot] > b[pivot - 1]) return pivot;
          else pivot = pivot - stride;
      } else {
          pivot = pivot + stride;
      }

      stride = (stride + 1) / 2;
  }

  return pivot;
}

Note that the method above requires the size of a is equal to that of b. Otherwise, a few more lines of code are needed.

If you are not required to stick to the pseudocode in that paper, which is not very good anyway, please check another version of my answer, where I use my preferred version of binary search to select the desired pivot.

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange
scroll top