Is there a good radix sort implementation for floats in C#?

https://stackoverflow.com/questions/2685035

30-09-2019

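Question

I have a data structure with a field of type float. A collection of these structures needs to be sorted by the value of the float. Is there a radix sort implementation for this?

If there isn't, is there a fast way to access the exponent, the sign and the mantissa? If you sort the floats first on the mantissa, then on the exponent, and finally on the sign, you sort floats in O(n).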

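Solution

Update:

I was quite interested in this topic, so I sat down and implemented it (using this very fast and memory-conservative implementation). I also read this one (thanks celion) and found out that you don't even have to split the floats into mantissa and exponent to sort them. You just take the bits one-to-one and perform an int sort. The only thing you have to take care of is the negative values, which must be placed in reverse order in front of the positive ones at the end of the algorithm (I did that in one step with the last iteration of the algorithm to save some CPU time).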
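The claim that the raw bits can be compared as integers can be checked with a small sketch. The class and method names here are illustrative and not part of the answer; the conversion is the same BitConverter round trip the sort uses.

```csharp
using System;

public static class FloatBitsDemo
{
    // Reinterpret the 32 bits of a float as a signed int,
    // using the same BitConverter round trip as the sort.
    public static int Bits(float f)
    {
        return BitConverter.ToInt32(BitConverter.GetBytes(f), 0);
    }

    // True when comparing the raw bits as ints gives the same
    // result as comparing the floats themselves.
    public static bool SameOrder(float x, float y)
    {
        return (x < y) == (Bits(x) < Bits(y));
    }
}
```

For two non-negative values such as 1.5f and 2.0f, SameOrder returns true. For two negative values such as -2.0f and -1.5f it returns false, because a larger magnitude gives a larger bit pattern. That is why the negatives have to be written in reverse order in the last pass.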

So here is my float radix sort:

public static float[] RadixSort(this float[] array)
{
    // temporary array and the array of converted floats to ints
    int[] t = new int[array.Length];
    int[] a = new int[array.Length];
    for (int i = 0; i < array.Length; i++)
        a[i] = BitConverter.ToInt32(BitConverter.GetBytes(array[i]), 0);

    // set the group length to 1, 2, 4, 8 or 16
    // and see which one is quicker
    int groupLength = 4;
    int bitLength = 32;

    // counting and prefix arrays
    // (dimension is 2^r, the number of possible values of a r-bit number) 
    int[] count = new int[1 << groupLength];
    int[] pref = new int[1 << groupLength];
    int groups = bitLength / groupLength;
    int mask = (1 << groupLength) - 1;
    int negatives = 0, positives = 0;

    for (int c = 0, shift = 0; c < groups; c++, shift += groupLength)
    {
        // reset count array 
        for (int j = 0; j < count.Length; j++)
            count[j] = 0;

        // counting elements of the c-th group 
        for (int i = 0; i < a.Length; i++)
        {
            count[(a[i] >> shift) & mask]++;

            // additionally count all negative 
            // values in first round
            if (c == 0 && a[i] < 0)
                negatives++;
        }
        if (c == 0) positives = a.Length - negatives;

        // calculating prefixes
        pref[0] = 0;
        for (int i = 1; i < count.Length; i++)
            pref[i] = pref[i - 1] + count[i - 1];

        // from a[] to t[] elements ordered by c-th group 
        for (int i = 0; i < a.Length; i++){
            // Get the right index to sort the number in
            int index = pref[(a[i] >> shift) & mask]++;

            if (c == groups - 1)
            {
                // We're in the last (most significant) group, if the
                // number is negative, order them inversely in front
                // of the array, pushing positive ones back.
                if (a[i] < 0)
                    index = positives - (index - negatives) - 1;
                else
                    index += negatives;
            }
            t[index] = a[i];
        }

        // a[]=t[] and start again until the last group 
        t.CopyTo(a, 0);
    }

    // Convert back the ints to the float array
    float[] ret = new float[a.Length];
    for (int i = 0; i < a.Length; i++)
        ret[i] = BitConverter.ToSingle(BitConverter.GetBytes(a[i]), 0);

    return ret;
}

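It is slightly slower than an int radix sort because of the array copying at the beginning and end of the function, where the floats are copied bitwise to ints and back. The whole function is nevertheless still O(n). In any case it is much faster than sorting three times in a row as you proposed. I don't see much room for optimization anymore, but if anyone does, feel free to tell me.

To sort in descending order, change this line at the very end: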

ret[i] = BitConverter.ToSingle(BitConverter.GetBytes(a[i]), 0);

to this:

ret[a.Length - i - 1] = BitConverter.ToSingle(BitConverter.GetBytes(a[i]), 0);

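Measurement:

I set up a short test containing all the special cases of floats (NaN, +/-Inf, Min/Max value, 0) and random numbers. It sorts in exactly the same order as LINQ or Array.Sort sort floats: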

NaN -> -Inf -> Min -> Negative Nums -> 0 -> Positive Nums -> Max -> +Inf

So I ran a test with a huge array of 10M numbers:

float[] test = new float[10000000];
Random rnd = new Random();
for (int i = 0; i < test.Length; i++)
{
    byte[] buffer = new byte[4];
    rnd.NextBytes(buffer);
    float rndfloat = BitConverter.ToSingle(buffer, 0);
    switch(i){
        case 0: { test[i] = float.MaxValue; break; }
        case 1: { test[i] = float.MinValue; break; }
        case 2: { test[i] = float.NaN; break; }
        case 3: { test[i] = float.NegativeInfinity; break; }
        case 4: { test[i] = float.PositiveInfinity; break; }
        case 5: { test[i] = 0f; break; }
        default: { test[i] = rndfloat; break; }
    }
}

And I measured the time taken by the different sorting algorithms:

Stopwatch sw = new Stopwatch();
sw.Start();

float[] sorted1 = test.RadixSort();

sw.Stop();
Console.WriteLine(string.Format("RadixSort: {0}", sw.Elapsed));
sw.Reset();
sw.Start();

float[] sorted2 = test.OrderBy(x => x).ToArray();

sw.Stop();
Console.WriteLine(string.Format("Linq OrderBy: {0}", sw.Elapsed));
sw.Reset();
sw.Start();

Array.Sort(test);
float[] sorted3 = test;

sw.Stop();
Console.WriteLine(string.Format("Array.Sort: {0}", sw.Elapsed));

The output was (update: now run with a release build, not debug):

RadixSort: 00:00:03.9902332
Linq OrderBy: 00:00:17.4983272
Array.Sort: 00:00:03.1536785

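That is a bit more than four times as fast as LINQ, which is not bad. It is still not as fast as Array.Sort, but not much worse either. What really surprised me was this: I expected it to be slightly slower than LINQ on very small arrays, but then I ran a test with just 20 elements: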

RadixSort: 00:00:00.0012944
Linq OrderBy: 00:00:00.0072271
Array.Sort: 00:00:00.0002979

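Even this time my radix sort is quicker than LINQ, but way slower than Array.Sort. :)

Update 2:

I made some more measurements and found out some interesting things: a longer group length means fewer iterations and more memory usage. If you use a group length of 16 bits (only 2 iterations), you have a huge memory overhead when sorting small arrays, but you can beat Array.Sort for arrays larger than about 100k elements, even if not by very much. Both chart axes are logarithmic: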
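The trade-off can be made concrete with a little arithmetic. This sketch uses hypothetical helper names and only counts the two fixed-size int tables (count and pref), not the temporary array, which grows with the input.

```csharp
using System;

public static class GroupLengthCost
{
    // Number of passes over the data for 32-bit keys.
    public static int Passes(int groupLength)
    {
        return 32 / groupLength;
    }

    // Entries in each of the count and prefix tables (2^groupLength).
    public static int Buckets(int groupLength)
    {
        return 1 << groupLength;
    }

    // Bytes used by the two int tables (count and pref) together.
    public static int TableBytes(int groupLength)
    {
        return 2 * Buckets(groupLength) * sizeof(int);
    }
}
```

With a group length of 4 this gives 8 passes and 16 buckets (128 bytes of tables), with 8 it gives 4 passes and 256 buckets (2,048 bytes), and with 16 it gives 2 passes and 65,536 buckets (524,288 bytes). Every pass also resets and prefix-sums the whole table, so for a 20-element array the table work far outweighs the actual sorting.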

(Chart not reproduced. Source: daubmeier.de)

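Other tips

There is a nice explanation of how to perform a radix sort on floats here: http://www.codercorner.com/radixsortrevisited.htm

If all your values are positive, you can get away with using the binary representation directly. The link explains how to handle negative values.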
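One common way to handle negatives, which may differ from the approach in the linked article, is to transform each float into an unsigned key before sorting and to reverse the transform afterwards. The following is a minimal sketch with illustrative names, assuming the IEEE 754 single-precision layout that .NET uses for float.

```csharp
using System;

public static class FloatSortKey
{
    // Map a float to a uint whose unsigned order matches the float order.
    public static uint ToKey(float f)
    {
        uint bits = BitConverter.ToUInt32(BitConverter.GetBytes(f), 0);
        // Negative: invert every bit so larger magnitudes become smaller keys.
        // Non-negative: set the sign bit so these keys sort above all negatives.
        return (bits & 0x80000000u) != 0 ? ~bits : bits | 0x80000000u;
    }

    // Inverse of ToKey: recover the original float from its key.
    public static float FromKey(uint key)
    {
        uint bits = (key & 0x80000000u) != 0 ? key & 0x7FFFFFFFu : ~key;
        return BitConverter.ToSingle(BitConverter.GetBytes(bits), 0);
    }
}
```

With keys like these, any plain unsigned radix sort yields the floats in ascending order, at the cost of one extra transform per element before and after the sort.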

You can use an unsafe block to memcpy or alias a float * to a uint * to extract the bits.
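If unsafe code is not an option, the same bits can be read through BitConverter, at the cost of a small allocation per call. This sketch uses illustrative names and assumes the IEEE 754 single-precision layout: 1 sign bit, 8 exponent bits and 23 mantissa bits.

```csharp
using System;

public static class FloatFields
{
    // Raw IEEE 754 single-precision bits, obtained without unsafe code.
    public static uint Bits(float f)
    {
        return BitConverter.ToUInt32(BitConverter.GetBytes(f), 0);
    }

    // Sign: the top bit, 1 for negative values.
    public static uint Sign(float f)
    {
        return Bits(f) >> 31;
    }

    // Exponent: the next 8 bits, stored with a bias of 127.
    public static uint Exponent(float f)
    {
        return (Bits(f) >> 23) & 0xFFu;
    }

    // Mantissa: the low 23 bits; the implicit leading 1 is not stored.
    public static uint Mantissa(float f)
    {
        return Bits(f) & 0x7FFFFFu;
    }
}
```

For -1.5f, for example, the sign is 1, the biased exponent is 127 and the mantissa is 0x400000.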

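By doing some fancy casting and swapping the arrays instead of copying them, this version is faster for 10M numbers than Philip Daubmeier's original with groupLength set to 8. It is 3 times faster than Array.Sort for that array size.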

 static public void RadixSortFloat(this float[] array, int arrayLen = -1)
        {
            // Some use cases have an array that is longer than the filled part we want to sort
            if (arrayLen < 0) arrayLen = array.Length;
            // Reinterpret the original float array as ints (no copy)
            Span<float> asFloat = array;
            Span<int> a = MemoryMarshal.Cast<float, int>(asFloat);
            // Create a temp array
            Span<int> t = new Span<int>(new int[arrayLen]);

            // set the group length to 1, 2, 4, 8 or 16 and see which one is quicker
            int groupLength = 8;
            int bitLength = 32;

            // counting and prefix arrays
            // (dimension is 2^r, the number of possible values of a r-bit number) 
            var dim = 1 << groupLength;
            int groups = bitLength / groupLength;
            if (groups % 2 != 0) throw new Exception("groups must be even so data is in original array at end");
            var count = new int[dim];
            var pref = new int[dim];
            int mask = (dim) - 1;
            int negatives = 0, positives = 0;

            // counting elements of the 1st group, including the negative/positive split
            for (int i = 0; i < arrayLen; i++)
            {
                if (a[i] < 0) negatives++;
                count[(a[i] >> 0) & mask]++;
            }
            positives = arrayLen - negatives;

            int c;
            int shift;
            for (c = 0, shift = 0; c < groups - 1; c++, shift += groupLength)
            {
                CalcPrefixes();
                var nextShift = shift + groupLength;
                //
                for (var i = 0; i < arrayLen; i++)
                {
                    var ai = a[i];
                    // Get the right index to sort the number in
                    int index = pref[( ai >> shift) & mask]++;
                    count[( ai>> nextShift) & mask]++;
                    t[index] =  ai;
                }

                // swap the arrays and start again until the last group 
                var temp = a;
                a = t;
                t = temp;
            }

            // Last round
            CalcPrefixes();
            for (var i = 0; i < arrayLen; i++)
            {
                var ai = a[i];
                // Get the right index to sort the number in
                int index = pref[( ai >> shift) & mask]++;
                // We're in the last (most significant) group, if the
                // number is negative, order them inversely in front
                // of the array, pushing positive ones back.
                if ( ai < 0) index = positives - (index - negatives) - 1; else index += negatives;
                //
                t[index] =  ai;
            }

            void CalcPrefixes()
            {
                pref[0] = 0;
                for (int i = 1; i < dim; i++)
                {
                    pref[i] = pref[i - 1] + count[i - 1];
                    count[i - 1] = 0;
                }
            }
        }

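I think that if the values aren't too close together and the precision requirement is reasonable, your best bet is to sort using the actual digits of the float before and after the decimal point.

For example, you could sort using only the first 4 decimal places (whether they are 0 or not).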

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow