Gibt es eine gute RadixSort-Implementierung für Schwimmer in C #

https://stackoverflow.com/questions/2685035

30-09-2019
|

Frage

habe ich eine Datenstruktur mit einem Feld des Schwimmers-Typs. Eine Sammlung dieser Strukturen muss durch den Wert des Schwimmers sortiert werden. Gibt es eine Radix-Sortier Implementierung für diese.

Wenn nicht, gibt es eine schnelle Möglichkeit, die Exponenten für den Zugriff auf das Vorzeichen und die Mantisse. Denn wenn man die Art schwimmt zuerst auf Mantisse, Exponent, und Exponenten der letzten Zeit. Sie schwimmt Art in O (n).

Lösung

Update:

Ich war ganz in diesem Thema interessiert sind, so setzte ich mich hin und setzte es (unter Verwendung von dies sehr schnell und Gedächtnis konservative Umsetzung ). Ich habe auch gelesen diese eine (danke celion ) und fand heraus, dass Sie auch nicht, daß die Schwimmer in Mantisse und Exponent aufgeteilt haben, um es zu sortieren. Sie müssen nur die Bits Eins-zu-Eins-nehmen und eine int Art auszuführen. Sie müssen nur über die negativen Werte kümmern, die umgekehrt vor den positiven am Ende des Algorithmus gesetzt werden muß (ich gemacht, dass in einem Schritt mit der letzten Iteration des Algorithmus einig CPU-Zeit zu sparen).

So herer mein Schwimmer RadixSort:

public static float[] RadixSort(this float[] array)
{
    // temporary array and the array of converted floats to ints
    int[] t = new int[array.Length];
    int[] a = new int[array.Length];
    for (int i = 0; i < array.Length; i++)
        a[i] = BitConverter.ToInt32(BitConverter.GetBytes(array[i]), 0);

    // set the group length to 1, 2, 4, 8 or 16
    // and see which one is quicker
    int groupLength = 4;
    int bitLength = 32;

    // counting and prefix arrays
    // (dimension is 2^r, the number of possible values of a r-bit number) 
    int[] count = new int[1 << groupLength];
    int[] pref = new int[1 << groupLength];
    int groups = bitLength / groupLength;
    int mask = (1 << groupLength) - 1;
    int negatives = 0, positives = 0;

    for (int c = 0, shift = 0; c < groups; c++, shift += groupLength)
    {
        // reset count array 
        for (int j = 0; j < count.Length; j++)
            count[j] = 0;

        // counting elements of the c-th group 
        for (int i = 0; i < a.Length; i++)
        {
            count[(a[i] >> shift) & mask]++;

            // additionally count all negative 
            // values in first round
            if (c == 0 && a[i] < 0)
                negatives++;
        }
        if (c == 0) positives = a.Length - negatives;

        // calculating prefixes
        pref[0] = 0;
        for (int i = 1; i < count.Length; i++)
            pref[i] = pref[i - 1] + count[i - 1];

        // from a[] to t[] elements ordered by c-th group 
        for (int i = 0; i < a.Length; i++){
            // Get the right index to sort the number in
            int index = pref[(a[i] >> shift) & mask]++;

            if (c == groups - 1)
            {
                // We're in the last (most significant) group, if the
                // number is negative, order them inversely in front
                // of the array, pushing positive ones back.
                if (a[i] < 0)
                    index = positives - (index - negatives) - 1;
                else
                    index += negatives;
            }
            t[index] = a[i];
        }

        // a[]=t[] and start again until the last group 
        t.CopyTo(a, 0);
    }

    // Convert back the ints to the float array
    float[] ret = new float[a.Length];
    for (int i = 0; i < a.Length; i++)
        ret[i] = BitConverter.ToSingle(BitConverter.GetBytes(a[i]), 0);

    return ret;
}

Es ist etwas langsamer als ein int Radixsort, wegen der Anordnung am Anfang und Ende der Funktion Kopieren, wo der Schwimmer bitweise kopiert wird ints und zurück. Die gesamte Funktion ist doch wieder O (n). Auf jeden Fall viel schneller als Sortier 3 mal in einer Reihe, wie Sie vorgeschlagen. Ich habe nicht mehr für Optimierungen viel Raum sehen, aber wenn jemand tut. Fühlen Sie sich frei, mir zu sagen

Um sortieren Änderung am Ende diese Zeile absteigend:

ret[i] = BitConverter.ToSingle(BitConverter.GetBytes(a[i]), 0);

folgt aus:

ret[a.Length - i - 1] = BitConverter.ToSingle(BitConverter.GetBytes(a[i]), 0);

Mess:

stelle ich einige Kurztest auf, die alle Sonderfälle von Schwimmern (NaN, +/- Inf, Min / Max-Wert, 0) und Zufallszahlen. Es sortiert genau die gleiche Reihenfolge wie Linq oder Array.Sort Sorten Schwimmern:

NaN -> -Inf -> Min -> Negative Nums -> 0 -> Positive Nums -> Max -> +Inf

Also ich einen Test mit einer riesigen Auswahl an 10M Zahlen lautete:

float[] test = new float[10000000];
Random rnd = new Random();
for (int i = 0; i < test.Length; i++)
{
    byte[] buffer = new byte[4];
    rnd.NextBytes(buffer);
    float rndfloat = BitConverter.ToSingle(buffer, 0);
    switch(i){
        case 0: { test[i] = float.MaxValue; break; }
        case 1: { test[i] = float.MinValue; break; }
        case 2: { test[i] = float.NaN; break; }
        case 3: { test[i] = float.NegativeInfinity; break; }
        case 4: { test[i] = float.PositiveInfinity; break; }
        case 5: { test[i] = 0f; break; }
        default: { test[i] = test[i] = rndfloat; break; }
    }
}

Und stoppte die Zeit der verschiedenen Sortieralgorithmen:

Stopwatch sw = new Stopwatch();
sw.Start();

float[] sorted1 = test.RadixSort();

sw.Stop();
Console.WriteLine(string.Format("RadixSort: {0}", sw.Elapsed));
sw.Reset();
sw.Start();

float[] sorted2 = test.OrderBy(x => x).ToArray();

sw.Stop();
Console.WriteLine(string.Format("Linq OrderBy: {0}", sw.Elapsed));
sw.Reset();
sw.Start();

Array.Sort(test);
float[] sorted3 = test;

sw.Stop();
Console.WriteLine(string.Format("Array.Sort: {0}", sw.Elapsed));

Und die Ausgabe wurde ( Update: jetzt mit Release-Build läuft, nicht debug ):

RadixSort: 00:00:03.9902332
Linq OrderBy: 00:00:17.4983272
Array.Sort: 00:00:03.1536785

etwa mehr als viermal so schnell wie Linq. Das ist nicht schlecht. Aber nach wie vor noch nicht so schnell wie Array.Sort, aber auch nicht so viel schlechter. Aber ich war wirklich überrascht von diesem: Ich erwartete, etwas langsamer zu sein als Linq auf sehr kleine Arrays. Aber dann lief ich einen Test mit nur 20 Elementen:

RadixSort: 00:00:00.0012944
Linq OrderBy: 00:00:00.0072271
Array.Sort: 00:00:00.0002979

und auch dieses Mal mein RadixSort ist schneller als Linq, aber Weg langsamer als Array sortieren. :)

Update 2:

Ich habe einige weitere Messungen und fand heraus, einige interessante Dinge: mehr Gruppenlänge Konstanten bedeuten weniger Iterationen und Speichernutzung. Wenn Sie eine Gruppe Länge von 16 Bit (nur 2 Iterationen) verwenden, haben Sie einen großen Overhead-Speicher, wenn kleine Arrays Sortierung, aber Sie können Array.Sort zu schlagen, wenn es um Arrays kommt größer als etwa 100k Elemente, wenn auch nicht sehr viel. Die Diagramme Achsen sind beide logarithmiert:

_{(Quelle: daubmeier.de )}

Andere Tipps

Es gibt eine schöne Erklärung, wie Art hier auf Schwimmern auszuführen radix: http://www.codercorner.com/RadixSortRevisited.htm

Wenn alle Werte positiv sind, können Sie mit der Verwendung der binären Darstellung weg; der Link erklärt, wie negative Werte zu handhaben.

Sie können einen unsafe Block verwenden, um MEMCPY oder Alias ??ein float * zu einem uint * die Bits zu extrahieren.

Mit etwas Phantasie Casting tun und Arrays tauschen stattdessen diese Version des Kopierens ist 2x schneller für 10M Zahlen wie Philip Daubmeiers Original mit grouplength Satz 8. Es ist 3x schneller als Array.Sort für die Arraysize.

 static public void RadixSortFloat(this float[] array, int arrayLen = -1)
        {
            // Some use cases have an array that is longer as the filled part which we want to sort
            if (arrayLen < 0) arrayLen = array.Length;
            // Cast our original array as long
            Span<float> asFloat = array;
            Span<int> a = MemoryMarshal.Cast<float, int>(asFloat);
            // Create a temp array
            Span<int> t = new Span<int>(new int[arrayLen]);

            // set the group length to 1, 2, 4, 8 or 16 and see which one is quicker
            int groupLength = 8;
            int bitLength = 32;

            // counting and prefix arrays
            // (dimension is 2^r, the number of possible values of a r-bit number) 
            var dim = 1 << groupLength;
            int groups = bitLength / groupLength;
            if (groups % 2 != 0) throw new Exception("groups must be even so data is in original array at end");
            var count = new int[dim];
            var pref = new int[dim];
            int mask = (dim) - 1;
            int negatives = 0, positives = 0;

            // counting elements of the 1st group incuding negative/positive
            for (int i = 0; i < arrayLen; i++)
            {
                if (a[i] < 0) negatives++;
                count[(a[i] >> 0) & mask]++;
            }
            positives = arrayLen - negatives;

            int c;
            int shift;
            for (c = 0, shift = 0; c < groups - 1; c++, shift += groupLength)
            {
                CalcPrefixes();
                var nextShift = shift + groupLength;
                //
                for (var i = 0; i < arrayLen; i++)
                {
                    var ai = a[i];
                    // Get the right index to sort the number in
                    int index = pref[( ai >> shift) & mask]++;
                    count[( ai>> nextShift) & mask]++;
                    t[index] =  ai;
                }

                // swap the arrays and start again until the last group 
                var temp = a;
                a = t;
                t = temp;
            }

            // Last round
            CalcPrefixes();
            for (var i = 0; i < arrayLen; i++)
            {
                var ai = a[i];
                // Get the right index to sort the number in
                int index = pref[( ai >> shift) & mask]++;
                // We're in the last (most significant) group, if the
                // number is negative, order them inversely in front
                // of the array, pushing positive ones back.
                if ( ai < 0) index = positives - (index - negatives) - 1; else index += negatives;
                //
                t[index] =  ai;
            }

            void CalcPrefixes()
            {
                pref[0] = 0;
                for (int i = 1; i < dim; i++)
                {
                    pref[i] = pref[i - 1] + count[i - 1];
                    count[i - 1] = 0;
                }
            }
        }

Ich denke, die beste Wahl, wenn die Werte nicht zu nahe und es gibt eine vernünftige Genauigkeitsanforderung, kann man einfach die tatsächlichen Schwimmer Ziffern verwendet vor und nach dem Komma der Sortierung zu tun.

Zum Beispiel können Sie die ersten 4 Dezimalstellen verwenden (ob 0 oder nicht) die Sortierung zu tun.

Lizenziert unter: CC-BY-SA mit Zuschreibung

Nicht verbunden mit StackOverflow