¿Hay una buena Ordenamiento Radix-aplicación de flotadores en C #

https://stackoverflow.com/questions/2685035

30-09-2019
|

Pregunta

I tiene una estructura de datos con un campo del tipo flotador. Una colección de estas estructuras tiene que ser resuelto por el valor del flotador. ¿Hay una aplicación radix de clasificación para este.

Si no es así, ¿hay una forma rápida de acceder al exponente, el signo y la mantisa. Porque si ordena los flotadores por primera vez en la mantisa, exponente, y el exponente de la última vez. A clasificar flota en O (n).

Solución

Actualización:

Me estaba bastante interesado en este tema, así que me senté y implementé (usando esta muy rápido y la memoria aplicación conservadora ). También leí éste (gracias celion ) y descubrió que incluso no tiene que dividir los flotadores en mantisa y exponente para solucionar el problema. Sólo tienes que tomar los bits de uno-a-uno y realizar un int tipo. Sólo hay que preocuparse por los valores negativos, que tienen que ser inversamente poner delante de los positivos al final del algoritmo (I hecho de que en un solo paso con la última iteración del algoritmo de ahorrar algo de tiempo de CPU).

Así que aquí está mi flotador Ordenamiento Radix:

public static float[] RadixSort(this float[] array)
{
    // temporary array and the array of converted floats to ints
    int[] t = new int[array.Length];
    int[] a = new int[array.Length];
    for (int i = 0; i < array.Length; i++)
        a[i] = BitConverter.ToInt32(BitConverter.GetBytes(array[i]), 0);

    // set the group length to 1, 2, 4, 8 or 16
    // and see which one is quicker
    int groupLength = 4;
    int bitLength = 32;

    // counting and prefix arrays
    // (dimension is 2^r, the number of possible values of a r-bit number) 
    int[] count = new int[1 << groupLength];
    int[] pref = new int[1 << groupLength];
    int groups = bitLength / groupLength;
    int mask = (1 << groupLength) - 1;
    int negatives = 0, positives = 0;

    for (int c = 0, shift = 0; c < groups; c++, shift += groupLength)
    {
        // reset count array 
        for (int j = 0; j < count.Length; j++)
            count[j] = 0;

        // counting elements of the c-th group 
        for (int i = 0; i < a.Length; i++)
        {
            count[(a[i] >> shift) & mask]++;

            // additionally count all negative 
            // values in first round
            if (c == 0 && a[i] < 0)
                negatives++;
        }
        if (c == 0) positives = a.Length - negatives;

        // calculating prefixes
        pref[0] = 0;
        for (int i = 1; i < count.Length; i++)
            pref[i] = pref[i - 1] + count[i - 1];

        // from a[] to t[] elements ordered by c-th group 
        for (int i = 0; i < a.Length; i++){
            // Get the right index to sort the number in
            int index = pref[(a[i] >> shift) & mask]++;

            if (c == groups - 1)
            {
                // We're in the last (most significant) group, if the
                // number is negative, order them inversely in front
                // of the array, pushing positive ones back.
                if (a[i] < 0)
                    index = positives - (index - negatives) - 1;
                else
                    index += negatives;
            }
            t[index] = a[i];
        }

        // a[]=t[] and start again until the last group 
        t.CopyTo(a, 0);
    }

    // Convert back the ints to the float array
    float[] ret = new float[a.Length];
    for (int i = 0; i < a.Length; i++)
        ret[i] = BitConverter.ToSingle(BitConverter.GetBytes(a[i]), 0);

    return ret;
}

Es ligeramente más lento que un tipo int base, debido a la gran variedad de copiar al principio y al final de la función, donde los flotadores se copian a nivel de bits enteros y espalda. La función de todo, sin embargo, es de nuevo O (n). En cualquier caso, mucho más rápido que la clasificación 3 veces en una fila como usted propuso. No veo mucho espacio para optimizaciones más, pero si alguien hace:. Siento libre para decirme

Para el cambio descendente mostrar esta línea al final:

ret[i] = BitConverter.ToSingle(BitConverter.GetBytes(a[i]), 0);

a esto:

ret[a.Length - i - 1] = BitConverter.ToSingle(BitConverter.GetBytes(a[i]), 0);

medición:

He definido alguna prueba corta, que contiene todos los casos especiales de los flotadores (NaN, +/- Inf, el valor mínimo / máximo, 0) y números aleatorios. Se ordena exactamente el mismo orden que Linq o Array.Sort ordena los flotadores:

NaN -> -Inf -> Min -> Negative Nums -> 0 -> Positive Nums -> Max -> +Inf

Así que me encontré con una prueba con una enorme variedad de números 10M:

float[] test = new float[10000000];
Random rnd = new Random();
for (int i = 0; i < test.Length; i++)
{
    byte[] buffer = new byte[4];
    rnd.NextBytes(buffer);
    float rndfloat = BitConverter.ToSingle(buffer, 0);
    switch(i){
        case 0: { test[i] = float.MaxValue; break; }
        case 1: { test[i] = float.MinValue; break; }
        case 2: { test[i] = float.NaN; break; }
        case 3: { test[i] = float.NegativeInfinity; break; }
        case 4: { test[i] = float.PositiveInfinity; break; }
        case 5: { test[i] = 0f; break; }
        default: { test[i] = test[i] = rndfloat; break; }
    }
}

Y se detuvo el tiempo de los diferentes algoritmos de ordenación:

Stopwatch sw = new Stopwatch();
sw.Start();

float[] sorted1 = test.RadixSort();

sw.Stop();
Console.WriteLine(string.Format("RadixSort: {0}", sw.Elapsed));
sw.Reset();
sw.Start();

float[] sorted2 = test.OrderBy(x => x).ToArray();

sw.Stop();
Console.WriteLine(string.Format("Linq OrderBy: {0}", sw.Elapsed));
sw.Reset();
sw.Start();

Array.Sort(test);
float[] sorted3 = test;

sw.Stop();
Console.WriteLine(string.Format("Array.Sort: {0}", sw.Elapsed));

Y la salida fue ( Actualización: ahora corrió con la liberación de construcción, no de depuración ):

RadixSort: 00:00:03.9902332
Linq OrderBy: 00:00:17.4983272
Array.Sort: 00:00:03.1536785

más o menos más de cuatro veces más rápido que LINQ. Eso no es malo. Pero aún así todavía no es tan rápido como Array.Sort, sino también no mucho peor. Pero me quedé muy sorprendido por éste: Yo esperaba que fuera un poco más lento que LINQ en muy pequeñas matrices. Pero entonces me encontré con una prueba con sólo 20 elementos:

RadixSort: 00:00:00.0012944
Linq OrderBy: 00:00:00.0072271
Array.Sort: 00:00:00.0002979

e incluso esta vez mi Ordenamiento Radix es más rápido que LINQ, pero manera más lento que el conjunto de clasificación. :)

Actualización 2:

Me hizo algunas mediciones más y descubrió algunas cosas interesantes: las constantes de longitud grupo más largos significan menos iteraciones y más uso de la memoria. Si utiliza una longitud grupo de 16 bits (sólo 2 iteraciones), que tiene una enorme sobrecarga de memoria al ordenar arreglos pequeños, pero que puede vencer Array.Sort si se trata de matrices mayores de aproximadamente 100 mil elementos, aunque no mucho. Los ejes gráficos son tanto logarithmized:

_{(fuente: daubmeier.de )}

Otros consejos

Hay una buena explicación de cómo realizar Radix sort en los flotadores aquí: http://www.codercorner.com/RadixSortRevisited.htm

Si todos los valores son positivos, usted puede conseguir lejos con el uso de la representación binaria; el enlace se explica cómo tratar los valores negativos.

Se puede usar un bloque de unsafe a memcpy o alias un float * a un uint * para extraer los bits.

Haciendo un poco de fundición de lujo y el intercambio de matrices en lugar de copiar esta versión es 2 veces más rápido para los números 10M como Philip Daubmeiers original con conjunto grouplength a 8. Es 3 veces más rápido que Array.Sort para que arraysize.

 static public void RadixSortFloat(this float[] array, int arrayLen = -1)
        {
            // Some use cases have an array that is longer as the filled part which we want to sort
            if (arrayLen < 0) arrayLen = array.Length;
            // Cast our original array as long
            Span<float> asFloat = array;
            Span<int> a = MemoryMarshal.Cast<float, int>(asFloat);
            // Create a temp array
            Span<int> t = new Span<int>(new int[arrayLen]);

            // set the group length to 1, 2, 4, 8 or 16 and see which one is quicker
            int groupLength = 8;
            int bitLength = 32;

            // counting and prefix arrays
            // (dimension is 2^r, the number of possible values of a r-bit number) 
            var dim = 1 << groupLength;
            int groups = bitLength / groupLength;
            if (groups % 2 != 0) throw new Exception("groups must be even so data is in original array at end");
            var count = new int[dim];
            var pref = new int[dim];
            int mask = (dim) - 1;
            int negatives = 0, positives = 0;

            // counting elements of the 1st group incuding negative/positive
            for (int i = 0; i < arrayLen; i++)
            {
                if (a[i] < 0) negatives++;
                count[(a[i] >> 0) & mask]++;
            }
            positives = arrayLen - negatives;

            int c;
            int shift;
            for (c = 0, shift = 0; c < groups - 1; c++, shift += groupLength)
            {
                CalcPrefixes();
                var nextShift = shift + groupLength;
                //
                for (var i = 0; i < arrayLen; i++)
                {
                    var ai = a[i];
                    // Get the right index to sort the number in
                    int index = pref[( ai >> shift) & mask]++;
                    count[( ai>> nextShift) & mask]++;
                    t[index] =  ai;
                }

                // swap the arrays and start again until the last group 
                var temp = a;
                a = t;
                t = temp;
            }

            // Last round
            CalcPrefixes();
            for (var i = 0; i < arrayLen; i++)
            {
                var ai = a[i];
                // Get the right index to sort the number in
                int index = pref[( ai >> shift) & mask]++;
                // We're in the last (most significant) group, if the
                // number is negative, order them inversely in front
                // of the array, pushing positive ones back.
                if ( ai < 0) index = positives - (index - negatives) - 1; else index += negatives;
                //
                t[index] =  ai;
            }

            void CalcPrefixes()
            {
                pref[0] = 0;
                for (int i = 1; i < dim; i++)
                {
                    pref[i] = pref[i - 1] + count[i - 1];
                    count[i - 1] = 0;
                }
            }
        }

Creo que la mejor opción si los valores no son demasiado cerca y hay un requisito de precisión razonable, usted puede simplemente utilizar los dígitos flotador reales antes y después del punto decimal para realizar la ordenación.

Por ejemplo, se puede utilizar sólo los primeros 4 decimales (ya sean 0 o no) para hacer la clasificación.

Licenciado bajo: CC-BY-SA con atribución

No afiliado a StackOverflow