Unsafe pointer iteration and Bitmap - why is UInt64 faster?

https://stackoverflow.com/questions/5812990

25-10-2019

Question

I have been doing some unsafe bitmap operations and found that incrementing the pointer fewer times can lead to big performance improvements. I am not sure why: even though the loop body does many more bitwise operations, doing fewer iterations over the pointer is still faster.

So, for example, instead of iterating over 32-bit pixels with a UInt32, iterate over two pixels at a time with a UInt64 and do twice the operations per iteration.

The following does it by reading two pixels and modifying them (of course it will fail on images with an odd width, but it is just for testing).

    private void removeBlueWithTwoPixelIteration()
    {
        // think of a big image with data
        Bitmap bmp = new Bitmap(15000, 15000, System.Drawing.Imaging.PixelFormat.Format32bppArgb);
        TimeSpan startTime, endTime;

        unsafe {

            UInt64 doublePixel;
            UInt32 pixel1;
            UInt32 pixel2;

            const int readSize = sizeof(UInt64);
            const UInt64 rightHalf = UInt32.MaxValue;

            PerformanceCounter pf = new PerformanceCounter("System", "System Up Time"); pf.NextValue();

            BitmapData bd = bmp.LockBits(new Rectangle(0, 0, bmp.Width, bmp.Height), System.Drawing.Imaging.ImageLockMode.ReadWrite, bmp.PixelFormat);
            byte* image = (byte*)bd.Scan0.ToPointer();

            startTime = TimeSpan.FromSeconds(pf.NextValue());

            for (byte* line = image; line < image + bd.Stride * bd.Height; line += bd.Stride)
            {
                for (var pointer = line; pointer < line + bd.Stride; pointer += readSize)
                {
                    doublePixel = *((UInt64*)pointer);
                    pixel1 = (UInt32)(doublePixel & rightHalf) >> 8; // low half = first pixel in memory (little-endian); drop the low 8 bits (blue)
                    pixel2 = (UInt32)(doublePixel >> (readSize * 8 / 2)) >> 8; // high half = second pixel; drop the low 8 bits (blue)
                    *((UInt32*)pointer) = pixel1 << 8; // put back, shifted so A, R, G return to their original positions
                    *((UInt32*)pointer + 1) = pixel2 << 8; // same for the second pixel
                }
            }

            endTime = TimeSpan.FromSeconds(pf.NextValue());

            bmp.UnlockBits(bd);
            bmp.Dispose();

        }

        MessageBox.Show((endTime - startTime).TotalMilliseconds.ToString());

    }

The following code does it pixel by pixel and is around 70% slower than the previous one:

    private void removeBlueWithSinglePixelIteration()
    {
        // think of a big image with data
        Bitmap bmp = new Bitmap(15000, 15000, System.Drawing.Imaging.PixelFormat.Format32bppArgb);
        TimeSpan startTime, endTime;

        unsafe
        {

            UInt32 singlePixel;

            const int readSize = sizeof(UInt32);

            PerformanceCounter pf = new PerformanceCounter("System", "System Up Time"); pf.NextValue();

            BitmapData bd = bmp.LockBits(new Rectangle(0, 0, bmp.Width, bmp.Height), System.Drawing.Imaging.ImageLockMode.ReadWrite, bmp.PixelFormat);
            byte* image = (byte*)bd.Scan0.ToPointer();

            startTime = TimeSpan.FromSeconds(pf.NextValue());

            for (byte* line = image; line < image + bd.Stride * bd.Height; line += bd.Stride)
            {
                for (var pointer = line; pointer < line + bd.Stride; pointer += readSize)
                {
                    singlePixel = *((UInt32*)pointer) >> 8; // lose B
                    *((UInt32*)pointer) = singlePixel << 8; // shift A, R, G back
                }
            }

            endTime = TimeSpan.FromSeconds(pf.NextValue());

            bmp.UnlockBits(bd);
            bmp.Dispose();

        }

        MessageBox.Show((endTime - startTime).TotalMilliseconds.ToString());
    }

Could someone clarify why incrementing the pointer is a more costly operation than doing a few bitwise operations?

I am using .NET Framework 4.

Could something like this be true for C++ too?

N.B. The ratio between the two methods is the same in 32-bit and 64-bit builds; curiously, though, both versions are about 20% slower in 64-bit than in 32-bit.

EDIT: As suggested by Porges and arul, this could be due to the reduced number of memory reads and to less branching overhead.

Edit2:

After some testing, it seems that reading from memory fewer times is the answer:

With this code, assuming the image width is divisible by 5, it runs about 400% faster:

    [StructLayout(LayoutKind.Sequential,Pack = 1)]
    struct PixelContainer {
        public UInt32 pixel1;
        public UInt32 pixel2;
        public UInt32 pixel3;
        public UInt32 pixel4;
        public UInt32 pixel5;
    }

Then use this:

            int readSize = sizeof(PixelContainer);

            // .....

            for (var pointer = line; pointer < line + bd.Stride; pointer += readSize)
            {
                multiPixel = *((PixelContainer*)pointer);
                multiPixel.pixel1 &= 0xFFFFFF00u;
                multiPixel.pixel2 &= 0xFFFFFF00u;
                multiPixel.pixel3 &= 0xFFFFFF00u;
                multiPixel.pixel4 &= 0xFFFFFF00u;
                multiPixel.pixel5 &= 0xFFFFFF00u;
                *((PixelContainer*)pointer) = multiPixel;
            }

Solution

This is a technique known as loop unrolling. The main performance benefit should come from reducing the branching overhead.

As a side note, you could speed it up a bit by using a bitmask:
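To make the idea concrete, here is a minimal sketch of the same blue-clearing loop written rolled and unrolled by a factor of two. It works on a plain `uint[]` rather than locked bitmap memory so that it runs without `unsafe`; the class, method, and constant names are hypothetical and not taken from the question.

```csharp
using System;

static class UnrollSketch
{
    // Clears the low byte (blue) of a pixel stored as 0xAARRGGBB.
    const uint KeepArg = 0xFFFFFF00u;

    // Rolled: one pixel per iteration, so one loop test and branch per pixel.
    public static void ClearBlueRolled(uint[] pixels)
    {
        for (int i = 0; i < pixels.Length; i++)
            pixels[i] &= KeepArg;
    }

    // Unrolled by two: two pixels per iteration, so half as many loop tests and branches.
    public static void ClearBlueUnrolled(uint[] pixels)
    {
        int i = 0;
        int lastPair = pixels.Length - 1;
        for (; i < lastPair; i += 2)
        {
            pixels[i] &= KeepArg;
            pixels[i + 1] &= KeepArg;
        }
        // Handle the leftover pixel when the length is odd.
        if (i < pixels.Length)
            pixels[i] &= KeepArg;
    }

    static void Main()
    {
        uint[] a = { 0xFF112233u, 0x80AABBCCu, 0x01020304u };
        uint[] b = (uint[])a.Clone();
        ClearBlueRolled(a);
        ClearBlueUnrolled(b);
        // Both arrays should now hold the same pixels with blue set to zero.
        Console.WriteLine(string.Join(" ", Array.ConvertAll(b, p => p.ToString("X8"))));
    }
}
```

Both methods produce the same pixels; the unrolled one simply evaluates the loop condition half as often. Whether that is measurable depends on the JIT and the hardware, so it is worth timing both on your own data.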

    *((UInt64 *)pointer) &= 0xFFFFFF00FFFFFF00ul;
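As a quick check that the mask does the same work as the shifts in the question, the sketch below compares the two on plain `ulong` values. The class and method names are hypothetical; only the mask constant comes from the line above.

```csharp
using System;

static class MaskSketch
{
    // Shift-based approach: split into two pixels, shift blue out and back, recombine.
    public static ulong ClearBlueByShifting(ulong twoPixels)
    {
        uint low = (uint)(twoPixels & uint.MaxValue) >> 8 << 8;
        uint high = (uint)(twoPixels >> 32) >> 8 << 8;
        return ((ulong)high << 32) | low;
    }

    // Mask-based approach: a single AND clears the blue byte of both pixels.
    public static ulong ClearBlueByMask(ulong twoPixels)
    {
        return twoPixels & 0xFFFFFF00FFFFFF00ul;
    }

    static void Main()
    {
        ulong sample = 0x80AABBCCFF112233ul;
        Console.WriteLine(ClearBlueByShifting(sample).ToString("X16"));
        Console.WriteLine(ClearBlueByMask(sample).ToString("X16"));
    }
}
```

The two methods return the same value for any input, since shifting right and then left by 8 bits is the same as clearing the low 8 bits of each 32-bit half.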

Other tips

It is not the pointer increment that is slower, but the reads from memory. With 32-bit units, you are doing twice as many reads.

You should find it faster again if you write once instead of twice in the 64-bit version.
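A minimal sketch of what "write once" could look like: each iteration does one 64-bit read and one 64-bit write for two pixels. It assumes a little-endian machine, a byte count that is a multiple of 8, and compilation with unsafe code enabled; the class and method names are hypothetical.

```csharp
using System;

static class SingleWriteSketch
{
    // Clears blue in a buffer of 32-bit BGRA pixels, two pixels per iteration.
    // Assumes byteCount is a multiple of 8 (an even number of pixels).
    public static unsafe void ClearBlue(byte* start, int byteCount)
    {
        ulong* end = (ulong*)(start + byteCount);
        for (ulong* p = (ulong*)start; p < end; p++)
        {
            // One 64-bit load, one AND, one 64-bit store.
            *p &= 0xFFFFFF00FFFFFF00ul;
        }
    }

    static unsafe void Main()
    {
        // Four pixels in memory order B, G, R, A.
        byte[] buffer = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 };
        fixed (byte* p = buffer)
        {
            ClearBlue(p, buffer.Length);
        }
        // The first byte of every 4-byte pixel (blue) should now be zero.
        Console.WriteLine(string.Join(",", buffer));
    }
}
```

With a locked bitmap, the same method could be called once per scan line, passing the line pointer and the number of pixel bytes in that line.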

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow