Unsafe pointer iteration and Bitmap - why is UInt64 faster?
25-10-2019
Question
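I have been doing some unsafe bitmap operations and found that incrementing the pointer fewer times can lead to big performance improvements. I am not sure why: even though the loop body does many more bitwise operations, doing fewer iterations over the pointer is still faster.
So, for example, instead of iterating over 32-bit pixels with a UInt32, iterate over two pixels at a time with a UInt64 and do twice the work in one cycle.
The following method does this by reading two pixels and modifying them (of course it will fail on images with an odd width, but it is just for testing).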
    // Requires: using System; using System.Diagnostics; using System.Drawing;
    // using System.Drawing.Imaging; using System.Windows.Forms; compile with /unsafe.
    private void removeBlueWithTwoPixelIteration()
    {
        // think of a big image with data
        Bitmap bmp = new Bitmap(15000, 15000, System.Drawing.Imaging.PixelFormat.Format32bppArgb);
        TimeSpan startTime, endTime;

        unsafe {
            UInt64 doublePixel;
            UInt32 pixel1;
            UInt32 pixel2;

            const int readSize = sizeof(UInt64);
            const UInt64 rightHalf = UInt32.MaxValue;

            PerformanceCounter pf = new PerformanceCounter("System", "System Up Time"); pf.NextValue();

            BitmapData bd = bmp.LockBits(new Rectangle(0, 0, bmp.Width, bmp.Height), System.Drawing.Imaging.ImageLockMode.ReadWrite, bmp.PixelFormat);
            byte* image = (byte*)bd.Scan0.ToPointer();

            startTime = TimeSpan.FromSeconds(pf.NextValue());

            for (byte* line = image; line < image + bd.Stride * bd.Height; line += bd.Stride)
            {
                for (var pointer = line; pointer < line + bd.Stride; pointer += readSize)
                {
                    doublePixel = *((UInt64*)pointer);
                    pixel1 = (UInt32)(doublePixel >> (readSize * 8 / 2)) >> 8; // high half: lose last 8 bits (Blue color)
                    pixel2 = (UInt32)(doublePixel & rightHalf) >> 8; // low half: lose last 8 bits (Blue color)
                    // On little-endian x86/x64 the low half is the first pixel in memory,
                    // so it is written back first; otherwise the two pixels would swap places.
                    *((UInt32*)pointer) = pixel2 << 8; // put back, shifted so A R G return to their original positions
                    *((UInt32*)pointer + 1) = pixel1 << 8; // put back, shifted so A R G return to their original positions
                }
            }

            endTime = TimeSpan.FromSeconds(pf.NextValue());

            bmp.UnlockBits(bd);
            bmp.Dispose();
        }

        MessageBox.Show((endTime - startTime).TotalMilliseconds.ToString());
    }
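A note on the 64-bit read in the method above: which half of the value holds the first pixel in memory depends on the machine's byte order, so the halves have to be written back in the matching order or neighbouring pixels swap places. The sketch below is only an illustration. `PixelPackingDemo` and `Split` are made-up names, and it uses `BitConverter` on a byte array rather than a locked bitmap.

```csharp
using System;

static class PixelPackingDemo
{
    // Reads the first 8 bytes as one 64-bit value and returns the two
    // 32-bit pixels in memory order (first = lower address).
    public static void Split(byte[] bytes, out uint first, out uint second)
    {
        ulong doublePixel = BitConverter.ToUInt64(bytes, 0);
        if (BitConverter.IsLittleEndian)
        {
            first = (uint)(doublePixel & uint.MaxValue); // low half = first pixel
            second = (uint)(doublePixel >> 32);          // high half = second pixel
        }
        else
        {
            first = (uint)(doublePixel >> 32);
            second = (uint)(doublePixel & uint.MaxValue);
        }
    }

    static void Main()
    {
        // Two pixels laid out back to back, as they would be in a bitmap row.
        byte[] bytes = new byte[8];
        BitConverter.GetBytes(0xAA112233u).CopyTo(bytes, 0);
        BitConverter.GetBytes(0xBB445566u).CopyTo(bytes, 4);

        uint first, second;
        Split(bytes, out first, out second);
        Console.WriteLine("{0:X8} {1:X8}", first, second); // prints "AA112233 BB445566"
    }
}
```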
The following method, which goes pixel by pixel, is about 70% slower than the two-pixel method:
    private void removeBlueWithSinglePixelIteration()
    {
        // think of a big image with data
        Bitmap bmp = new Bitmap(15000, 15000, System.Drawing.Imaging.PixelFormat.Format32bppArgb);
        TimeSpan startTime, endTime;

        unsafe
        {
            UInt32 singlePixel;

            const int readSize = sizeof(UInt32);

            PerformanceCounter pf = new PerformanceCounter("System", "System Up Time"); pf.NextValue();

            BitmapData bd = bmp.LockBits(new Rectangle(0, 0, bmp.Width, bmp.Height), System.Drawing.Imaging.ImageLockMode.ReadWrite, bmp.PixelFormat);
            byte* image = (byte*)bd.Scan0.ToPointer();

            startTime = TimeSpan.FromSeconds(pf.NextValue());

            for (byte* line = image; line < image + bd.Stride * bd.Height; line += bd.Stride)
            {
                for (var pointer = line; pointer < line + bd.Stride; pointer += readSize)
                {
                    singlePixel = *((UInt32*)pointer) >> 8; // lose B
                    *((UInt32*)pointer) = singlePixel << 8; // shift A R G back into place
                }
            }

            endTime = TimeSpan.FromSeconds(pf.NextValue());

            bmp.UnlockBits(bd);
            bmp.Dispose();
        }

        MessageBox.Show((endTime - startTime).TotalMilliseconds.ToString());
    }
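Can someone clarify why incrementing the pointer is more expensive than doing a few bitwise operations?
I am using the .NET 4 framework.
Could something like this be true for C++ too?
NB. 32-bit vs 64-bit: the ratio between the two methods is the same on both, but both methods are about 20% slower in 64-bit than in 32-bit?
EDIT: As suggested by Porges and arul, this might be because of the reduced number of memory reads and the lower branching overhead.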
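One way to separate the two suspected effects would be to keep 32-bit reads and writes but handle two pixels per loop iteration: the number of memory accesses stays the same while the number of loop branches is halved. A minimal sketch, assuming a row of 32-bit pixels; `UnrolledSketch` and its methods are hypothetical names, not part of the code in the question, and the block needs to be compiled with unsafe code enabled.

```csharp
using System;

static class UnrolledSketch
{
    const uint ClearBlue = 0xFFFFFF00u;

    // Same number of 32-bit reads and writes as the single-pixel loop,
    // but only half as many loop-condition checks.
    public static unsafe void RemoveBlueUnrolled(uint* row, int widthInPixels)
    {
        uint* p = row;
        uint* end = row + widthInPixels;
        for (; end - p >= 2; p += 2)
        {
            p[0] &= ClearBlue;
            p[1] &= ClearBlue;
        }
        if (p < end)
        {
            *p &= ClearBlue; // leftover pixel when the width is odd
        }
    }

    // Safe wrapper so the sketch can be tried on a plain array.
    public static unsafe void RemoveBlueUnrolled(uint[] row)
    {
        fixed (uint* p = row)
        {
            RemoveBlueUnrolled(p, row.Length);
        }
    }

    static void Main()
    {
        uint[] row = { 0x11223344u, 0xAABBCCDDu, 0xFFFFFFFFu };
        RemoveBlueUnrolled(row);
        Console.WriteLine("{0:X8} {1:X8} {2:X8}", row[0], row[1], row[2]);
        // prints "11223300 AABBCC00 FFFFFF00"
    }
}
```

Timing this variant against the single-pixel and UInt64 versions would indicate how much of the difference comes from loop overhead and how much from the width of the memory accesses.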
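EDIT2:
After some testing, it seems that reading from memory fewer times is the answer:
With this code, assuming the image width is divisible by 5, you get a 400% speedup: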
    [StructLayout(LayoutKind.Sequential, Pack = 1)]
    struct PixelContainer {
        public UInt32 pixel1;
        public UInt32 pixel2;
        public UInt32 pixel3;
        public UInt32 pixel4;
        public UInt32 pixel5;
    }
Then use this:
    int readSize = sizeof(PixelContainer);

    // .....

    for (var pointer = line; pointer < line + bd.Stride; pointer += readSize)
    {
        multiPixel = *((PixelContainer*)pointer); // copy five pixels out in one go
        multiPixel.pixel1 &= 0xFFFFFF00u;
        multiPixel.pixel2 &= 0xFFFFFF00u;
        multiPixel.pixel3 &= 0xFFFFFF00u;
        multiPixel.pixel4 &= 0xFFFFFF00u;
        multiPixel.pixel5 &= 0xFFFFFF00u;
        *((PixelContainer*)pointer) = multiPixel; // copy all five back
    }
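The five-pixel loop above only works when the width is a multiple of 5. A possible way around that, sketched here under the assumption of a row of 32-bit pixels, is to process the bulk of the row in blocks and finish the remainder one pixel at a time. `FivePixels` and `TailHandlingSketch` are hypothetical names that mirror the struct from the question, and the block needs to be compiled with unsafe code enabled.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct FivePixels
{
    public uint P1, P2, P3, P4, P5;
}

static class TailHandlingSketch
{
    const uint ClearBlue = 0xFFFFFF00u;

    public static unsafe void RemoveBlue(uint* row, int widthInPixels)
    {
        uint* pixel = row;
        uint* end = row + widthInPixels;

        // Bulk: copy five pixels out, mask them, copy them back.
        while (end - pixel >= 5)
        {
            FivePixels block = *(FivePixels*)pixel;
            block.P1 &= ClearBlue;
            block.P2 &= ClearBlue;
            block.P3 &= ClearBlue;
            block.P4 &= ClearBlue;
            block.P5 &= ClearBlue;
            *(FivePixels*)pixel = block;
            pixel += 5;
        }

        // Tail: up to four pixels that do not fill a whole block.
        while (pixel < end)
        {
            *pixel &= ClearBlue;
            pixel++;
        }
    }

    // Safe wrapper so the sketch can be tried on a plain array.
    public static unsafe void RemoveBlue(uint[] row)
    {
        fixed (uint* p = row)
        {
            RemoveBlue(p, row.Length);
        }
    }

    static void Main()
    {
        uint[] row = new uint[7]; // 7 is deliberately not a multiple of 5
        for (int i = 0; i < row.Length; i++) row[i] = 0xFFFFFFFFu;
        RemoveBlue(row);
        Console.WriteLine(string.Join(" ", Array.ConvertAll(row, v => v.ToString("X8"))));
    }
}
```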
Answers
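It is not incrementing the pointer that is slow, but reading from memory. With 32-bit units, you are doing twice as many reads.
You should find it gets faster again if you write once instead of twice in the 64-bit version.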
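As a rough illustration of the single-write suggestion, both pixels can be masked inside the 64-bit value and stored with one 64-bit write. This is a sketch, not the asker's code: `SingleWriteSketch` and `RemoveBlue` are hypothetical names, it assumes a little-endian machine with blue in the low byte of each 32-bit pixel (as in `Format32bppArgb`), and it needs to be compiled with unsafe code enabled.

```csharp
using System;

static class SingleWriteSketch
{
    // Clears the low byte of each 32-bit half of a 64-bit value.
    const ulong ClearBlueMask = 0xFFFFFF00FFFFFF00UL;

    // 'line' points at a row of 32-bit pixels; byteCount must be a multiple of 8.
    public static unsafe void RemoveBlue(byte* line, int byteCount)
    {
        for (byte* p = line; p < line + byteCount; p += sizeof(ulong))
        {
            *(ulong*)p &= ClearBlueMask; // one 64-bit read and one 64-bit write per two pixels
        }
    }

    // Safe wrapper so the sketch can be tried on a plain array.
    public static unsafe void RemoveBlue(byte[] row)
    {
        if (row.Length % sizeof(ulong) != 0)
            throw new ArgumentException("Row length must be a multiple of 8 bytes.");
        fixed (byte* p = row)
        {
            RemoveBlue(p, row.Length);
        }
    }

    static void Main()
    {
        // Two pixels in memory order B, G, R, A.
        byte[] row = { 0x11, 0x22, 0x33, 0xFF, 0x44, 0x55, 0x66, 0xFF };
        RemoveBlue(row);
        Console.WriteLine(BitConverter.ToString(row)); // little-endian: "00-22-33-FF-00-55-66-FF"
    }
}
```

Because the mask is the same for both halves, this version also avoids the question of which half holds which pixel.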