Pixel modifying code runs quick in main app, really slow in Delphi 6 DirectShow filter with other problems

StackOverflow https://stackoverflow.com/questions/8751210

Question

I have a Delphi 6 application that sends bitmaps to a DirectShow DLL in real-time, 25 frames a second. The DirectShow DLL is my code too and is also written in Delphi 6 using the DSPACK DirectShow component suite. I have a simple block of code that goes through each pixel in the bitmap modifying the brightness and contrast of the image, if a certain flag is set, otherwise the bitmap is pushed out the DirectShow DLL unmodified (push source video filter). The code used to be in the main application and then I just moved it into the DirectShow DLL. When it was in the main application it ran fine. I could see the changes in the bitmap as expected. However, now that the code resides in the DirectShow DLL it has the following problems:

  1. When the code block below is active the DirectShow DLL is really slow. I have a quad core i5 and it's really slow. I can also see a big spike in the CPU consumption. In contrast, the very same code running in the main application ran fine on an old single core P4. It did hit the CPU noticeably on that old machine but the video was smooth and there were no problems. The images are only 352 x 288 pixels in size.

  2. I don't see the expected changes to the visible bitmap. I can trace the code in the DirectShow DLL and see the numerical values of each pixel properly altered by the code, but the viewable image in the Graph Edit ActiveMovie window looks completely unchanged.

  3. If I deactivate the code, which I can do in real-time, the ActiveMovie window shows video that is as smooth as glass, perfectly rendered with the CPU barely touched. If I reactivate the code the video is now really choppy, probably showing only 1 to 2 frames a second with a long delay before the first frame is shown, and the CPU spikes. Not completely, but a lot more than I would expect.

I tried compiling the DirectShow DLL with everything on including range checking, overflow checking, etc. and there were no warnings or errors during run-time. I then tried compiling for fastest speed and it still had the exact same problems listed above. Something is really wrong and I can't figure out what. Note, I do indeed lock the canvas before modifying the bitmap and unlock it after I'm done. If it weren't for the "everything on" compilation run I noted above I'd say it felt like an FPU Exception was being raised and silently swallowed with every pixel computation, but as I said, no errors or Exceptions are occurring.

UPDATE: I am putting this here so that the solution, which is embedded in one of Roman R's comments, is plainly visible. The problem was that I was not setting the PixelFormat property to pf24bit before accessing the ScanLine property. As Roman suggested, not doing this must make the TBitmap code create a temporary copy of the bitmap. As soon as I added the line of code below, the problems went away, both the changes not being visible and the soft page faults. It's an insidious problem because the only thing affected is the pointer you get from the ScanLine property, since (assumption) it points to a temporary copy of the bitmap. That must be why the subsequent TextOut() call still worked: it operated on the original copy of the bitmap.

clip.PixelFormat := pf24bit; // The missing code line that fixes the problem.
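
One way to check the fix in isolation is a small console program that writes a pixel through ScanLine and reads it back through the canvas. This is only a sketch: the program name and the 2 x 2 bitmap are made up for illustration, and it assumes the Graphics unit can be used from a console application. If the write lands in the bitmap's own pixel data, the value read back should be pure blue.

```pascal
program ScanLineVisibility; // hypothetical name, for illustration only
{$APPTYPE CONSOLE}

uses
  Windows, SysUtils, Graphics;

var
  bmp: TBitmap;
  p: PByte;
begin
  bmp := TBitmap.Create;
  try
    bmp.Width := 2;
    bmp.Height := 2;
    // Fix the format first, so ScanLine is requested on a 24-bit bitmap.
    bmp.PixelFormat := pf24bit;

    // pf24bit stores each pixel as three bytes in blue, green, red order.
    p := bmp.ScanLine[0];
    p^ := 255; Inc(p); // blue
    p^ := 0;   Inc(p); // green
    p^ := 0;           // red

    // Read the same pixel back through GDI; TColor is laid out as $00BBGGRR.
    Writeln(IntToHex(bmp.Canvas.Pixels[0, 0], 8));
  finally
    bmp.Free;
  end;
end.
```
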

Here's the code block I've been referring to:

function IntToByte(i: Integer): Byte;
begin
 if i > 255 then
   Result := 255
 else if i < 0 then
   Result := 0
 else
   Result := i;
end;

// ---------------------------------------------------------------

procedure brightnessTurboBoost(var clip: TBitmap; rangeExpansionPowerOf2: integer; shiftValue: Byte);
var
   p0: PByte;
   x,y: Integer;
begin
   if (rangeExpansionPowerOf2 = 0) and (shiftValue = 0) then
       exit; // These parameter settings will not change the pixel values.

   for y := 0 to clip.Height-1 do
   begin
       p0 := clip.scanline[y];

       // Can't just do the whole buffer as a big block of bytes since the
       //  individual scan lines may be padded for CPU alignment.
       for x := 0 to (clip.Width - 1) * 3 do
       begin
           if rangeExpansionPowerOf2 >= 1 then
               p0^ := IntToByte((p0^ shl rangeExpansionPowerOf2) + shiftValue)
           else
               p0^ := IntToByte(p0^ + shiftValue);

           Inc(p0);
       end;
   end;
end;

Solution

There are a few things to say about this code snippet.

  1. First of all, you are using the ScanLine property of the TBitmap class. I have not been dealing with Delphi for many years, so I might be wrong about this, but I am under the impression that ScanLine is not actually a thin accessor. It might be internally hiding things that can dramatically affect performance, such as "if he wants to access the bits of the image, then we first have to convert it to a DIB before returning pointers". So something that looks so simple might turn out to be a killer.

  2. "if rangeExpansionPowerOf2 >= 1 then" in the inner loop body? You don't really want to make this comparison for every byte. Either write two separate functions or duplicate the whole loop in two versions, one for zero and one for non-zero rangeExpansionPowerOf2, and do this "if" only once.

  3. "for ... to (clip.Width - 1) * 3 do": I am not really sure that Delphi optimizes the upper bound evaluation so that it happens only once. You might be doing that multiplication three times for every pixel, when you could do it only once for the whole image.

  4. For top performance, IntToByte would definitely be implemented in MMX to avoid the ifs and process multiple bytes at once.

Still, since you say the images are only 352x288, I would suspect that #1 is what is ruining the performance.
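
The points above can be combined into one routine. The following is a minimal sketch, not the asker's final code: the unit name, BrightnessBoostLUT, ClampToByte and TByteLookup are hypothetical names, and a 256-entry lookup table is swapped in for the MMX approach of point 4, since each output byte depends only on the input byte and the two parameters. The branch from point 2 is then taken 256 times per call instead of once per byte. As far as I know, Delphi evaluates the bounds of a for loop once before the loop starts, so hoisting the bound (point 3) is mostly for clarity. The sketch also uses Width * 3 - 1 as the last byte index, because the original bound of (clip.Width - 1) * 3 stops two bytes short of the end of each row.

```pascal
unit BrightnessLut; // hypothetical unit name

interface

uses
  Windows, Graphics;

procedure BrightnessBoostLUT(clip: TBitmap; rangeExpansionPowerOf2: Integer;
  shiftValue: Byte);

implementation

type
  TByteLookup = array[Byte] of Byte;

// Clamp an Integer into 0..255, same behaviour as IntToByte in the question.
function ClampToByte(i: Integer): Byte;
begin
  if i > 255 then
    Result := 255
  else if i < 0 then
    Result := 0
  else
    Result := i;
end;

procedure BrightnessBoostLUT(clip: TBitmap; rangeExpansionPowerOf2: Integer;
  shiftValue: Byte);
var
  lut: TByteLookup;
  p: PByte;
  v, x, y, lastByte: Integer;
begin
  if (rangeExpansionPowerOf2 = 0) and (shiftValue = 0) then
    Exit; // These parameter settings will not change the pixel values.

  // The fix from the question's update: set the format before using ScanLine.
  clip.PixelFormat := pf24bit;

  // Build the table once per call; the branch runs 256 times, not per byte.
  for v := 0 to 255 do
    if rangeExpansionPowerOf2 >= 1 then
      lut[v] := ClampToByte((v shl rangeExpansionPowerOf2) + shiftValue)
    else
      lut[v] := ClampToByte(v + shiftValue);

  // Index of the last colour byte in a row; padding bytes are left untouched.
  lastByte := clip.Width * 3 - 1;

  for y := 0 to clip.Height - 1 do
  begin
    p := clip.ScanLine[y];
    for x := 0 to lastByte do
    begin
      p^ := lut[p^];
      Inc(p);
    end;
  end;
end;

end.
```

ScanLine is still fetched once per row here, as in the question, because rows may be padded for alignment.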

Licensed under: CC-BY-SA with attribution