Introduction
As you probably know, especially if you found this article through Google :), the pixel access performance of the ATL CImage class is terrible. Although this is a fairly common problem, I have not found even a simple solution to it anywhere (and this one is certainly not the best :) ). Based on the Bitmap usage extension library, I've created a simple wrapper class that provides pixel access by reading and writing the bitmap bits directly. With minor refactoring, it could easily be separated out to provide pixel access to standalone DIBs as well.
The class is designed to make it easy to optimize or extend projects that already use the CImage class, and to use CImage in new projects without worrying about pixel access performance.
The wrapper class public interface and usage
class CImagePixelAccessOptimizer
{
public:
    CImagePixelAccessOptimizer( CImage* _image );
    CImagePixelAccessOptimizer( const CImage* _image );
    ~CImagePixelAccessOptimizer();

    COLORREF GetPixel( int _x, int _y ) const;
    void SetPixel( int _x, int _y, const COLORREF _color );
};
If you need fast per-pixel access in your code, all you have to do is create a temporary stack variable of the CImagePixelAccessOptimizer class and then change or add calls to the SetPixel and GetPixel methods so that they go through the temporary optimizer object rather than the CImage object directly. An example from my turf is a trivial image rotation:
CImagePixelAccessOptimizer tempImageOpt( pTmpImage );
CImagePixelAccessOptimizer currImageOpt( pCurrentImage );
for( unsigned x=0; x < uOrgWidth; ++x )
{
    for( unsigned y=0; y < uOrgHeight; ++y )
    {
        // Source pixel (x, y) lands at (height - y - 1, x) in the rotated image.
        tempImageOpt.SetPixel( uOrgHeight - y - 1, x, currImageOpt.GetPixel( x, y ) );
    }
}
It's probably not the fastest way to rotate images, but it works, and shows the point quite well.
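To make the coordinate mapping in the loop easier to check, here is a minimal standalone sketch of the same 90-degree clockwise rotation on a plain row-major buffer. It does not use CImage or the wrapper at all; the RotateClockwise name and the std::vector storage are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: rotates a row-major width x height image by 90 degrees
// clockwise. The result is height pixels wide and width pixels tall.
template <typename Pixel>
std::vector<Pixel> RotateClockwise( const std::vector<Pixel>& source,
                                    std::size_t width, std::size_t height )
{
    assert( source.size() == width * height );
    std::vector<Pixel> rotated( source.size() );
    for( std::size_t y = 0; y < height; ++y )
    {
        for( std::size_t x = 0; x < width; ++x )
        {
            // Same mapping as in the article: (x, y) -> (height - y - 1, x).
            // The destination row is x, and each destination row holds height pixels.
            rotated[ x * height + ( height - y - 1 ) ] = source[ y * width + x ];
        }
    }
    return rotated;
}
```

Here the inner loop walks along a source row, which tends to be friendlier to the cache than walking down a column, but the mapping is identical.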
Some internals
The class encapsulates simple methods, found here and there, that let you access pixel information directly in the DIB table(s) based on their native format. Using a temporary class object keeps the original code as simple as possible while still leaving room for optimization. Everything that stays constant between GetPixel and SetPixel calls is computed and stored in the constructor of the CImagePixelAccessOptimizer class. Calculating the row width of the DIB table, and obtaining the palette table and image dimensions, is done only once.
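As a rough illustration of the kind of value that can be computed once up front, the sketch below calculates the byte width of a DIB row, which is always padded to a multiple of 4 bytes. The DibLayout struct and both function names are hypothetical and are not taken from the wrapper; the actual constructor may obtain these values differently (for example, from CImage itself).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the per-image values cached by the constructor.
struct DibLayout
{
    int width;             // pixels per row
    int height;            // number of rows
    int bitCount;          // bits per pixel: 1, 4, 8, 16, 24 or 32
    std::size_t rowBytes;  // bytes per row, including padding
};

// DIB rows are padded so that each row starts on a 32-bit boundary.
inline std::size_t DibRowBytes( int width, int bitCount )
{
    assert( width > 0 && bitCount > 0 );
    return ( ( static_cast<std::size_t>( width ) * bitCount + 31 ) / 32 ) * 4;
}

// Computes everything once, so per-pixel code only has to read the fields.
inline DibLayout MakeLayout( int width, int height, int bitCount )
{
    return DibLayout{ width, height, bitCount, DibRowBytes( width, bitCount ) };
}
```

For example, a 24-bit image that is 5 pixels wide needs 15 bytes of pixel data per row, which is padded to 16.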
Thanks to this, the GetPixel and SetPixel methods can be really fast, coming down to a single switch statement and one or two simple table lookups.
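inline COLORREF CImagePixelAccessOptimizer::GetPixel( int _x, int _y ) const
{
    ASSERT( PositionOK( _x, _y ) );
    // Reference value from CImage; compiled in only when verification is enabled.
    FOR_GET_SET_PIXEL_ASSERT( const COLORREF color = m_image->GetPixel( _x, _y ) );
    const RGBQUAD* rgbResult = NULL;
    RGBQUAD tempRgbResult = { 0, 0, 0, 0 };
    switch( m_bitCnt )
    {
    case 1:
        {
            // Eight pixels per byte; the leftmost pixel is the most significant bit.
            // The selected bit is shifted down so the palette index is 0 or 1.
            const BYTE packed = *(m_bits + m_rowBytes*_y + _x/8);
            rgbResult = &m_colors[ (packed >> (7 - _x%8)) & 0x01 ];
        }
        break;
    case 4:
        {
            // Two pixels per byte; the even pixel is the high nibble, which has
            // to be shifted down to give a palette index in the range 0..15.
            const BYTE packed = *(m_bits + m_rowBytes*_y + _x/2);
            rgbResult = &m_colors[ (_x&1) ? (packed & 0x0f) : ((packed >> 4) & 0x0f) ];
        }
        break;
    case 8:
        // One byte per pixel, used directly as the palette index.
        rgbResult = &m_colors[ *(m_bits + m_rowBytes*_y + _x) ];
        break;
    case 16:
        {
            // 5-5-5 layout; each channel is kept here as its raw 5-bit value.
            WORD dummy = *(LPWORD)(m_bits + m_rowBytes*_y + _x*2);
            tempRgbResult.rgbBlue = (BYTE)(0x001F & dummy);
            tempRgbResult.rgbGreen = (BYTE)(0x001F & (dummy >> 5));
            tempRgbResult.rgbRed = (BYTE)(0x001F & (dummy >> 10));
            rgbResult = &tempRgbResult;
        }
        break;
    case 24:
        // Only the blue, green and red members are read below, so the
        // reserved member past the third byte is never touched.
        rgbResult = (LPRGBQUAD)(m_bits + m_rowBytes*_y + _x*3);
        break;
    case 32:
        rgbResult = (LPRGBQUAD)(m_bits + m_rowBytes*_y + _x*4);
        break;
    default:
        ASSERT( false );
        // Unsupported bit depth: fall back to black instead of dereferencing NULL.
        rgbResult = &tempRgbResult;
        break;
    }
    const COLORREF rgbResultColorRef = RGB( rgbResult->rgbRed,
        rgbResult->rgbGreen, rgbResult->rgbBlue );
    GET_SET_PIXEL_ASSERT( rgbResultColorRef == color );
    return rgbResultColorRef;
}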
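The packed 1-bit and 4-bit formats are the easiest ones to get wrong, because the byte fetched from the row holds several pixels and the palette index has to be masked and shifted down before it can be used. The sketch below isolates just that step so it can be tested on its own; the two helper names are hypothetical and are not part of the wrapper class.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: palette index of pixel x in a 1-bit-per-pixel row.
// Eight pixels share a byte, with the leftmost pixel in the most significant bit.
inline unsigned PaletteIndex1bpp( const std::uint8_t* row, int x )
{
    assert( x >= 0 );
    return ( row[ x / 8 ] >> ( 7 - x % 8 ) ) & 0x01u;
}

// Hypothetical helper: palette index of pixel x in a 4-bit-per-pixel row.
// Two pixels share a byte, with the even pixel in the high nibble.
inline unsigned PaletteIndex4bpp( const std::uint8_t* row, int x )
{
    assert( x >= 0 );
    const std::uint8_t packed = row[ x / 2 ];
    return ( x & 1 ) ? ( packed & 0x0Fu ) : ( ( packed >> 4 ) & 0x0Fu );
}
```

Both helpers always return a value that fits the palette (0..1 and 0..15 respectively), which is the property worth checking whenever colors show up wrong on paletted images.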
Debugging
If you run into problems where colors are set incorrectly or in the wrong places, try uncommenting the following define:
#define ENABLE_GET_SET_PIXEL_VERIFICATION
It enables checks that compare the optimized results with the behavior of the CImage class itself - please report any issues that you find.
The code used for the checks can be seen in the GetPixel example above. If ENABLE_GET_SET_PIXEL_VERIFICATION is defined, then GET_SET_PIXEL_ASSERT becomes a "standard" ASSERT statement ;), and FOR_GET_SET_PIXEL_ASSERT becomes just the enclosed statement. If ENABLE_GET_SET_PIXEL_VERIFICATION is not defined, then both macros expand to empty statements. Thanks to this, you can enable the additional code, and the assertions that use it, with a single define while keeping the code clean and simple (no three-line #ifdefs).
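The macro definitions themselves are not shown in the article, so the following is only a guess at how such a pair could be written. It uses the standard assert in place of the MFC ASSERT so that it compiles anywhere, and FastDouble is a made-up function that only demonstrates the calling pattern.

```cpp
#include <cassert>

// Uncomment to compile the verification code in.
// #define ENABLE_GET_SET_PIXEL_VERIFICATION

#ifdef ENABLE_GET_SET_PIXEL_VERIFICATION
    // The enclosed statement is emitted as-is, and the condition is asserted.
    #define FOR_GET_SET_PIXEL_ASSERT( statement ) statement
    #define GET_SET_PIXEL_ASSERT( condition )     assert( condition )
#else
    // Both macros expand to nothing, leaving only an empty statement behind.
    #define FOR_GET_SET_PIXEL_ASSERT( statement )
    #define GET_SET_PIXEL_ASSERT( condition )
#endif

// Hypothetical example: the "slow" reference value is only computed when
// verification is enabled, and is then compared with the "fast" result.
inline int FastDouble( int value )
{
    FOR_GET_SET_PIXEL_ASSERT( const int reference = value + value );
    const int result = value * 2;
    GET_SET_PIXEL_ASSERT( result == reference );
    return result;
}
```

Because the reference variable is declared inside one macro and used inside the other, it can never be referenced by accident when verification is switched off.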
By default, CImagePixelAccessOptimizer does not use the additional verification, as it would bring us back to where we started performance-wise :).
Success story ;)
Using this method, I have removed practically all pixel access performance issues from ImageViewer, my simple image viewing and bad-pixel detecting program. Pixel access performance went from a major usability issue to a non-issue in a matter of hours - and now, it will take you seconds. :)
To be honest, it's probably not the best idea to use the built-in CImage class at all, but if you're already there or don't want to install or link third-party libraries into your project, then this simple wrapper, located in a single header, may be just the thing you need. You get it for free with one exception :) - while running the code from the Bitmap usage extension library, I found an issue that caused a "memory can't be read" error: the code copied a whole RGBQUAD structure from the end of a 24-bit DIB table, so the reserved member of the RGBQUAD structure fell outside the memory allocated for the DIB. If you find anything like this, or images on which the code does not work correctly, please let me know.
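One way to avoid that kind of over-read is to fetch the three color bytes of a 24-bit pixel individually instead of copying a 4-byte structure. The sketch below shows the idea on a raw buffer; the Rgb struct and the ReadPixel24 name are assumptions made for this example and do not come from the wrapper or from the Bitmap usage extension library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical result type with exactly three channels.
struct Rgb
{
    std::uint8_t red;
    std::uint8_t green;
    std::uint8_t blue;
};

// Reads one pixel from a 24-bit DIB whose rows are rowBytes bytes apart.
// Only three bytes are touched, so the last pixel of the buffer is safe to read.
inline Rgb ReadPixel24( const std::uint8_t* bits, std::size_t rowBytes, int x, int y )
{
    assert( x >= 0 && y >= 0 );
    const std::uint8_t* pixel = bits
        + rowBytes * static_cast<std::size_t>( y )
        + static_cast<std::size_t>( x ) * 3;
    // 24-bit DIB pixels are stored in blue, green, red order.
    return Rgb{ pixel[ 2 ], pixel[ 1 ], pixel[ 0 ] };
}
```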