Introduction
This article explains how to implement Sobel edge detection in C++ on the CPU and in a pixel shader on the GPU, and how to use the resulting edge information to give an image a cartoon effect.
Input image (TajMahal.bmp)
Screenshot of edge detection output of TajMahal.bmp
Screenshot of Cartoon effect output of TajMahal.bmp
Another screenshot of Cartoon effect
Background
Edge detection is a basic image-processing operation that identifies the points in a digital image where the brightness changes sharply or, more formally, has discontinuities. Sobel edge detection applies the Sobel operator, described at http://en.wikipedia.org/wiki/Sobel_operator.
After implementing edge detection, I found that combining the input image with its edge image is a simple way to produce a cartoon effect.
Using the code
We first look at the C++ implementation of Sobel edge detection. Two 3 × 3 kernels are applied to each pixel: one finds the color change (gradient) in the X direction, and the other finds the gradient in the Y direction.
The following sections show how these kernels are applied to the sample image below.
Sample image used to explain the X directional and Y directional gradient calculation
X direction gradient calculation
The code below finds the change in the X direction. It computes the weighted sum of the 3 × 3 neighborhood around the pixel, with the GX kernel supplying the weights.
GX[0][0] = -1; GX[0][1] = 0; GX[0][2] = 1;
GX[1][0] = -2; GX[1][1] = 0; GX[1][2] = 2;
GX[2][0] = -1; GX[2][1] = 0; GX[2][2] = 1;
for(I=-1; I<=1; I++)
{
    for(J=-1; J<=1; J++)
    {
        sumX = sumX + (int)( (*(stOriginalImage_i.pData + nX + I +
                       (nY + J)*stOriginalImage_i.nCols)) * GX[I+1][J+1]);
    }
}
Here is the output image of the X directional gradient calculation:
The output of the X directional gradient calculation
Y direction gradient calculation
The code below finds the change in the Y direction. It computes the weighted sum of the 3 × 3 neighborhood around the pixel, with the GY kernel supplying the weights.
GY[0][0] = 1; GY[0][1] = 2; GY[0][2] = 1;
GY[1][0] = 0; GY[1][1] = 0; GY[1][2] = 0;
GY[2][0] = -1; GY[2][1] = -2; GY[2][2] = -1;
for(I=-1; I<=1; I++)
{
    for(J=-1; J<=1; J++)
    {
        sumY = sumY + (int)( (*(stOriginalImage_i.pData + nX + I +
                       (nY + J)*stOriginalImage_i.nCols)) * GY[I+1][J+1]);
    }
}
Here is the output image of the Y directional gradient calculation:
Output image of the Y directional gradient calculation
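As a compact illustration of the two calculations above, the following sketch applies an arbitrary 3 × 3 kernel at one interior pixel of a single-channel image. The name applyKernel3x3, the constants kSobelX and kSobelY, and the use of std::vector are assumptions made for this example and are not part of the article's source. The sketch indexes the kernel as [row][column]. The article's loops index it as [I+1][J+1] with I as the column offset, which applies the kernel transposed, so the individual sums can differ from the sketch while the combined magnitude stays the same.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: weighted sum of the 3x3 neighborhood around (x, y).
// The kernel is indexed [row][column], i.e. [dy + 1][dx + 1].
int applyKernel3x3(const std::vector<std::uint8_t>& image, int cols, int rows,
                   int x, int y, const int kernel[3][3])
{
    // Only interior pixels have a complete 3x3 neighborhood.
    assert(x >= 1 && x < cols - 1 && y >= 1 && y < rows - 1);
    int sum = 0;
    for (int dy = -1; dy <= 1; ++dy)
    {
        for (int dx = -1; dx <= 1; ++dx)
        {
            sum += image[(y + dy) * cols + (x + dx)] * kernel[dy + 1][dx + 1];
        }
    }
    return sum;
}

// The two Sobel kernels, with the same values as GX and GY in the article.
const int kSobelX[3][3] = { { -1, 0, 1 }, { -2, 0, 2 }, { -1, 0, 1 } };
const int kSobelY[3][3] = { {  1, 2, 1 }, {  0, 0, 0 }, { -1, -2, -1 } };
```

For an image that is dark on the left and bright on the right, the X kernel produces a large sum at the center pixel while the Y kernel produces zero.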
Final image calculation
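Finally, the X and Y directional gradient values are combined into a single gradient magnitude with the following equation:
$G = \sqrt{G_x^2 + G_y^2}$
where $G_x$ and $G_y$ are the X and Y directional sums (sumX and sumY in the code). The corresponding code is: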
SUM = sqrt(double(sumX * sumX) + double(sumY * sumY));
Here is the output:
Output image of the test bitmap
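The combination step can be isolated into a small function. The sketch below is one possible form; the name gradientMagnitude is hypothetical, and the clamp to 255 anticipates the range check that the full FindEdge code performs before writing a pixel.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: combine two directional sums into a clamped 8-bit magnitude.
std::uint8_t gradientMagnitude(long sumX, long sumY)
{
    const double magnitude = std::sqrt(static_cast<double>(sumX) * sumX +
                                       static_cast<double>(sumY) * sumY);
    // A square root is never negative, so only the upper bound needs clamping.
    assert(magnitude >= 0.0);
    return static_cast<std::uint8_t>(std::min(magnitude, 255.0));
}
```

Because both sums are squared, their signs do not affect the result.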
Edge Detection of RGB Image
This logic is used to find out the gradient for a single component image. I used this algorithm for each component of the input image [R, G, B] separately and combined the output together.
BYTE* pbyTemp = pbyBMP_io;
for (int n= 0; n < nWidth * nHeight; n++)
{
    originalImageR.pData[n] = *pbyTemp++;
    originalImageG.pData[n] = *pbyTemp++;
    originalImageB.pData[n] = *pbyTemp++;
}
FindEdge( originalImageR, OutputRed );
FindEdge( originalImageG, OutputGreen );
FindEdge( originalImageB, OutputBlue );
pbyTemp = pbyBMP_io;
for (int n= 0; n < nWidth * nHeight; n++)
{
    *pbyTemp++ = OutputRed.pData[n];
    *pbyTemp++ = OutputGreen.pData[n];
    *pbyTemp++ = OutputBlue.pData[n];
}
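The split and merge steps around the three FindEdge calls can be sketched independently of the article's ImageInfo_t type. In the example below, the Planes struct and the functions splitChannels and mergeChannels are hypothetical names, and std::vector replaces the raw buffers so that no manual memory management is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for three separate color planes.
struct Planes
{
    std::vector<std::uint8_t> c0, c1, c2;
};

// De-interleave a buffer of 3-byte pixels into one plane per component.
Planes splitChannels(const std::vector<std::uint8_t>& interleaved)
{
    assert(interleaved.size() % 3 == 0);
    const std::size_t pixels = interleaved.size() / 3;
    Planes p;
    p.c0.reserve(pixels);
    p.c1.reserve(pixels);
    p.c2.reserve(pixels);
    for (std::size_t n = 0; n < pixels; ++n)
    {
        p.c0.push_back(interleaved[3 * n]);
        p.c1.push_back(interleaved[3 * n + 1]);
        p.c2.push_back(interleaved[3 * n + 2]);
    }
    return p;
}

// Interleave three equally sized planes back into 3-byte pixels.
std::vector<std::uint8_t> mergeChannels(const Planes& p)
{
    assert(p.c0.size() == p.c1.size() && p.c1.size() == p.c2.size());
    std::vector<std::uint8_t> out;
    out.reserve(p.c0.size() * 3);
    for (std::size_t n = 0; n < p.c0.size(); ++n)
    {
        out.push_back(p.c0[n]);
        out.push_back(p.c1[n]);
        out.push_back(p.c2[n]);
    }
    return out;
}
```

Splitting and then merging without any processing in between returns the original buffer, which makes a convenient sanity check.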
Here is the entire code of the gradient calculation of a component:
void EdgeDetectCPU::FindEdge( ImageInfo_t& stOriginalImage_i,
                              ImageInfo_t& stEdgeImage_o )
{
    int nX, nY, I, J;
    long sumX, sumY;
    int SUM;
    int GX[3][3];
    int GY[3][3];
    // The output buffer is allocated here and is not freed in this function.
    stEdgeImage_o.pData =
        new BYTE[stOriginalImage_i.nCols * stOriginalImage_i.nRows];
    // Sobel kernels for the two gradient directions.
    GX[0][0] = -1; GX[0][1] = 0; GX[0][2] = 1;
    GX[1][0] = -2; GX[1][1] = 0; GX[1][2] = 2;
    GX[2][0] = -1; GX[2][1] = 0; GX[2][2] = 1;
    GY[0][0] = 1; GY[0][1] = 2; GY[0][2] = 1;
    GY[1][0] = 0; GY[1][1] = 0; GY[1][2] = 0;
    GY[2][0] = -1; GY[2][1] = -2; GY[2][2] = -1;
    for(nY=0; nY<=(stOriginalImage_i.nRows-1); nY++)
    {
        for(nX=0; nX<=(stOriginalImage_i.nCols-1); nX++)
        {
            sumX = 0;
            sumY = 0;
            SUM = 0;
            // Border pixels have no complete 3 x 3 neighborhood, so SUM stays 0 there.
            if( !(nX==0 || nX==stOriginalImage_i.nCols-1 ||
                  nY==0 || nY==stOriginalImage_i.nRows-1))
            {
                // X directional gradient.
                for(I=-1; I<=1; I++)
                {
                    for(J=-1; J<=1; J++)
                    {
                        sumX = sumX + (int)( (*(stOriginalImage_i.pData + nX + I +
                                       (nY + J)*stOriginalImage_i.nCols)) * GX[I+1][J+1]);
                    }
                }
                // Y directional gradient.
                for(I=-1; I<=1; I++)
                {
                    for(J=-1; J<=1; J++)
                    {
                        sumY = sumY + (int)( (*(stOriginalImage_i.pData + nX + I +
                                       (nY + J)*stOriginalImage_i.nCols)) * GY[I+1][J+1]);
                    }
                }
                // Gradient magnitude, truncated to an integer.
                SUM = (int)sqrt(double(sumX * sumX) + double(sumY * sumY));
            }
            // Clamp to the 8-bit range, then invert so that edges appear dark.
            if(SUM>255) SUM=255;
            if(SUM<0) SUM=0;
            *(stEdgeImage_o.pData + nX + nY * stOriginalImage_i.nCols) =
                255 - (unsigned char)(SUM);
        }
    }
}
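For experimenting outside the application, the same per-component pass can be written against std::vector, which removes the manual allocation. The following is a minimal sketch under that assumption. The function name sobelEdges is hypothetical, the kernels are indexed [row][column], and both sums are accumulated in one pass over the neighborhood instead of two.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sketch of a whole-image Sobel pass over one component. Returns the inverted
// edge image (255 = no edge), matching the 255 - SUM output of the article.
std::vector<std::uint8_t> sobelEdges(const std::vector<std::uint8_t>& src,
                                     int cols, int rows)
{
    assert(cols > 0 && rows > 0);
    assert(src.size() == static_cast<std::size_t>(cols) * rows);
    static const int GX[3][3] = { { -1, 0, 1 }, { -2, 0, 2 }, { -1, 0, 1 } };
    static const int GY[3][3] = { {  1, 2, 1 }, {  0, 0, 0 }, { -1, -2, -1 } };

    // Border pixels have no complete neighborhood and keep the value 255.
    std::vector<std::uint8_t> dst(src.size(), 255);
    for (int y = 1; y < rows - 1; ++y)
    {
        for (int x = 1; x < cols - 1; ++x)
        {
            long sumX = 0;
            long sumY = 0;
            for (int dy = -1; dy <= 1; ++dy)
            {
                for (int dx = -1; dx <= 1; ++dx)
                {
                    const int pixel = src[(y + dy) * cols + (x + dx)];
                    sumX += pixel * GX[dy + 1][dx + 1];
                    sumY += pixel * GY[dy + 1][dx + 1];
                }
            }
            const double magnitude = std::sqrt(static_cast<double>(sumX) * sumX +
                                               static_cast<double>(sumY) * sumY);
            const int clamped = static_cast<int>(std::min(magnitude, 255.0));
            dst[y * cols + x] = static_cast<std::uint8_t>(255 - clamped);
        }
    }
    return dst;
}
```

A uniform image contains no edges, so every output pixel is 255, while a sharp vertical step produces 0 at the pixels next to the step.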
Cartoon Effect Implementation
With the edge information available, the cartoon effect is easy to create: combine the edge image with its corresponding input image. The following figure illustrates the creation of a cartoon effect with the edge image.
Cartoon effect created by combining the edge image and input image
The following code shows the cartoon effect implementation. EdgeDetectCPU::FindEdge is modified to produce either a cartoon effect image or an edge image, depending on the m_bCartoonEffect flag.
if(SUM>255) SUM=255;
if(SUM<0) SUM=0;
int nOut = 0;
if( m_bCartoonEffect )
{
    // Equal-weight blend of the edge strength and the original pixel.
    nOut = (int)( (SUM * 0.5) + (*(stOriginalImage_i.pData + nX +
                  nY * stOriginalImage_i.nCols) * 0.5) );
}
else
{
    // Plain edge image: invert so that edges appear dark.
    nOut = 255 - (unsigned char)(SUM);
}
*(stEdgeImage_o.pData + nX + nY * stOriginalImage_i.nCols) = nOut;
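The two output branches can be tried out on individual values with a small stand-alone function. This is only a sketch; the name outputPixel and its parameters are hypothetical, and it assumes the edge strength has already been clamped to 0..255 as in the code above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper mirroring the two output branches of the modified FindEdge.
std::uint8_t outputPixel(int edgeStrength, std::uint8_t original, bool cartoonEffect)
{
    // The caller is expected to have clamped the edge strength already.
    assert(edgeStrength >= 0 && edgeStrength <= 255);
    if (cartoonEffect)
    {
        // Equal-weight blend, truncated to an integer.
        return static_cast<std::uint8_t>(edgeStrength * 0.5 + original * 0.5);
    }
    // Plain edge image: strong edges become dark.
    return static_cast<std::uint8_t>(255 - edgeStrength);
}
```

For example, an edge strength of 100 on an original pixel of 50 gives 75 in cartoon mode and 155 in edge mode.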
Pixel Shader Implementation
This logic is straightforward to port to a pixel shader: the gradient calculation is the same function applied to every pixel of the image, and the result for one pixel does not depend on the output computed for any other pixel (a pixel shader is difficult to write when one pixel's output depends on another's). The two outer for loops that iterate over every pixel of the bitmap are therefore removed in the shader, because the shader itself is executed once per pixel. The remaining task is to convert some data types to shader-compatible types. For example, the GLSL version used here does not support two-dimensional arrays, but it provides matrix types that can be indexed like a two-dimensional array. Here is the declaration of the X and Y gradient matrices in the shader:
mat3 GX = mat3( -1.0, 0.0, 1.0,
                -2.0, 0.0, 2.0,
                -1.0, 0.0, 1.0 );
mat3 GY = mat3(  1.0,  2.0,  1.0,
                 0.0,  0.0,  0.0,
                -1.0, -2.0, -1.0 );
The texture coordinates received in the pixel shader are used to derive the nX and nY values of the C++ for loops:
float fXIndex = gl_TexCoord[0].s * fWidth;
float fYIndex = gl_TexCoord[0].t * fHeight;
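The relationship between a normalized texture coordinate and a pixel position can be checked on the CPU. The sketch below assumes the image is drawn at one texel per screen pixel, so that a fragment's coordinate falls on a texel center. The function names toPixelPosition and texelCenter are hypothetical.

```cpp
#include <cassert>

// Pixel position obtained by scaling a coordinate in [0, 1] by the image size,
// as the shader does with gl_TexCoord[0].s * fWidth.
float toPixelPosition(float texCoord, float size)
{
    return texCoord * size;
}

// Normalized coordinate of the center of texel `index` in a texture that is
// `size` texels wide.
float texelCenter(int index, float size)
{
    assert(size > 0.0f);
    return (static_cast<float>(index) + 0.5f) / size;
}
```

Under this assumption, the center of texel 2 in a 4-texel-wide texture lies at 0.625, which scales back to the position 2.5 and truncates to index 2.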
The complete pixel shader for edge detection and the cartoon effect is:
uniform sampler2D ImageTexture;
uniform float fWidth;
uniform float fHeight;
uniform float fCartoonEffect;

void main()
{
    mat3 GX = mat3( -1.0, 0.0, 1.0,
                    -2.0, 0.0, 2.0,
                    -1.0, 0.0, 1.0 );
    mat3 GY = mat3(  1.0,  2.0,  1.0,
                     0.0,  0.0,  0.0,
                    -1.0, -2.0, -1.0 );
    vec4 fSumX = vec4( 0.0 );
    vec4 fSumY = vec4( 0.0 );
    vec4 fTotalSum = vec4( 0.0 );

    // Pixel position of this fragment, derived from its texture coordinates.
    float fXIndex = gl_TexCoord[0].s * fWidth;
    float fYIndex = gl_TexCoord[0].t * fHeight;

    // Skip the border pixels, as in the C++ version.
    if( ! ( fYIndex < 1.0 || fYIndex > fHeight - 1.0 ||
            fXIndex < 1.0 || fXIndex > fWidth - 1.0 ))
    {
        // X directional gradient, computed for all color components at once.
        for(float I=-1.0; I<=1.0; I = I + 1.0)
        {
            for(float J=-1.0; J<=1.0; J = J + 1.0)
            {
                float fTempX = ( fXIndex + I + 0.5 ) / fWidth ;
                float fTempY = ( fYIndex + J + 0.5 ) / fHeight ;
                vec4 fTempSumX = texture2D( ImageTexture, vec2( fTempX, fTempY ));
                fSumX = fSumX + ( fTempSumX * vec4( GX[int(I+1.0)][int(J+1.0)] ));
            }
        }
        // Y directional gradient.
        for(float I=-1.0; I<=1.0; I = I + 1.0)
        {
            for(float J=-1.0; J<=1.0; J = J + 1.0)
            {
                float fTempX = ( fXIndex + I + 0.5 ) / fWidth ;
                float fTempY = ( fYIndex + J + 0.5 ) / fHeight ;
                vec4 fTempSumY = texture2D( ImageTexture, vec2( fTempX, fTempY ));
                fSumY = fSumY + ( fTempSumY * vec4( GY[int(I+1.0)][int(J+1.0)] ));
            }
        }
        // Gradient magnitude per color component.
        vec4 fTem = fSumX * fSumX + fSumY * fSumY;
        fTotalSum = sqrt( fTem );
    }

    if( 0.5 < fCartoonEffect )
    {
        // Cartoon effect: blend the edge strength with the original pixel.
        fTotalSum = mix( fTotalSum, texture2D( ImageTexture,
                         vec2( gl_TexCoord[0].s, gl_TexCoord[0].t)), 0.5);
    }
    else
    {
        // Edge image: invert so that edges appear dark.
        fTotalSum = vec4( 1.0 ) - fTotalSum;
    }
    gl_FragColor = fTotalSum;
}
The output of this shader is either the cartoon effect image or the edge image. The program is executed once for each pixel of the image and calculates the edge information for that pixel. In the final stage, the fCartoonEffect flag selects how the output color is produced. The application sets fCartoonEffect in the shader based on the state of the CartoonEffect check box.
if( 0.5 < fCartoonEffect )
{
    fTotalSum = mix( fTotalSum, texture2D( ImageTexture,
                     vec2( gl_TexCoord[0].s, gl_TexCoord[0].t)), 0.5);
}
else
{
    fTotalSum = vec4( 1.0,1.0,1.0,1.0) - fTotalSum;
}
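To see what this final stage computes, it can be emulated for a single color component in C++. GLSL defines mix(x, y, a) as x * (1 - a) + y * a; the functions mixChannel and finalChannel below are hypothetical names for a CPU-side sketch that works on normalized values, with the float flag compared against 0.5 in the same way as in the shader.

```cpp
#include <cassert>

// CPU-side emulation of GLSL's mix(x, y, a) = x * (1 - a) + y * a for one component.
float mixChannel(float x, float y, float a)
{
    return x * (1.0f - a) + y * a;
}

// Hypothetical per-component version of the shader's final stage.
float finalChannel(float edge, float original, float cartoonFlag)
{
    assert(edge >= 0.0f);
    if (0.5f < cartoonFlag)
    {
        // Cartoon effect: equal-weight blend of edge strength and original color.
        return mixChannel(edge, original, 0.5f);
    }
    // Edge image: invert the edge strength.
    return 1.0f - edge;
}
```

With an edge strength of 0.25 and an original value of 0.75, the cartoon branch yields 0.5 and the edge branch yields 0.75.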
No further explanation of this pixel shader should be needed, because the X and Y directional gradient calculations were already covered in the C++ section above.
Save Functionality
The Save button of the Edge Detection application saves the output of the edge detection algorithm to a BMP file. The Save functionality simply reads the pixels back from the screen using the glReadPixels() API.
Here is the code to read the pixel information from the screen:
pbyData = new BYTE[stImageArea.bottom * stImageArea.right * 3];
if( 0 == pbyData )
{
    AfxMessageBox( L"Memory Allocation failed" );
    return;
}
glReadPixels( 0, 0, stImageArea.right, stImageArea.bottom,
              GL_BGR_EXT, GL_UNSIGNED_BYTE, pbyData );
BMPLoader SaveBmp;
SaveBmp.SaveBMP( csFileName, stImageArea.right, stImageArea.bottom, pbyData );
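One detail worth checking when sizing the read-back buffer is row alignment. OpenGL pads each row written by glReadPixels() to the GL_PACK_ALIGNMENT boundary, which is 4 bytes by default, so a buffer of width * height * 3 bytes is only sufficient when width * 3 is a multiple of 4 or when the alignment has been set to 1 with glPixelStorei( GL_PACK_ALIGNMENT, 1 ). Whether the application changes this setting elsewhere is not shown here. The sketch below computes the required size; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Bytes occupied by one row when rows are padded to a multiple of `alignment` bytes.
std::size_t paddedRowBytes(std::size_t width, std::size_t bytesPerPixel,
                           std::size_t alignment)
{
    assert(alignment > 0);
    const std::size_t raw = width * bytesPerPixel;
    return (raw + alignment - 1) / alignment * alignment;
}

// Buffer size needed to read back a width x height region.
std::size_t readbackBufferBytes(std::size_t width, std::size_t height,
                                std::size_t bytesPerPixel, std::size_t alignment)
{
    return paddedRowBytes(width, bytesPerPixel, alignment) * height;
}
```

For a 5-pixel-wide, 3-byte-per-pixel row, the default alignment of 4 needs 16 bytes per row instead of 15.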
Building the file name for saving is slightly tricky: a separate counter is kept for each mode, and only the counter of the active mode is incremented.
CString csFileName;
csFileName.Format( L"EdgeDetection_%s_%d.bmp",
                   ( RUN_IN_CPU == m_nRunIn ) ? L"CPU" : L"GPU",
                   ( RUN_IN_CPU == m_nRunIn ) ? ++nCPUCount : ++nGPUCount );
CFileDialog SaveDlg( false, L"*.bmp", csFileName );
An example of different names generated in CPU and GPU mode: EdgeDetection_GPU_1.bmp, EdgeDetection_CPU_1.bmp.
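The naming scheme can be reproduced without MFC for testing purposes. The following sketch uses std::wstring in place of CString::Format; the function makeOutputName and its parameters are hypothetical and only mirror the behavior described above.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the CString::Format call, using std::wstring.
std::wstring makeOutputName(bool runInCpu, int& cpuCount, int& gpuCount)
{
    assert(cpuCount >= 0 && gpuCount >= 0);
    // Only the counter of the active mode is incremented.
    int& counter = runInCpu ? cpuCount : gpuCount;
    ++counter;
    return std::wstring(L"EdgeDetection_") + (runInCpu ? L"CPU" : L"GPU") +
           L"_" + std::to_wstring(counter) + L".bmp";
}
```

Saving once in each mode produces the index 1 for both names, because the two counters advance independently.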
Points of Interest
The 4-byte boundary padding of BMP data complicated the CPU edge detection, which needs special handling when the number of columns in the bitmap is not a multiple of 4. If edge detection is performed without accounting for the padding (the BMP file contains unexpected values in the padding pixels), the output image looks strange, because the buffer provided to EdgeDetectCPU contains unwanted data in the padded pixels. I therefore followed these steps to avoid the issue:
- Remove the unexpected pixels padded at the right end of the image to make the columns a multiple of 4.
- Find the edge image using EdgeDetect.EdgeDetect( m_nImageWidth, m_nImageHeight, pbyImage ).
- Add the padding pixels back (so the columns are again a multiple of 4).
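The first and last of these steps can be sketched as a pair of buffer conversions. The example below assumes that the padding is applied per row at the byte level, with each row rounded up to a multiple of 4 bytes. The function names bmpStride, stripPadding, and addPadding are hypothetical and do not appear in the article's source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Row length in bytes after rounding up to a 4-byte boundary.
std::size_t bmpStride(std::size_t width, std::size_t bytesPerPixel)
{
    return (width * bytesPerPixel + 3) / 4 * 4;
}

// Step 1: copy only the real pixel bytes of each row into a tightly packed buffer.
std::vector<std::uint8_t> stripPadding(const std::vector<std::uint8_t>& padded,
                                       std::size_t width, std::size_t height,
                                       std::size_t bytesPerPixel)
{
    const std::size_t stride = bmpStride(width, bytesPerPixel);
    const std::size_t rowBytes = width * bytesPerPixel;
    assert(padded.size() == stride * height);
    std::vector<std::uint8_t> packed;
    packed.reserve(rowBytes * height);
    for (std::size_t y = 0; y < height; ++y)
    {
        const std::uint8_t* row = padded.data() + y * stride;
        packed.insert(packed.end(), row, row + rowBytes);
    }
    return packed;
}

// Step 3: rebuild the padded layout, filling the padding bytes with zero.
std::vector<std::uint8_t> addPadding(const std::vector<std::uint8_t>& packed,
                                     std::size_t width, std::size_t height,
                                     std::size_t bytesPerPixel)
{
    const std::size_t stride = bmpStride(width, bytesPerPixel);
    const std::size_t rowBytes = width * bytesPerPixel;
    assert(packed.size() == rowBytes * height);
    std::vector<std::uint8_t> padded(stride * height, 0);
    for (std::size_t y = 0; y < height; ++y)
    {
        const std::uint8_t* row = packed.data() + y * rowBytes;
        std::copy(row, row + rowBytes, padded.data() + y * stride);
    }
    return padded;
}
```

The edge detection then runs on the packed buffer between the two calls, so the padding bytes never reach the gradient calculation.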
The edge detection in the second step is the per-component code already shown in the Edge Detection of RGB Image section, run on the buffer without padding.
BMPLoader::LoadBMP uses GDI+ to load an image file. It uses the Gdiplus::Bitmap class to read an image from a file, and m_pBitmap->LockBits() provides the image data.
Gdiplus::Bitmap* m_pBitmap = new Gdiplus::Bitmap(pFileName_i, true);
BYTE* pbyData = 0;
int nWidth = m_pBitmap->GetWidth();
int nHeight = m_pBitmap->GetHeight();
Gdiplus::Rect rect(0,0,nWidth,nHeight);
Gdiplus::BitmapData pBMPData;
m_pBitmap->LockBits( &rect,Gdiplus::ImageLockMode::ImageLockModeRead,
                     PixelFormat24bppRGB, &pBMPData );
if( 0 == pBMPData.Scan0 )
{
    return false;
}
// Allocate only after the check above, so the early return does not leak the buffer.
pbyData_o = new BYTE[nWidth * nHeight * 3];
nWidth_o = nWidth;
nHeight_o = nHeight;
BYTE* pSrc = (BYTE*)pBMPData.Scan0;
// Copy the rows in reverse vertical order into a tightly packed buffer.
int nVert = nHeight - 1;
// nVert must be allowed to reach 0; otherwise destination row 0 is never written.
for( int nY = 0; nY < nHeight && nVert >= 0; nY++ )
{
    BYTE* pDest = pbyData_o + ( nWidth * nVert * 3 );
    memcpy( pDest, pSrc, 3 * nWidth);
    nVert--;
    // Advance by the stride of the locked data, which includes any row padding.
    pSrc += pBMPData.Stride;
}
m_pBitmap->UnlockBits( &pBMPData );
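The row copy in this loader combines two things: a vertical flip and a conversion from rows that may be padded to rows that are tightly packed. GDI+ reports the distance between consecutive rows in BitmapData::Stride, which can be larger than 3 * nWidth. The sketch below shows the same copy on plain memory; the function copyRowsFlipped is a hypothetical name and does not depend on GDI+.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: copy `height` rows of `rowBytes` bytes from a source whose
// rows start `stride` bytes apart, writing them in reverse vertical order.
std::vector<std::uint8_t> copyRowsFlipped(const std::uint8_t* src, std::size_t stride,
                                          std::size_t rowBytes, std::size_t height)
{
    assert(src != nullptr && stride >= rowBytes);
    std::vector<std::uint8_t> dst(rowBytes * height);
    for (std::size_t y = 0; y < height; ++y)
    {
        // Source row y lands in destination row (height - 1 - y).
        std::memcpy(dst.data() + (height - 1 - y) * rowBytes,
                    src + y * stride, rowBytes);
    }
    return dst;
}
```

Every source row is copied, including the one that ends up in destination row 0, and the padding bytes at the end of each source row are skipped.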
History
- 19 July 2010 - Initial version.
- 28 July 2010 - Added the AppError class to display an error message if the shader creation fails.
- 14 August 2010 - Added Cartoon effect functionality.
- 20 August 2010 - Added JPEG, PNG image loading using GDI+.