Pixel Shader for Edge Detection and Cartoon Effect

Santhosh G_

19 Aug 2010

Implementation of Sobel Edge Detection and Cartoon Effect using pixel shader.

Introduction

This article explains how to implement the Sobel edge detection method, first in C++ on the CPU and then in a pixel shader. It also shows how to create a cartoon effect from an image and its edge information.

Input image (TajMahal.bmp)

Screenshot of edge detection output of TajMahal.bmp

Screenshot of Cartoon effect output of TajMahal.bmp

Another screenshot of Cartoon effect

Background

Edge detection is a basic image processing operation that identifies the points in a digital image at which the brightness changes sharply or, more formally, has discontinuities. Sobel edge detection applies the Sobel operator, as explained in http://en.wikipedia.org/wiki/Sobel_operator.

After implementing edge detection, I found a simple technique to make the cartoon effect image by combining the input image and its edge image.

Using the code

First, let us look at the C++ implementation of the Sobel edge detection method. Two 3 × 3 kernels are applied to each pixel: one finds the color change [gradient] in the X direction, and the other finds the color change [gradient] in the Y direction.
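For reference, the two kernels can be written as matrices. The values below are taken from the GX and GY arrays initialized in the code that follows; the matrix notation itself is only an illustration.

```latex
\mathrm{GX} =
\begin{bmatrix}
-1 & 0 & 1 \\
-2 & 0 & 2 \\
-1 & 0 & 1
\end{bmatrix},
\qquad
\mathrm{GY} =
\begin{bmatrix}
 1 &  2 &  1 \\
 0 &  0 &  0 \\
-1 & -2 & -1
\end{bmatrix}
```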

The following sections show how these kernels are applied to the sample image below, first in the X direction and then in the Y direction.

Sample image used to explain the X directional and Y directional gradient calculation

X direction gradient calculation

The code below finds the change in the X direction. It computes the weighted sum of the 3 × 3 block of pixels centered on the current pixel, using the kernel entries as weights.

C++

// Initialize the X direction gradient kernel, indexed as GX[row][column].
GX[0][0] = -1; GX[0][1] = 0; GX[0][2] = 1;
GX[1][0] = -2; GX[1][1] = 0; GX[1][2] = 2;
GX[2][0] = -1; GX[2][1] = 0; GX[2][2] = 1;

// Loop to find the change in the X direction.
// I is the column (X) offset and J is the row (Y) offset of the neighbor,
// so its weight is GX[J+1][I+1].
for(I=-1; I<=1; I++)
{
    for(J=-1; J<=1; J++)
    {
        sumX = sumX + (int)( (*(stOriginalImage_i.pData + nX + I +
            (nY + J)*stOriginalImage_i.nCols)) * GX[J+1][I+1]);
    }
}

Here is the output image of the X directional gradient calculation:

The output of the X directional gradient calculation
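To make the weighted sum concrete, here is a minimal, self-contained sketch that applies the X kernel to a single 3 × 3 patch. The names Grid3x3, kSobelX and applyKernel are illustrative and are not part of the article's source; both the patch and the kernel are indexed as [row][column].

```cpp
#include <array>

// A 3x3 integer kernel or image patch, indexed as [row][column].
using Grid3x3 = std::array<std::array<int, 3>, 3>;

// Sobel kernel for changes in the X direction (same values as GX).
constexpr Grid3x3 kSobelX = {{{-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1}}};

// Weighted sum of a 3x3 patch: each pixel is multiplied by the kernel
// entry at the same [row][column] position and the products are added.
int applyKernel(const Grid3x3& patch, const Grid3x3& kernel)
{
    int sum = 0;
    for (int row = 0; row < 3; ++row)
    {
        for (int col = 0; col < 3; ++col)
        {
            sum += patch[row][col] * kernel[row][col];
        }
    }
    return sum;
}
```

For a patch whose right-hand column is 255 and whose other pixels are 0 (a vertical edge), applyKernel returns 255 + 2 * 255 + 255 = 1020. For a uniform patch it returns 0, because the kernel weights sum to zero.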

Y direction gradient calculation

The code below finds the change in the Y direction. It computes the weighted sum of the same 3 × 3 block of pixels, this time using the Y kernel entries as weights.

C++

// Initialize the Y direction gradient kernel, indexed as GY[row][column].
GY[0][0] =  1; GY[0][1] =  2; GY[0][2] =  1;
GY[1][0] =  0; GY[1][1] =  0; GY[1][2] =  0;
GY[2][0] = -1; GY[2][1] = -2; GY[2][2] = -1;
// Loop to find the change in the Y direction.
// I is the column (X) offset and J is the row (Y) offset of the neighbor,
// so its weight is GY[J+1][I+1].
for(I=-1; I<=1; I++)
{
    for(J=-1; J<=1; J++)
    {
        sumY = sumY + (int)( (*(stOriginalImage_i.pData + nX + I + 
            (nY + J)*stOriginalImage_i.nCols)) * GY[J+1][I+1]);
    }
}

Here is the output image of the Y directional gradient calculation:

Output image of the Y directional gradient calculation

Final image calculation

Finally, the X and Y directional gradient values are combined into a single magnitude with the following equation:

$$\mathrm{SUM} = \sqrt{\mathrm{sumX}^2 + \mathrm{sumY}^2}$$

and the code:

C++

SUM =  sqrt(double(sumX * sumX) + double(sumY * sumY));

Here is the output:

Output image of the test bitmap
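The combination step can be sketched as a small helper. The name gradientMagnitude is hypothetical; the clamp to 255 mirrors what FindEdge does before it writes the output byte.

```cpp
#include <algorithm>
#include <cmath>

// Combine the two directional sums into one gradient magnitude and
// clamp it to the displayable 0..255 range.
int gradientMagnitude(long sumX, long sumY)
{
    const double magnitude = std::sqrt(static_cast<double>(sumX * sumX) +
                                       static_cast<double>(sumY * sumY));
    return static_cast<int>(std::min(magnitude, 255.0));
}
```

With directional sums of 3 and 4 the helper returns 5. With 1020 and 0, the raw magnitude of 1020 is clamped to 255.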

Edge Detection of RGB Image

The logic above finds the gradient of a single-component image. For an RGB image, I apply the algorithm to each component [R, G, B] separately and then combine the outputs.

C++

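// Extract each component to a separate buffer to find the gradient in
// each component. The bytes are stored in B, G, R order, so the buffer
// named originalImageR receives blue and originalImageB receives red. The
// planes are written back in the same order below, so the output is unaffected.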
BYTE* pbyTemp = pbyBMP_io;
for (int n= 0; n < nWidth * nHeight; n++)
{
    originalImageR.pData[n] = *pbyTemp++; // Blue
    originalImageG.pData[n] = *pbyTemp++; // Green
    originalImageB.pData[n] = *pbyTemp++; // Red
}

// Find Gradient of each component separately.
FindEdge( originalImageR, OutputRed );
FindEdge( originalImageG, OutputGreen );
FindEdge( originalImageB, OutputBlue );

// Combine the per-component gradient information into the output buffer.
pbyTemp = pbyBMP_io;
for (int n= 0; n < nWidth * nHeight; n++)
{
    *pbyTemp++ = OutputRed.pData[n];
    *pbyTemp++ = OutputGreen.pData[n];
    *pbyTemp++ = OutputBlue.pData[n];
}
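The split and recombine steps can also be sketched with std::vector. Planes, splitChannels and mergeChannels are hypothetical names, and the sketch assumes a tightly packed buffer with three bytes per pixel and no row padding.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Three single-channel planes taken from an interleaved 24-bit buffer.
struct Planes
{
    std::vector<std::uint8_t> first, second, third;
};

// Split an interleaved buffer (3 bytes per pixel) into three planes.
Planes splitChannels(const std::vector<std::uint8_t>& interleaved)
{
    const std::size_t pixelCount = interleaved.size() / 3;
    Planes planes;
    planes.first.reserve(pixelCount);
    planes.second.reserve(pixelCount);
    planes.third.reserve(pixelCount);
    for (std::size_t n = 0; n < pixelCount; ++n)
    {
        planes.first.push_back(interleaved[3 * n]);
        planes.second.push_back(interleaved[3 * n + 1]);
        planes.third.push_back(interleaved[3 * n + 2]);
    }
    return planes;
}

// Interleave three equally sized planes back into one buffer.
std::vector<std::uint8_t> mergeChannels(const Planes& planes)
{
    std::vector<std::uint8_t> interleaved;
    interleaved.reserve(planes.first.size() * 3);
    for (std::size_t n = 0; n < planes.first.size(); ++n)
    {
        interleaved.push_back(planes.first[n]);
        interleaved.push_back(planes.second[n]);
        interleaved.push_back(planes.third[n]);
    }
    return interleaved;
}
```

Splitting and then merging without any processing in between returns the original buffer, which is a quick way to check that the byte order is preserved.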

Here is the complete code for the gradient calculation of one component:

C++

void EdgeDetectCPU::FindEdge( ImageInfo_t& stOriginalImage_i, 
                              ImageInfo_t& stEdgeImage_o )
{
    int        nX, nY,I, J;
    long            sumX, sumY;
    int            nColors, SUM;
    int            GX[3][3];
    int            GY[3][3];
    // Allocate output buffer
    stEdgeImage_o.pData = 
      new BYTE[stOriginalImage_i.nCols * stOriginalImage_i.nRows];

    // X Directional Gradient matrix.
    GX[0][0] = -1; GX[0][1] = 0; GX[0][2] = 1;
    GX[1][0] = -2; GX[1][1] = 0; GX[1][2] = 2;
    GX[2][0] = -1; GX[2][1] = 0; GX[2][2] = 1;

    // Y Directional Gradient matrix.
    GY[0][0] =  1; GY[0][1] =  2; GY[0][2] =  1;
    GY[1][0] =  0; GY[1][1] =  0; GY[1][2] =  0;
    GY[2][0] = -1; GY[2][1] = -2; GY[2][2] = -1;

    // Iterate over each pixel in the image.
    for(nY=0; nY<=(stOriginalImage_i.nRows-1); nY++)
    {
        for(nX=0; nX<=(stOriginalImage_i.nCols-1); nX++)
        {
            sumX = 0;
            sumY = 0;

            SUM = 0;
            // Skip the top, bottom, left and right border pixels.
            if( !(nX==0 || nX==stOriginalImage_i.nCols-1 || 
                  nY==0 || nY==stOriginalImage_i.nRows-1))
            {
                // Loop to find the change in the X direction. I is the
                // column offset and J is the row offset, so the kernel
                // entry for a neighbor is GX[J+1][I+1].
                for(I=-1; I<=1; I++)
                {
                    for(J=-1; J<=1; J++)
                    {
                        sumX = sumX + (int)( (*(stOriginalImage_i.pData + nX + I + 
                            (nY + J)*stOriginalImage_i.nCols)) * GX[J+1][I+1]);
                    }
                }

                // Loop to find the change in the Y direction.
                for(I=-1; I<=1; I++)
                {
                    for(J=-1; J<=1; J++)
                    {
                        sumY = sumY + (int)( (*(stOriginalImage_i.pData + nX + I + 
                            (nY + J)*stOriginalImage_i.nCols)) * GY[J+1][I+1]);
                    }
                }
                SUM =  sqrt(double(sumX * sumX) + double(sumY * sumY));
            }

            if(SUM>255) SUM=255;
            if(SUM<0) SUM=0;
            *(stEdgeImage_o.pData + nX + nY * stOriginalImage_i.nCols) = 
                                    255 - (unsigned char)(SUM);
        }
    }
}
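For comparison, here is a compact sketch of the same per-component algorithm that uses std::vector, so the output buffer is released automatically. sobelEdges is an illustrative name, not a function from the article's source. It assumes a row-major, single-channel image of cols × rows bytes and indexes the kernels as [row][column].

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Sobel edge image of a single-channel, row-major image.
// Assumes image.size() == cols * rows. Border pixels are left at 255
// (no edge), which matches the output FindEdge writes for them.
std::vector<std::uint8_t> sobelEdges(const std::vector<std::uint8_t>& image,
                                     int cols, int rows)
{
    static const int GX[3][3] = {{-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1}};
    static const int GY[3][3] = {{1, 2, 1}, {0, 0, 0}, {-1, -2, -1}};

    std::vector<std::uint8_t> edges(image.size(), 255);
    for (int y = 1; y < rows - 1; ++y)
    {
        for (int x = 1; x < cols - 1; ++x)
        {
            long sumX = 0;
            long sumY = 0;
            for (int dy = -1; dy <= 1; ++dy)
            {
                for (int dx = -1; dx <= 1; ++dx)
                {
                    const int pixel = image[(y + dy) * cols + (x + dx)];
                    sumX += pixel * GX[dy + 1][dx + 1];
                    sumY += pixel * GY[dy + 1][dx + 1];
                }
            }
            const double magnitude =
                std::sqrt(static_cast<double>(sumX * sumX + sumY * sumY));
            const int clamped = static_cast<int>(std::min(magnitude, 255.0));
            // Invert so that strong edges appear dark on a white background.
            edges[y * cols + x] = static_cast<std::uint8_t>(255 - clamped);
        }
    }
    return edges;
}
```

On a 4 × 3 image whose left two columns are 0 and right two columns are 255, both interior pixels sit on the step edge and come out as 0 (black), while the border pixels stay 255.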

Cartoon Effect Implementation

With the edge information available, the cartoon effect is easy to create: combine the edge image with its corresponding input image. The following figure illustrates the creation of a cartoon effect with the edge image.

Cartoon effect created by combining the edge image and input image

The following code explains the Cartoon effect implementation with the edge image. EdgeDetectCPU::FindEdge is modified to create a cartoon effect or edge image based on the m_bCartoonEffect flag.

C++

if(SUM>255) SUM=255;
if(SUM<0) SUM=0;
int nOut = 0;
// Checking Cartoon Effect flag to create final image.
if( m_bCartoonEffect )
{
    // Make Cartoon effect by combining edge information and original image.
    nOut = (SUM * 0.5) + (*(stOriginalImage_i.pData + nX + 
            nY * stOriginalImage_i.nCols) * 0.5);
}
else
{
    // Creating displayable edge data.
    nOut = 255 - (unsigned char)(SUM);
}
*(stEdgeImage_o.pData + nX + nY * stOriginalImage_i.nCols) = nOut;
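The blend in the cartoon branch can be isolated into a one-line helper. cartoonBlend is a hypothetical name, and the sketch assumes the edge value has already been clamped to 0..255, as in the snippet above.

```cpp
#include <cstdint>

// Equal-weight blend of a clamped edge value and the original pixel
// value, mirroring the m_bCartoonEffect branch.
std::uint8_t cartoonBlend(int edgeMagnitude, std::uint8_t original)
{
    const double blended = edgeMagnitude * 0.5 + original * 0.5;
    return static_cast<std::uint8_t>(blended);  // the fraction is truncated
}
```

An edge value of 100 over an original pixel of 200 gives 150. An edge value of 255 over an original pixel of 0 gives 127, since 127.5 is truncated.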

Pixel Shader Implementation

This logic is simple to port to a pixel shader, because the gradient calculation is one function applied identically to every pixel of the image, and the result for one pixel does not depend on the result for any other pixel (a pixel shader is difficult to write when one pixel's output depends on another's). The GPU runs the shader once per pixel, so the two outer for loops that iterate over the pixels of the bitmap are not needed in the shader. The remaining task is to convert some data types to shader-compatible ones. For example, the GLSL version used here does not support two-dimensional arrays, but it provides matrix types such as mat3 that can be indexed in a similar way. Here is the declaration of the X and Y gradient matrices in the shader.

GLSL

// X directional search matrix.
mat3 GX = mat3( -1.0, 0.0, 1.0,
               -2.0, 0.0, 2.0,
               -1.0, 0.0, 1.0 );
// Y directional search matrix.
mat3 GY =  mat3( 1.0,  2.0,  1.0,
                0.0,  0.0,  0.0,
                -1.0, -2.0, -1.0 );

The texture coordinates received in the pixel shader take the place of the nX and nY loop variables of the C++ code.

GLSL

// Find the X, Y index of the incoming pixel from its texture coordinate.
float fXIndex = gl_TexCoord[0].s * fWidth;
float fYIndex = gl_TexCoord[0].t * fHeight;
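The relation between a texture coordinate and a pixel index can be sketched on the CPU. The helper names are hypothetical. The sketch assumes the whole image is mapped onto texture coordinates 0 to 1 and that each fragment's interpolated coordinate falls on its texel center, (index + 0.5) / size, in which case s * fWidth evaluates to index + 0.5 rather than to the integer index.

```cpp
#include <cmath>

// Texture coordinate of the center of texel `index` along an axis that
// is `size` texels long.
float texelCenterCoord(int index, float size)
{
    return (index + 0.5f) / size;
}

// Index of the texel that contains the texture coordinate `coord`.
int texelIndex(float coord, float size)
{
    return static_cast<int>(std::floor(coord * size));
}
```

For an 8-texel axis, texel 3 has its center at 3.5 / 8 = 0.4375, and mapping that coordinate back gives index 3.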

And the entire pixel shader for edge detection and Cartoon effect is:

GLSL

// Image texture.
uniform sampler2D ImageTexture;

// Width of Image.
uniform float fWidth;
// Height of Image.
uniform float fHeight;
// Indicating cartoon effect is enabled or not.
uniform float fCartoonEffect;

void main()
{
    // X directional search matrix.
    mat3 GX = mat3( -1.0, 0.0, 1.0,
                    -2.0, 0.0, 2.0,
                    -1.0, 0.0, 1.0 );
    // Y directional search matrix.
    mat3 GY =  mat3( 1.0,  2.0,  1.0,
                     0.0,  0.0,  0.0,
                    -1.0, -2.0, -1.0 );

    vec4  fSumX = vec4( 0.0,0.0,0.0,0.0 );
    vec4  fSumY = vec4( 0.0,0.0,0.0,0.0 );
    vec4 fTotalSum = vec4( 0.0,0.0,0.0,0.0 );

    // Find the X, Y index of the incoming pixel
    // from its texture coordinate.
    float fXIndex = gl_TexCoord[0].s * fWidth;
    float fYIndex = gl_TexCoord[0].t * fHeight;

    /* image boundaries Top, Bottom, Left, Right pixels*/
    if( ! ( fYIndex < 1.0 || fYIndex > fHeight - 1.0 || 
            fXIndex < 1.0 || fXIndex > fWidth - 1.0 ))
    {
        // X Directional Gradient calculation.
        for(float I=-1.0; I<=1.0; I = I + 1.0)
        {
            for(float J=-1.0; J<=1.0; J = J + 1.0)
            {
                float fTempX = ( fXIndex + I + 0.5 ) / fWidth ;
                float fTempY = ( fYIndex + J + 0.5 ) / fHeight ;
                vec4 fTempSumX = texture2D( ImageTexture, vec2( fTempX, fTempY ));
                // I is the X offset and J is the Y offset. GX[c] selects
                // column c of the mat3, and each line of the constructor
                // above fills one column, so the weight is GX[J+1][I+1].
                fSumX = fSumX + fTempSumX * GX[int(J+1.0)][int(I+1.0)];
            }
        }

        { // Y Directional Gradient calculation.
            for(float I=-1.0; I<=1.0; I = I + 1.0)
            {
                for(float J=-1.0; J<=1.0; J = J + 1.0)
                {
                    float fTempX = ( fXIndex + I + 0.5 ) / fWidth ;
                    float fTempY = ( fYIndex + J + 0.5 ) / fHeight ;
                    vec4 fTempSumY = texture2D( ImageTexture, vec2( fTempX, fTempY ));
                    // I is the X offset and J is the Y offset. GY[c] selects
                    // column c of the mat3, and each line of the constructor
                    // above fills one column, so the weight is GY[J+1][I+1].
                    fSumY = fSumY + fTempSumY * GY[int(J+1.0)][int(I+1.0)];
                }
            }
            // Combine X Directional and Y Directional Gradient.
            vec4 fTem = fSumX * fSumX + fSumY * fSumY;
            fTotalSum = sqrt( fTem );
        }
    }
    // Checking status of cartoon effect.
    if( 0.5 < fCartoonEffect )
    {
        // Create the cartoon effect by combining
        // edge information and original image data.
        fTotalSum = mix( fTotalSum, texture2D( ImageTexture, 
                         vec2( gl_TexCoord[0].s, gl_TexCoord[0].t)), 0.5);
    }
    else
    {
        // Creating displayable edge data.
        fTotalSum = vec4( 1.0,1.0,1.0,1.0) - fTotalSum;
    }
    
    gl_FragColor = ( fTotalSum );
}

The output of this shader is either the cartoon effect or the edge image. The shader is executed once for each pixel of the image and calculates the edge information for that pixel. In the final stage, the fCartoonEffect flag selects how the output color is produced. The application sets fCartoonEffect according to the state of the CartoonEffect check box.

GLSL

// Checking status of cartoon effect.
if( 0.5 < fCartoonEffect )
{
    // Create the cartoon effect by combining edge
    // information and original image data.
    fTotalSum = mix( fTotalSum, texture2D( ImageTexture, 
                     vec2( gl_TexCoord[0].s, gl_TexCoord[0].t)), 0.5);
}
else
{
    // Creating displayable edge data.
    fTotalSum = vec4( 1.0,1.0,1.0,1.0) - fTotalSum;
}

The pixel shader needs no further explanation, because the X directional and Y directional gradient (change) calculations were already discussed with the C++ logic above.

Save Functionality

The Save button of the Edge Detection application can be used to save the output of the Edge Detection algorithm to a BMP file. The Save functionality simply reads the pixels from the screen using the glReadPixels() API.

Here is the code to read the pixel information from the screen:

C++

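// Plain operator new reports failure by throwing rather than by returning 0,
// so the nothrow form (declared in <new>) makes the check below meaningful.
pbyData = new (std::nothrow) BYTE[stImageArea.bottom * stImageArea.right  * 3];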
if( 0 == pbyData )
{
    AfxMessageBox( L"Memory Allocation failed" );
    return;
}
glReadPixels( 0, 0, stImageArea.right, stImageArea.bottom, 
              GL_BGR_EXT, GL_UNSIGNED_BYTE, pbyData );
BMPLoader SaveBmp;
SaveBmp.SaveBMP( csFileName, stImageArea.right, stImageArea.bottom, pbyData );
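One detail worth checking when sizing the read-back buffer: OpenGL pads each row written by glReadPixels to the current GL_PACK_ALIGNMENT, which defaults to 4. Assuming that default is in effect (the snippet above does not show a glPixelStorei call), the size can be computed as in the following sketch. packedRowBytes and readbackBufferBytes are hypothetical helpers, not part of the article's source.

```cpp
#include <cstddef>

// Bytes occupied by one row written by glReadPixels, rounded up to the
// pack alignment (OpenGL's default GL_PACK_ALIGNMENT is 4).
std::size_t packedRowBytes(std::size_t width, std::size_t bytesPerPixel,
                           std::size_t alignment = 4)
{
    const std::size_t rawBytes = width * bytesPerPixel;
    return (rawBytes + alignment - 1) / alignment * alignment;
}

// Buffer size that holds `height` aligned rows.
std::size_t readbackBufferBytes(std::size_t width, std::size_t height,
                                std::size_t bytesPerPixel,
                                std::size_t alignment = 4)
{
    return packedRowBytes(width, bytesPerPixel, alignment) * height;
}
```

For a BGR read-back 5 pixels wide, each row occupies 16 bytes rather than 15, so two rows need 32 bytes, two more than 5 * 2 * 3. When the width is a multiple of 4 the two sizes agree. Setting GL_PACK_ALIGNMENT to 1 with glPixelStorei before the read would produce tightly packed rows instead.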

The file name suggested in the Save dialog is built as follows:

C++

CString csFileName;
// Build different names in CPU mode and GPU mode.
csFileName.Format( L"EdgeDetection_%s_%d.bmp", 
                 ( RUN_IN_CPU == m_nRunIn ) ? L"CPU" : L"GPU",
                 ( RUN_IN_CPU == m_nRunIn ) ? ++nCPUCount : ++nGPUCount );
CFileDialog SaveDlg( false, L"*.bmp", csFileName );

An example of different names generated in CPU and GPU mode: EdgeDetection_GPU_1.bmp, EdgeDetection_CPU_1.bmp.

Points of Interest

The 4-byte row padding of BMP data caused some difficulty for edge detection on the CPU. The CPU path needs special handling when the number of columns of the bitmap is not a multiple of 4. If edge detection runs without accounting for the padding [the BMP file contains arbitrary values in the padding], the output image looks strange, because the buffer passed to EdgeDetectCPU contains unwanted data at the end of each row. I followed these steps to avoid the issue:

Remove the padding at the right end of each row, which was added to make the row length a multiple of 4.
Find the edge image using EdgeDetect.EdgeDetect( m_nImageWidth, m_nImageHeight, pbyImage ).
Add the padding back, so that the row length is again a multiple of 4.
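The first and last steps can be sketched as follows. This is not the article's code: bmpStride, stripPadding and addPadding are hypothetical names, and the sketch assumes a 24-bit image whose rows are each padded with extra bytes up to a multiple of 4 bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Length in bytes of one 24-bit row, rounded up to a multiple of 4.
std::size_t bmpStride(std::size_t width)
{
    return (width * 3 + 3) / 4 * 4;
}

// Copy only the real pixel bytes of each row into a tightly packed buffer.
std::vector<std::uint8_t> stripPadding(const std::vector<std::uint8_t>& padded,
                                       std::size_t width, std::size_t height)
{
    const std::size_t stride = bmpStride(width);
    const std::size_t rowBytes = width * 3;
    std::vector<std::uint8_t> packed(rowBytes * height);
    for (std::size_t y = 0; y < height; ++y)
    {
        std::memcpy(packed.data() + y * rowBytes,
                    padded.data() + y * stride, rowBytes);
    }
    return packed;
}

// Inverse operation: re-insert zero-filled padding at the end of each row.
std::vector<std::uint8_t> addPadding(const std::vector<std::uint8_t>& packed,
                                     std::size_t width, std::size_t height)
{
    const std::size_t stride = bmpStride(width);
    const std::size_t rowBytes = width * 3;
    std::vector<std::uint8_t> padded(stride * height, 0);
    for (std::size_t y = 0; y < height; ++y)
    {
        std::memcpy(padded.data() + y * stride,
                    packed.data() + y * rowBytes, rowBytes);
    }
    return padded;
}
```

For a 3-pixel-wide image each row holds 9 pixel bytes followed by 3 padding bytes, so stripPadding turns a 24-byte, two-row buffer into 18 bytes and addPadding restores the 24-byte layout with zeroed padding.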

The code below is the per-component edge detection performed in the second step, on the buffer with the padding removed:

C++

// Extract each component[R,G,B] to separate buffer to find gradient in 
// each component.
BYTE* pbyTemp = pbyBMP_io;
for (int n= 0; n < nWidth * nHeight; n++)
{
    originalImageR.pData[n] = *pbyTemp++; // Blue
    originalImageG.pData[n] = *pbyTemp++; // Green
    originalImageB.pData[n] = *pbyTemp++; // Red
}

// Find Gradient of each component separately.
FindEdge( originalImageR, OutputRed );
FindEdge( originalImageG, OutputGreen );
FindEdge( originalImageB, OutputBlue );

// Combine the per-component gradient information into the output buffer.
pbyTemp = pbyBMP_io;
for (int n= 0; n < nWidth * nHeight; n++)
{
    *pbyTemp++ = OutputRed.pData[n];
    *pbyTemp++ = OutputGreen.pData[n];
    *pbyTemp++ = OutputBlue.pData[n];
}

BMPLoader::LoadBMP uses GDI+ to load an image file. It uses the Gdiplus::Bitmap class to read the image from a file, and m_pBitmap->LockBits() provides access to the image data.

C++

// Read an image file using GDI+.
Gdiplus::Bitmap* m_pBitmap = new Gdiplus::Bitmap(pFileName_i, true);

int nWidth = m_pBitmap->GetWidth();
int nHeight = m_pBitmap->GetHeight();
Gdiplus::Rect rect(0,0,nWidth,nHeight);
Gdiplus::BitmapData pBMPData;
// LockBits returns a status; stop here if the lock fails.
if( Gdiplus::Ok != m_pBitmap->LockBits( &rect, Gdiplus::ImageLockModeRead, 
                        PixelFormat24bppRGB, &pBMPData ) )
{
    return false;
}
// Allocate the output only after the lock has succeeded.
pbyData_o = new BYTE[nWidth * nHeight * 3];
nWidth_o = nWidth;
nHeight_o = nHeight;
BYTE* pSrc = (BYTE*)pBMPData.Scan0;
// Copy the rows in reverse order to avoid the top and bottom difference.
// nVert runs down to 0 inclusive so that every row is copied.
int nVert = nHeight - 1;
for( int nY = 0; nY < nHeight && nVert >= 0; nY++ )
{
    BYTE* pDest = pbyData_o + ( nWidth * nVert * 3 );
    memcpy( pDest, pSrc, 3 * nWidth);
    nVert--;
    // Rows of the locked bitmap are Stride bytes apart, which can be
    // more than 3 * nWidth when the rows are padded.
    pSrc += pBMPData.Stride;
}
m_pBitmap->UnlockBits( &pBMPData );
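The row-reversing copy can be tried without GDI+ by using a plain byte buffer as the source. In this sketch flipRows is a hypothetical helper, and srcStride plays the role of BitmapData::Stride, the byte distance between consecutive source rows, which may be larger than the number of pixel bytes in a row.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copy `height` rows of `rowBytes` bytes each from a source whose rows
// are `srcStride` bytes apart into a tightly packed destination,
// reversing the row order.
std::vector<std::uint8_t> flipRows(const std::uint8_t* src,
                                   std::ptrdiff_t srcStride,
                                   std::size_t rowBytes,
                                   std::size_t height)
{
    std::vector<std::uint8_t> dest(rowBytes * height);
    for (std::size_t y = 0; y < height; ++y)
    {
        const std::uint8_t* srcRow =
            src + static_cast<std::ptrdiff_t>(y) * srcStride;
        std::uint8_t* destRow = dest.data() + (height - 1 - y) * rowBytes;
        std::memcpy(destRow, srcRow, rowBytes);
    }
    return dest;
}
```

With two source rows of three pixel bytes plus one padding byte each (a stride of 4), the result holds the second row followed by the first, with the padding bytes dropped.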

History

19 July 2010 - Initial version.
28 July 2010 - Added the AppError class to display an error message if the shader creation fails.
14 August 2010 - Added Cartoon effect functionality.
20 August 2010 - Added JPEG, PNG image loading using GDI+.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)