Introduction
Image convolution plays an important role in computer graphics. Convolving an image alters it, producing a new image from the old one, and it also enables important operations such as edge detection, which has many widespread uses. Convolving an image is a simple process in which each pixel of the image is multiplied by a kernel, or mask, to produce a new pixel value. Convolution is commonly referred to as filtering.
Details
First, for a given pixel (x,y), we give a weight to each of the surrounding pixels. Think of each weight as a number telling how important that pixel is to us. The weight may be any integer or floating-point number, though I usually stick to floating point since a float will hold integer values as well. The kernel, or mask, that contains the filter may actually be any size (3x3, 5x5, 7x7); however, 3x3 is very common. Since the process is the same for each size, I will concentrate only on 3x3 kernels.
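As a concrete illustration (my own, not taken from the article's source), such a kernel is nothing more than a small two-dimensional array of weights. The identity kernel below gives all the weight to the center pixel, so convolving with it leaves the image unchanged:

// A 3x3 kernel is just a 3x3 array of weights. This identity kernel weights
// only the center pixel, so the convolved image is identical to the original.
float identity[3][3] = {
    { 0.0f, 0.0f, 0.0f },
    { 0.0f, 1.0f, 0.0f },
    { 0.0f, 0.0f, 0.0f }
};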
Second, the actual process of convolution involves taking each pixel around a given pixel (x,y) and multiplying each of that pixel's channels by the corresponding kernel weight. This means that for a 3x3 kernel, we would multiply the pixels like so:
(x-1,y-1) * kernel_value[row0][col0]
(x ,y-1) * kernel_value[row0][col1]
(x+1,y-1) * kernel_value[row0][col2]
(x-1,y ) * kernel_value[row1][col0]
(x ,y ) * kernel_value[row1][col1]
(x+1,y ) * kernel_value[row1][col2]
(x-1,y+1) * kernel_value[row2][col0]
(x ,y+1) * kernel_value[row2][col1]
(x+1,y+1) * kernel_value[row2][col2]
The process is repeated for each channel of the image: the red, green, and blue color channels (if working in RGB color space) must each be multiplied by the kernel values. Each kernel position corresponds to the pixel position it multiplies. Simply put, the kernel is allocated as kernel[rows][cols], which is kernel[3][3] in this case. The 3x3 area (5x5 or 7x7, if using a larger kernel) around the pixel (x,y) is then multiplied by the kernel to get a total sum. If we were working with a 100x100 image, allocated as image[100][100], and we wanted the value for pixel (10,10), the process for each channel would look like:
float fTotalSum =
    Pixel(10-1,10-1) * kernel_value[row0][col0] +
    Pixel(10  ,10-1) * kernel_value[row0][col1] +
    Pixel(10+1,10-1) * kernel_value[row0][col2] +
    Pixel(10-1,10  ) * kernel_value[row1][col0] +
    Pixel(10  ,10  ) * kernel_value[row1][col1] +
    Pixel(10+1,10  ) * kernel_value[row1][col2] +
    Pixel(10-1,10+1) * kernel_value[row2][col0] +
    Pixel(10  ,10+1) * kernel_value[row2][col1] +
    Pixel(10+1,10+1) * kernel_value[row2][col2];
Finally, each value is added into the total sum, which is then divided by the total weight of the kernel. The kernel's weight is found by adding up every value it contains. If that weight is zero or less, a weight of 1 is used instead to avoid a division by zero.
The actual code to convolve an image is:
for (int i = 0; i <= 2; i++)
{
    for (int j = 0; j <= 2; j++)
    {
        // Read the neighbor pixel; (2>>1) == 1, the offset to the kernel center
        COLORREF tmpPixel = pDC->GetPixel(sourcex + (i - (2>>1)),
                                          sourcey + (j - (2>>1)));
        float fKernel = kernel[i][j];
        // Accumulate the weighted red, green, and blue channels
        rSum += (GetRValue(tmpPixel) * fKernel);
        gSum += (GetGValue(tmpPixel) * fKernel);
        bSum += (GetBValue(tmpPixel) * fKernel);
        // Accumulate the kernel weight for normalization
        kSum += fKernel;
    }
}
if (kSum <= 0)
    kSum = 1;   // avoid division by zero when the weights sum to zero or less
rSum /= kSum;
gSum /= kSum;
bSum /= kSum;
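For readers who are not using MFC, the same idea applies to any pixel buffer. The following is a minimal single-channel sketch of the same 3x3 convolution (my own illustration, not part of the article's source). It assumes an 8-bit grayscale image stored row by row in a buffer of width*height bytes, and the caller is expected to skip the one-pixel border:

// Minimal single-channel sketch (assumed layout: 8-bit pixels, row-major).
unsigned char ConvolvePixel(const unsigned char* img, int width,
                            int x, int y, const float kernel[3][3])
{
    float sum = 0.0f, kSum = 0.0f;
    for (int i = -1; i <= 1; i++)            // kernel row    -> y offset
    {
        for (int j = -1; j <= 1; j++)        // kernel column -> x offset
        {
            float k = kernel[i + 1][j + 1];
            sum  += img[(y + i) * width + (x + j)] * k;
            kSum += k;
        }
    }
    if (kSum <= 0.0f)                        // avoid dividing by zero
        kSum = 1.0f;
    sum /= kSum;
    if (sum < 0.0f)   sum = 0.0f;            // clamp to the 0..255 range
    if (sum > 255.0f) sum = 255.0f;
    return (unsigned char)sum;
}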
The included source code performs some common image convolutions. It also includes a Convolve Image menu option that lets users enter their own kernel. Common 3x3 kernels include:
float gaussianBlur[3][3]     = {{0.045f, 0.122f, 0.045f}, {0.122f, 0.332f, 0.122f}, {0.045f, 0.122f, 0.045f}};
float gaussianBlur2[3][3]    = {{1, 2, 1}, {2, 4, 2}, {1, 2, 1}};
float gaussianBlur3[3][3]    = {{0, 1, 0}, {1, 1, 1}, {0, 1, 0}};
float unsharpen[3][3]        = {{-1, -1, -1}, {-1, 9, -1}, {-1, -1, -1}};
float sharpness[3][3]        = {{0, -1, 0}, {-1, 5, -1}, {0, -1, 0}};
float sharpen[3][3]          = {{-1, -1, -1}, {-1, 16, -1}, {-1, -1, -1}};
float edgeDetect[3][3]       = {{-0.125f, -0.125f, -0.125f}, {-0.125f, 1, -0.125f}, {-0.125f, -0.125f, -0.125f}};
float edgeDetect2[3][3]      = {{-1, -1, -1}, {-1, 8, -1}, {-1, -1, -1}};
float edgeDetect3[3][3]      = {{-5, 0, 0}, {0, 0, 0}, {0, 0, 5}};
float edgeDetect4[3][3]      = {{-1, -1, -1}, {0, 0, 0}, {1, 1, 1}};
float edgeDetect5[3][3]      = {{-1, -1, -1}, {2, 2, 2}, {-1, -1, -1}};
float edgeDetect6[3][3]      = {{-5, -5, -5}, {-5, 39, -5}, {-5, -5, -5}};
float sobelHorizontal[3][3]  = {{1, 2, 1}, {0, 0, 0}, {-1, -2, -1}};
float sobelVertical[3][3]    = {{1, 0, -1}, {2, 0, -2}, {1, 0, -1}};
float previtHorizontal[3][3] = {{1, 1, 1}, {0, 0, 0}, {-1, -1, -1}};
float previtVertical[3][3]   = {{1, 0, -1}, {1, 0, -1}, {1, 0, -1}};
float boxBlur[3][3]          = {{0.111f, 0.111f, 0.111f}, {0.111f, 0.111f, 0.111f}, {0.111f, 0.111f, 0.111f}};
float triangleBlur[3][3]     = {{0.0625f, 0.125f, 0.0625f}, {0.125f, 0.25f, 0.125f}, {0.0625f, 0.125f, 0.0625f}};
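Note that the code above divides by kSum, the sum of the kernel weights, so the integer-valued kernels behave as weighted averages: gaussianBlur2, for example, sums to 16, so its entries effectively act as weights of 1/16, 2/16, and 4/16. The Sobel, Prewitt, and edge-detect kernels all sum to zero or less, so they fall through to the divide-by-one case instead.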
Last but not least is the ability to show the convolved image as a grayscale result. To display a filtered image in grayscale, we just add a couple of lines to the bottom of the Convolve function:
if (bGrayscale)
{
    int grayscale = 0.299*rSum + 0.587*gSum + 0.114*bSum;
    rSum = grayscale;
    gSum = grayscale;
    bSum = grayscale;
}
clrReturn = RGB(rSum, gSum, bSum);
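The 0.299, 0.587, and 0.114 factors are the standard luma weights (ITU-R BT.601); green is weighted most heavily because the eye is most sensitive to it.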
This means that the entire Convolve function now looks like:
COLORREF CImageConvolutionView::Convolve(CDC* pDC, int sourcex,
                                         int sourcey, float kernel[3][3],
                                         int nBias, BOOL bGrayscale)
{
    float rSum = 0, gSum = 0, bSum = 0, kSum = 0;
    COLORREF clrReturn = RGB(0,0,0);

    // Accumulate the weighted sum of the 3x3 neighborhood for each channel
    for (int i = 0; i <= 2; i++)
    {
        for (int j = 0; j <= 2; j++)
        {
            // (2>>1) == 1, the offset from the kernel corner to its center
            COLORREF tmpPixel = pDC->GetPixel(sourcex + (i - (2>>1)),
                                              sourcey + (j - (2>>1)));
            float fKernel = kernel[i][j];
            rSum += (GetRValue(tmpPixel) * fKernel);
            gSum += (GetGValue(tmpPixel) * fKernel);
            bSum += (GetBValue(tmpPixel) * fKernel);
            kSum += fKernel;
        }
    }

    // Normalize by the kernel weight, guarding against division by zero
    if (kSum <= 0)
        kSum = 1;
    rSum /= kSum;
    gSum /= kSum;
    bSum /= kSum;

    // Apply the bias, then clamp each channel to the valid 0..255 range
    rSum += nBias;
    gSum += nBias;
    bSum += nBias;
    if (rSum > 255)     rSum = 255;
    else if (rSum < 0)  rSum = 0;
    if (gSum > 255)     gSum = 255;
    else if (gSum < 0)  gSum = 0;
    if (bSum > 255)     bSum = 255;
    else if (bSum < 0)  bSum = 0;

    // Optionally convert the result to grayscale using the standard luma weights
    if (bGrayscale)
    {
        int grayscale = 0.299*rSum + 0.587*gSum + 0.114*bSum;
        rSum = grayscale;
        gSum = grayscale;
        bSum = grayscale;
    }

    clrReturn = RGB(rSum, gSum, bSum);
    return clrReturn;
}
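How the function is driven depends on the view code, but a typical calling loop might look something like the sketch below (the names pSrcDC, pDstDC, nWidth, and nHeight are assumptions for illustration, not taken from the project). The one-pixel border is skipped because the 3x3 kernel reads each pixel's neighbors:

// Hypothetical driver loop: convolve every interior pixel of the source image
// with one of the kernels listed earlier and write the result to a destination DC.
for (int y = 1; y < nHeight - 1; y++)
{
    for (int x = 1; x < nWidth - 1; x++)
    {
        COLORREF clr = Convolve(pSrcDC, x, y, sharpness, 0, FALSE);
        pDstDC->SetPixel(x, y, clr);
    }
}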
Last but not least, I did a little tweaking to get the program to load a default image from a resource (IDB_BITMAP1), and added the ability to convolve this default image. The program will still load images from a file; the only difference is that it now shows a default image at startup.
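The exact code depends on the project, but loading a bitmap resource into a memory DC so it can be drawn and convolved generally looks something like the following sketch (variable names are mine, not from the project's source):

// Hypothetical sketch: select the default bitmap resource into a memory DC.
CBitmap bmpDefault;
bmpDefault.LoadBitmap(IDB_BITMAP1);         // the resource mentioned above
CDC memDC;
memDC.CreateCompatibleDC(pDC);              // pDC: the view's device context
CBitmap* pOldBmp = memDC.SelectObject(&bmpDefault);
// ... BitBlt the image to the screen and/or convolve from memDC ...
memDC.SelectObject(pOldBmp);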
Please note that this article is by no means an example of fast pixel processing; it is merely meant to show how convolution can be done on images. If you would like a more advanced image processor, feel free to email me with the subject "WANT CODE:ImageEdit Please". That is an unreleased image processor I wrote that contains much more functionality and uses the CxImage library as its basis for reading and saving images, though parts of it are not implemented yet due to lack of time.