Introduction
Basic image processing functions typically manipulate pixels either with filters or with histogram-based functions that modify the pixel distribution. Some of these enhance the image’s display in various ways or remove noise. This article describes C++ code that was developed for an MFC multi-document interface (MDI) image processing application.
Background
The application, called Imagr (spelled without an “e” for historical reasons), is built as an MFC multi-document interface application and uses the Microsoft ATL CImage class, which already has built-in support for opening and saving images in the most popular formats (bmp, jpg, tif, gif, and png). Imagr can also open pcx images (thanks to Roger Evans for the original code), certain ASCII text images, and certain “raw” binary images. The member function CImage::Load reads the image file into a bottom-up device-independent bitmap (DIB), with the origin at the lower-left corner of the image. This is inconvenient for accessing pixels, so the image is reformatted into a top-down (origin at top-left corner), 32-bit DIB (since most display adapters are now true color). See the Convert32Bit() code.
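The row reordering behind a bottom-up to top-down conversion can be illustrated without CImage. The following is a minimal sketch, not the article's Convert32Bit(): the function name flipRows and the tightly packed std::vector buffer are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not Imagr's Convert32Bit): reverse the row order of a
// tightly packed 32-bit image so that a bottom-up buffer becomes top-down.
void flipRows(std::vector<std::uint32_t>& pixels, std::size_t width, std::size_t height)
{
    assert(pixels.size() == width * height);
    // Swap the first row with the last, the second with the second-to-last, ...
    for (std::size_t top = 0, bottom = height; top + 1 < bottom; ++top) {
        --bottom;
        std::swap_ranges(pixels.begin() + top * width,
                         pixels.begin() + (top + 1) * width,
                         pixels.begin() + bottom * width);
    }
}
```

After the flip, index 0 of the buffer is the top-left pixel, so a single pointer walk visits the image in reading order.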
Once the image is in top-down mode, pixel manipulation is much simpler. CImage::GetBits() returns a pointer to the first (top-left) pixel, and subsequent pixels are accessed by simply incrementing the pointer in a single for-loop. Images are categorized into one of three pixel types: grey scale, color, and integer (sometimes called raw; these were derived from a special data acquisition process). Grey scale images are processed faster than color images, which have three color channels. Raw integer images can have more bit depth for processing, but must be reduced to 8 bits (values 0…255) to be displayed.
An important point is that Microsoft’s CImage class stores the color bytes in a different order than the Windows RGB macros expect. Instead of being RGB, it’s actually BGR (blue, green, red), so if you use the typical GetRValue() macro, for example, to get the red bits, it will return the blue bits instead. Likewise, if you store the red, green, and blue values with the RGB(red, green, blue) macro, the red and blue will be swapped when displayed. The green bits are unaffected; only the blue and red are flipped. So in the code (see ImagrDoc.h), the following definitions are used to help keep the colors straight.
#define RED(rgb) (LOBYTE((rgb) >> 16))
#define GRN(rgb) (LOBYTE(((WORD)(rgb)) >> 8))
#define BLU(rgb) (LOBYTE(rgb))
#define BGR(b,g,r) RGB(b,g,r)
So, for example, RED(p) will return the red bits as expected from the passed pixel p, and BGR(b, g, r) stores the blue, green, and red bytes into the 32-bit pixel for correct display.
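The byte layout implied by these macros can be checked with a small standalone sketch. The function names below are stand-ins chosen for illustration so that the example does not need <windows.h>; the layout (blue in the low byte, red in bits 16 to 23) follows from the definitions above.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for the article's macros, written as functions. Layout: blue in
// bits 0-7, green in bits 8-15, red in bits 16-23 of each 32-bit pixel.
constexpr std::uint32_t packBGR(std::uint8_t b, std::uint8_t g, std::uint8_t r)
{
    return static_cast<std::uint32_t>(b)
         | (static_cast<std::uint32_t>(g) << 8)
         | (static_cast<std::uint32_t>(r) << 16);
}
constexpr std::uint8_t redOf(std::uint32_t p)   { return static_cast<std::uint8_t>(p >> 16); }
constexpr std::uint8_t greenOf(std::uint32_t p) { return static_cast<std::uint8_t>(p >> 8); }
constexpr std::uint8_t blueOf(std::uint32_t p)  { return static_cast<std::uint8_t>(p); }

// What the Win32 GetRValue() macro effectively does: it reads the low byte,
// which holds blue in this layout.
constexpr std::uint8_t lowByte(std::uint32_t p) { return static_cast<std::uint8_t>(p); }

static_assert(redOf(packBGR(10, 20, 30)) == 30, "red comes from bits 16-23");
static_assert(lowByte(packBGR(10, 20, 30)) == 10, "the low byte is blue, not red");
```

The second static_assert shows the pitfall described above: reading the low byte of a pixel packed this way yields the blue value.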
MDI Advantages
Designing Imagr as an MDI application makes it possible to compare images or do two-image operations (as explained below). It is also important to be able to make a copy of an image so that, for example, you can save the state of a filter operation before trying other filters, or do a two-image operation with a copy or a processed copy of the image. The MDI framework also handles some important background chores. With a call to SetModifiedFlag(), the image’s modified state is maintained, so that if the image is closed, the framework automatically prompts the user to save it first. The MDI framework also enables files to be dragged and dropped into the application, and tracks which image becomes the active one on a mouse click.
Using the Code
The application’s code is included as a complete project for use with Microsoft Visual Studio 2008 or 2010.
Histogram Functions
In the context of the current application, a histogram is a graphical representation of the distribution of pixel values in an image. A count of the number of pixels having each value is kept in a 256-element array (or 256 x 3 for color images) and graphed to a dialog window. (Note: In the Imagr code, you will see that pixels < 0 and > 255 are also shown in the histogram in order to handle the raw type of full integer imagery.) A grey scale image and its associated histogram are shown below. As can be seen in the histogram, the majority of pixels in this image lie in the region 32…112.
Fig. 1 - Grey scale image with histogram
The color image histogram below shows the red, green, and blue channel distributions.
Fig. 2 - Full color image histogram
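Counting the pixel values for one channel can be sketched as follows. This is a minimal illustration rather than Imagr's histogram code: the names Histogram and channelHistogram are hypothetical, and the bins for values below 0 and above 255 that Imagr keeps for raw images are left out.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Histogram = std::array<unsigned long, 256>;

// Count how often each 8-bit value occurs in one channel of a 32-bit pixel
// buffer. `shift` selects the channel: 0 = blue, 8 = green, 16 = red in the
// BGR layout described earlier.
Histogram channelHistogram(const std::vector<std::uint32_t>& pixels, unsigned shift)
{
    assert(shift == 0 || shift == 8 || shift == 16);
    Histogram counts{};                       // all 256 bins start at zero
    for (std::uint32_t p : pixels)
        ++counts[(p >> shift) & 0xFFu];
    return counts;
}
```

For a color image the function would be called three times, once per channel; for a grey scale image any one channel is enough because all three bytes are equal.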
With a normalization function (also known as contrast stretching), the histogram curve can be stretched or compressed to a desired range. Usually, this is done to expand the pixel range to the full range of intensities (0 to 255), giving a more evenly contrasted image. The figure below shows an example of a low contrast image before and after normalizing, with the associated histograms. Normalization effectively rescales the histogram along the intensity axis without appreciably altering the shape of the curve.
Fig. 3 - Image with histograms before (top) and after normalizing (bottom)
The normalize algorithm, applied to every pixel in the image, is:
pixel = (pixel - min)*(nmax - nmin) / (max - min) + nmin
where max and min are the starting maximum and minimum pixel values in the image, and nmax and nmin are the new maximum and new minimum pixel values chosen to normalize to.
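As a worked example of the formula, assume purely for illustration an image whose values span 32 to 112 and that is stretched to 0 to 255. The helper name normalizeValue is hypothetical, and returning nmin for a flat image is an assumption of this sketch.

```cpp
#include <cassert>

// Map `pixel` from the range [min, max] to [nmin, nmax]. Adding 0.5 before
// truncating rounds to the nearest integer for non-negative results.
// Hypothetical helper for illustration; a flat image (max == min) is mapped
// to nmin here, which is an assumption rather than Imagr's behaviour.
int normalizeValue(int pixel, int min, int max, int nmin, int nmax)
{
    assert(min <= pixel && pixel <= max);
    if (max == min)
        return nmin;
    const float factor = static_cast<float>(nmax - nmin) / static_cast<float>(max - min);
    return static_cast<int>(static_cast<float>(pixel - min) * factor
                            + static_cast<float>(nmin) + 0.5f);
}
```

With these inputs the scale factor is 255 / 80 = 3.1875, so normalizeValue(72, 32, 112, 0, 255) computes 40 * 3.1875 = 127.5 and rounds it to 128, while the end points 32 and 112 map to 0 and 255.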
The normalization code is shown below for handling the three types of image pixels (grey scale, color, and “raw” integers). The RED() macro is called in the grey scale code to isolate a single byte of the integer pixel, which is not really red in this case (for clarity, I should probably have made a GREY() macro which does the same thing). This function is called from a dual slider class (thanks to includeh10, CodeProject article “A Slider with Two Buttons”, 9 Aug 2006). The normalize function is tied to the dual sliders so the image can be normalized in near real-time as you adjust the sliders (depending on the image size and system speed). The image is copied to an “undo” buffer before each slider adjustment so that the function operates on the same starting image each time the slider is changed. The sliders control the nmin and nmax variables passed to the function.
void CImagrDoc::Nrmlz(int nmin, int nmax)
{
    int d;
    float factor;
    byte r = 0, g = 0, b = 0;

    OnDo();    // Save the current image state for undo

    int *min = &(m_image.minmax.min);
    int *max = &(m_image.minmax.max);

    // Scale factor from the old pixel range to the new one
    if (*max - *min == 0)
        factor = 32767.;    // Flat image: avoid dividing by zero
    else
        factor = (float)((float)(nmax - nmin) / (*max - *min));

    int *p = (int *) m_image.GetBits();    // Pointer to the first pixel
    unsigned long n = GetImageSize();

    switch (m_image.ptype) {
        case GREY:    // Grey scale: all three bytes are equal
            for ( ; n > 0; n--, p++) {
                r = RED(*p);
                d = (int)((float)(r - *min) * factor + nmin + 0.5);
                r = (byte)THRESH(d);
                *p = BGR(r, r, r);
            }
            break;
        case cRGB:    // Color: scale each channel separately
            for ( ; n > 0; n--, p++) {
                r = RED(*p);
                d = (int)((float)(r - *min) * factor + nmin + 0.5);
                r = (byte)THRESH(d);
                g = GRN(*p);
                d = (int)((float)(g - *min) * factor + nmin + 0.5);
                g = (byte)THRESH(d);
                b = BLU(*p);
                d = (int)((float)(b - *min) * factor + nmin + 0.5);
                b = (byte)THRESH(d);
                *p = BGR(b, g, r);
            }
            break;
        default:      // Raw integer pixels
            for ( ; n > 0; n--, p++) {
                r = (int)((float)(*p - *min) * factor + nmin + 0.5);
                *p = BGR(r, r, r);
            }
            m_image.ptype = GREY;    // The image is now grey scale
            break;
    }

    *min = nmin;
    *max = nmax;
    UpdateAllViews(NULL, ID_SBR_IMAGEMINMAX);
}
Note that this normalization is restricted to operating on the minimum to maximum pixel range. Sometimes, one may want to expand or contract a narrow range of the histogram. Imagr includes this functionality in the NrmlzRange() function with the following algorithm:
pixel = (pixel - rmin)*255 / (rmax - rmin)
where rmin to rmax is the chosen range of pixel values. A dual slider is also used to adjust the rmin and rmax variables. This allows one to select a range of the histogram to be normalized to the range of 0 to 255. This process may push pixel values below 0 or above 255, which is outside the displayable range. Such pixels are thresholded (clamped) to the range so they can be displayed: any pixels < 0 are set equal to 0, and any pixels > 255 are set equal to 255. The image and histogram below demonstrate this. The image has higher contrast over its full range at the cost of the thresholding done at the outer ends of the histogram.
Fig. 4 - Range normalization
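The range formula and the thresholding step can be sketched for a single value as follows. This is a hypothetical stand-in for NrmlzRange(), whose source is not shown in this article; the function name and the use of integer division are assumptions.

```cpp
#include <algorithm>
#include <cassert>

// Stretch the chosen range [rmin, rmax] to [0, 255]; values that land outside
// the displayable range are clamped to 0 or 255. Integer division truncates
// toward zero, an assumption of this sketch.
int normalizeRange(int pixel, int rmin, int rmax)
{
    assert(rmin < rmax);
    const int scaled = (pixel - rmin) * 255 / (rmax - rmin);
    return std::max(0, std::min(255, scaled));
}
```

With rmin = 64 and rmax = 192, a pixel of 128 maps to 64 * 255 / 128 = 127, while a pixel of 32 would map to a negative value and is clamped to 0, and a pixel of 250 would exceed 255 and is clamped to 255.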
Another popular histogram function “equalizes” or flattens the pixel distribution, giving more equal contrast across the entire range of pixels, as shown below (thanks to Frank Hoogterp and Steven Caito for the original Fortran code). As can be seen, equalizing may make the image “blotchy” and effectively lose resolution, but it may be useful for some images. Equalize uses the histogram array to redistribute the pixels (see the Eqliz() code for details).
Fig. 5 - Equalization
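For readers who want a concrete picture of how a histogram can drive equalization, here is the common textbook approach based on the cumulative distribution. It is not a transcription of Imagr's Eqliz(), which may differ in detail; the function name equalize and the 8-bit grey buffer are assumptions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Textbook histogram equalization for 8-bit grey values: each value is
// replaced by its cumulative count, scaled to the range 0...255.
void equalize(std::vector<std::uint8_t>& grey)
{
    if (grey.empty())
        return;

    // Step 1: build the histogram.
    std::array<std::size_t, 256> counts{};
    for (std::uint8_t v : grey)
        ++counts[v];

    // Step 2: turn the running (cumulative) count into a lookup table.
    std::array<std::uint8_t, 256> lut{};
    std::size_t cumulative = 0;
    for (std::size_t v = 0; v < 256; ++v) {
        cumulative += counts[v];
        lut[v] = static_cast<std::uint8_t>((cumulative * 255) / grey.size());
    }
    assert(cumulative == grey.size());

    // Step 3: remap every pixel through the table.
    for (std::uint8_t& v : grey)
        v = lut[v];
}
```

Because many input values can share one output value while other output values stay unused, the result can look “blotchy”, as noted above.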
Imagr also has a threshold menu function tied to the dual slider, so it can be used to limit the range of pixels in the histogram and observe the results in near real-time. This can be useful for selectively cutting off specified intensities in an image.
Image Processing Filters
There are many filters that can be applied for different image processing functions. The convolve function (Convl.cpp) is used to apply 3 x 3 kernel (matrix) filters to the image. Imagr currently has menu options for many types of filters, including low pass, high pass, Sobel, Prewitt, Frei-Chen, various edge enhancing and Laplacian filters, emboss filters, and a kernel input dialog so the user can experiment with their own 3 x 3 kernel. The details of what these filters do are well documented elsewhere, so they are not discussed here. Some of these filters were applied to the image from Fig. 1, as shown below.
Fig. 6 - High pass
Fig. 7 - Sobel
Fig. 8 - Edge enhance
Fig. 9 - Emboss
Fig. 10 - Laplacian sharp
The convolve equation looks like this:
P5 = ∑(i=1…9) (Ki * Pi) / ∑(i=1…9) Ki
where P5 = center of a 3 x 3 pixel region, Pi = each of the nine pixels, and Ki = each of the nine kernel values. The center pixel is replaced by the sum of the products of each pixel in its 3 x 3 neighborhood (including itself) and the corresponding kernel value, divided by the sum of the kernel values. When the kernel values sum to zero, the code divides by 1 instead. This operation is applied to every pixel in the image.
The high pass kernel looks like this:
-1.0, -1.0, -1.0
-1.0, 9.0, -1.0
-1.0, -1.0, -1.0
This takes the center pixel weighted by 9 and subtracts each of the eight surrounding pixels. See ImagrDoc.h for the other kernels used in Imagr.
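To see what the high pass kernel does to a single 3 x 3 neighborhood, the convolve equation can be evaluated directly. The sketch below is illustrative only: the names kHighPass and convolveAt are hypothetical, and the result is clamped to 0…255 in the same spirit as the thresholding described earlier.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// The high pass kernel from the article, in row-major order.
const std::array<float, 9> kHighPass = {
    -1.0f, -1.0f, -1.0f,
    -1.0f,  9.0f, -1.0f,
    -1.0f, -1.0f, -1.0f
};

// Apply a 3 x 3 kernel to one 3 x 3 neighbourhood (both row-major) and return
// the new centre value, rounded and clamped to 0...255. A kernel sum of zero
// is treated as a divisor of 1.
int convolveAt(const std::array<float, 9>& kernel, const std::array<int, 9>& pixels)
{
    float weighted = 0.0f;
    float kernelSum = 0.0f;
    for (std::size_t i = 0; i < 9; ++i) {
        assert(pixels[i] >= 0 && pixels[i] <= 255);
        weighted  += kernel[i] * static_cast<float>(pixels[i]);
        kernelSum += kernel[i];
    }
    if (kernelSum == 0.0f)
        kernelSum = 1.0f;
    const long rounded = std::lround(weighted / kernelSum);
    return static_cast<int>(std::min(255L, std::max(0L, rounded)));
}
```

On a flat region where all nine pixels are 100, the result is 9 * 100 - 8 * 100 = 100, so flat areas are unchanged. If only the center is 101, the result is 909 - 800 = 109: a difference of 1 from the surroundings becomes a difference of 9, which is why the filter sharpens detail.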
A portion of the convolution filter code is shown below, but with just the grey scale section for brevity (see Convl.cpp for full code). The 3 x 3 kernels are passed to the function.
void CImagrDoc::Convl(float k1, float k2, float k3,
                      float k4, float k5, float k6,
                      float k7, float k8, float k9)
{
    int *p;
    unsigned long i, j, nx, ny;
    int *m1, *m2, *m3;             // Row buffers
    int *old_r1, *r1, *r2, *r3;    // Pointers to the row buffers
    float s, fsum;
    int t;
    byte r, g, b;

    nx = m_image.GetWidth();
    ny = m_image.GetHeight();
    p = (int *) m_image.GetBits();

    // Allocate three row buffers, each padded by one pixel at both ends
    if (!(m1 = (int *) malloc((nx+2) * sizeof(*m1)))) {
        fMessageBox("Error - " __FUNCTION__, MB_ICONERROR, "malloc() error m1");
        return;
    }
    if (!(m2 = (int *) malloc((nx+2) * sizeof(*m2)))) {
        fMessageBox("Error - " __FUNCTION__, MB_ICONERROR, "malloc() error m2");
        free(m1);
        return;
    }
    if (!(m3 = (int *) malloc((nx+2) * sizeof(*m3)))) {
        fMessageBox("Error - " __FUNCTION__, MB_ICONERROR, "malloc() error m3");
        free(m1);
        free(m2);
        return;
    }
    r1 = m1;
    r2 = m2;
    r3 = m3;

    // Load the first image row into r1, replicate its end pixels,
    // and copy the padded row into r2
    memcpy_s(&r1[1], nx * sizeof(int), p, nx * sizeof(int));
    r1[0] = r1[1];
    r1[nx+1] = r1[nx];
    memcpy_s(r2, (nx+2) * sizeof(int), r1, (nx+2) * sizeof(int));

    // Reciprocal of the kernel sum (1 if the sum is zero)
    fsum = k1 + k2 + k3 + k4 + k5 + k6 + k7 + k8 + k9;
    if (fsum == 0)
        fsum = 1;
    else
        fsum = 1/fsum;

    OnDo();
    BeginWaitCursor();
    switch (m_image.ptype) {
        case GREY:
            for (j = 1; j <= ny; j++, p += nx) {
                if (j == ny) {
                    r3 = r2;    // Last row: reuse it as the row below
                }
                else {
                    // Load the next image row into r3 and replicate its end pixels
                    memcpy_s(&r3[1], nx * sizeof(int), p + nx, nx * sizeof(int));
                    r3[0] = r3[1];
                    r3[nx+1] = r3[nx];
                }
                for (i = 0; i < nx; i++) {
                    s = k1 * (float)RED(r1[i])
                      + k2 * (float)RED(r1[i+1])
                      + k3 * (float)RED(r1[i+2])
                      + k4 * (float)RED(r2[i])
                      + k5 * (float)RED(r2[i+1])
                      + k6 * (float)RED(r2[i+2])
                      + k7 * (float)RED(r3[i])
                      + k8 * (float)RED(r3[i+1])
                      + k9 * (float)RED(r3[i+2]);
                    t = NINT(s * fsum);
                    r = (byte)THRESH(t);
                    p[i] = RGB(r, r, r);
                }
                // Rotate the row pointers up by one row
                old_r1 = r1;
                r1 = r2;
                r2 = r3;
                r3 = old_r1;
            }
            break;
        // Color and raw integer sections omitted (see Convl.cpp)
    }
    EndWaitCursor();

    free(m1);
    free(m2);
    free(m3);

    ChkData();
    SetModifiedFlag(true);
    UpdateAllViews(NULL);
}
The convolution function code (thanks again to Frank Hoogterp and Steven Caito for the original Fortran code) accesses three rows of the image at a time by storing them in three arrays. Pointers to the row arrays are maintained for ease in shifting rows up as each new row is loaded in from the image. When moving on to a new row of the image, the pointers r1 and r2 are set to point to the arrays previously referenced by r2 and r3, respectively. So only row three needs to be updated with pixels from the image, and the array previously referenced by r1 is reused as the new r3.
The edge rows and columns are handled by doubly weighting them. For example, when operating on a pixel in the 1st row, the 1st row is copied into the 2nd row array in order to still have 3 rows (the 3rd row array contains the 2nd row) for processing. So it’s as if the 1st row was copied above the actual 1st row. Vertical edges (columns) are handled similarly by replicating an additional pixel at the beginning and end of the row.
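The edge handling can also be expressed by clamping coordinates to the image bounds, which reads the same pixel values as replicating the border rows and columns. The sketch below takes that route for brevity. It is an illustration under stated assumptions, not the article's Convl(): the function name convolve3x3 is hypothetical, it works on an 8-bit grey buffer, and it writes to a separate output image instead of using three row buffers.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Convolve a grey image (row-major, width * height values) with a 3 x 3
// kernel. Coordinates outside the image are clamped to the nearest edge,
// which reads the same values as replicated border rows and columns.
std::vector<std::uint8_t> convolve3x3(const std::vector<std::uint8_t>& src,
                                      int width, int height,
                                      const std::array<float, 9>& k)
{
    assert(width > 0 && height > 0);
    assert(src.size() == static_cast<std::size_t>(width) * static_cast<std::size_t>(height));

    float sum = 0.0f;
    for (float v : k)
        sum += v;
    const float scale = (sum == 0.0f) ? 1.0f : 1.0f / sum;   // zero sum: divide by 1

    std::vector<std::uint8_t> dst(src.size());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float s = 0.0f;
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    const int sx = std::min(std::max(x + dx, 0), width - 1);
                    const int sy = std::min(std::max(y + dy, 0), height - 1);
                    const std::size_t ki = static_cast<std::size_t>((dy + 1) * 3 + (dx + 1));
                    const std::size_t si = static_cast<std::size_t>(sy * width + sx);
                    s += k[ki] * static_cast<float>(src[si]);
                }
            }
            const long t = std::lround(s * scale);
            dst[static_cast<std::size_t>(y * width + x)] =
                static_cast<std::uint8_t>(std::min(255L, std::max(0L, t)));
        }
    }
    return dst;
}
```

With an averaging kernel of nine 1s and a 3 x 3 image whose only non-zero pixel is a 90 in the top-left corner, that corner pixel is counted four times for its own output (360 / 9 = 40), twice for its right-hand neighbor (20), and once for the center (10), which shows the extra weighting that edge pixels receive.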
Two-Image Processing
The two-image functions take two images as input and produce a third image. Some of the operations are: add, subtract, multiply, divide, average, minimum, maximum, and the logical bit-wise functions OR, AND, and XOR. For example, the add function takes a pixel at position (x, y) from image A, adds it to the corresponding pixel (x, y) in image B, and stores this sum in the corresponding pixel (x, y) in image C. This operation is done on every pixel in the image. The current version of Imagr only allows same size images for these operations. The subtraction operation yields an image made up of only the differences between the two input images. Subtraction is sometimes done after an edge enhancement to show edge effects overlaid onto the original image. The figures below show the effects of addition and subtraction on the same two images.
Fig. 11 - Image addition
Fig. 12 - Image subtraction
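A few of the two-image operations can be sketched for 8-bit grey buffers as follows. The names TwoImageOp and combine are hypothetical, and clamping the result to 0…255 is an assumption: the article does not state how Imagr treats sums above 255 or negative differences.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class TwoImageOp { Add, Subtract, Average, Minimum, Maximum };

// Combine two same-size grey images pixel by pixel into a third image.
// Results are clamped to 0...255 (an assumption of this sketch).
std::vector<std::uint8_t> combine(const std::vector<std::uint8_t>& a,
                                  const std::vector<std::uint8_t>& b,
                                  TwoImageOp op)
{
    assert(a.size() == b.size());   // only same-size images are supported
    std::vector<std::uint8_t> c(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        const int x = a[i];
        const int y = b[i];
        int r = 0;
        switch (op) {
        case TwoImageOp::Add:      r = x + y;          break;
        case TwoImageOp::Subtract: r = x - y;          break;
        case TwoImageOp::Average:  r = (x + y) / 2;    break;
        case TwoImageOp::Minimum:  r = std::min(x, y); break;
        case TwoImageOp::Maximum:  r = std::max(x, y); break;
        }
        c[i] = static_cast<std::uint8_t>(std::min(255, std::max(0, r)));
    }
    return c;
}
```

Subtracting an image from an identical copy gives all zeros, which is why subtraction isolates only the differences between the two inputs.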
Undo Stack
As mentioned briefly above, Imagr also has undo capability. This is very important for an image processing application, since many functions may do undesirable things to an image and it’s important to have an easy way to return to the previous state and try other things. The OnDo and UnDo functions (in file Undo.cpp) respectively push the image onto and pop it from a memory stack, implemented as a linked list. When a change is going to be made to an image, OnDo is called first to save its current state.
The Undo data structure looks like this:
struct Undo_type {
    int *p;
    BOOL mod;
    int ptype;
    char hint[80];
    Undo_type *next;
};
The variable “mod” maintains the image’s saved state, so when popping the image off the stack, a call to SetModifiedFlag() will restore the “saved” state. The purpose of the “hint” string is to keep track of certain operations that may be reversed without requiring a complete memory save of the image. For example, if an image is rotated 90 degrees, a hint could be used by the UnDo function so that it knows just to rotate by -90 degrees to restore the image. Therefore, the image pixels don’t all need to be saved in memory. Although this hint-based undo isn’t currently implemented in Imagr, it could provide faster operation and save memory.
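The push and pop behavior of a linked-list undo stack can be sketched as below. This is an illustration, not the code in Undo.cpp: the names UndoNode and UndoStack are hypothetical, std::vector and std::unique_ptr replace the raw pointers of Undo_type (an assumption made to keep the sketch leak-free), and the ptype and hint fields are left out.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// One saved image state in a singly linked list.
struct UndoNode {
    std::vector<std::uint32_t> pixels;   // copy of the image at save time
    bool modified = false;               // the document's modified flag
    std::unique_ptr<UndoNode> next;      // older state, or null
};

class UndoStack {
public:
    // Save the current state before an operation changes the image.
    void push(const std::vector<std::uint32_t>& pixels, bool modified)
    {
        auto node = std::make_unique<UndoNode>();
        node->pixels = pixels;
        node->modified = modified;
        node->next = std::move(head_);
        head_ = std::move(node);
    }

    // Restore the most recently saved state; returns false if nothing is saved.
    bool pop(std::vector<std::uint32_t>& pixels, bool& modified)
    {
        if (!head_)
            return false;
        pixels = std::move(head_->pixels);
        modified = head_->modified;
        head_ = std::move(head_->next);
        return true;
    }

    bool empty() const { return head_ == nullptr; }

private:
    std::unique_ptr<UndoNode> head_;     // most recent saved state
};
```

A caller would push before each change and pop to step back one change at a time; restoring the saved modified flag alongside the pixels is what lets the document return to its “saved” state.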
As mentioned above, the undo capability is used by some functions that utilize the slider dialogs to quickly OnDo (push) and UnDo (pop) the image as the sliders are manipulated. This gives an interactive capability to see in near real-time what the slider operation does to the image.
Conclusion
In conclusion, some basic image processing code for histograms, convolution filters, and two-image operations was discussed. Although there are many image processing applications available, developing your own can be very worthwhile when you need custom functionality. Imagr contains other image processing functions which have not been discussed here. See my previous CodeProject article, “Drawing an Image as a 3-D Surface” (July 2011), which discusses Imagr’s 3D graphing capabilities.