An Implementation Of The Connected Component Labelling Algorithm

Torben Trindkaer Nielsen

4.94/5 (14 votes)

1 Oct 2014CPOL4 min read

75.1K

2.1K

This article presents the recursive connected component labelling algorithm with a workaround for the stack limitation. All in less than 70 lines of C/C++ code.

Download source - 838 B

Sample Image of Connected Component Labelling

Introduction

The iterative solution to the connected component labelling algorithm is well described in the literature, but requires quite complex methods when implemented. The simpler recursive solution has the problem of using more stack than usually available, even for small images.

This article presents the recursive 4-connected component labelling algorithm with a workaround for the stack limitation. All in less than 70 lines of C/C++ code. The implementation is written in pure C, so it can be used in any C/C++ project no matter what the platform and compiler are.

Background

The connected component labelling is often used in the fields of computer vision and image analysis.

Using the Code

The code consists of a single source file. To use it, simply include the code into your C/C++ project and call the function LabelImage.

C++

void LabelImage(unsigned short width, unsigned short height, unsigned char* input, int* output);

Input image is an array of bytes with 0 values being background and other values (typically 1 or 255) indicating an object. It is often found by thresholding.

Output image (the labelled image) is an array of integers with 0 values being background and label numbers starting with 1 up to the number of labels found.

Output image must be allocated in advance and initialized to zero.

Limitations

Sizes of input and output images are limited to a maximum of width x height = (65535 x 65535 pixels) due to unsigned short typically being 16 bit.

Sizes of input and output images are also limited by the available heap size (about "6*width*height" bytes heap memory is allocated temporarily, which is in the order of 1.5 times the size of the output image).

Example of how to use is shown below:

C++

unsigned short W = 640;
unsigned short H = 480;
unsigned char* input  = (unsigned char*) malloc(W*H*sizeof(unsigned char));
int*           output = (int*)           malloc(W*H*sizeof(int));
/* -- TODO: You will need to initialize the input image with whatever image you want to label! */
memset(output, 0, W*H*sizeof(int));
LabelImage(W, H, input, output);

The data of the input and output images is layed out directly as continuous IPL images or matrices in OpenCV (which I think is a standard way of distributing rows and columns of data in images). This makes it very easy to use OpenCV together with the algorithm:

Example of how to use in an OpenCV project is as shown below:

C++

unsigned short W = 640;
unsigned short H = 480;
Mat input (H, W, CV_8UC1);
Mat output(H, W, CV_32SC1, 0);
/* -- TODO: You will need to initialize the input image with whatever image you want to label! */
LabelImage(W, H, (unsigned char*)input.data, (int*)output.data);

To initialize the input image in the above examples, the distribution of rows and columns must be known as stated previously.

It is exemplified by the following simple demonstration, a dummy image with (width, height) = (8,6) pixels:

Rows are numbered 0 .. 5
Columns are numbered 0 .. 7
Shown pixel value of 255 shown at index = column + row*width = 6 + 2*8 = 22, i.e. input[22] = 255.

Implementation

As the method is a modification of the standard recursive algorithm, let's first show how this look like.

The standard 4-connected component recursive algorithm written in C/C++ (recursive procedure LabelComponent comes in the next code snippet):

C++

void LabelImage(unsigned short width, unsigned short height, unsigned char* input, int* output)
{
  int labelNo = 0;
  int index   = -1;
  for (unsigned short y = 0; y < height; y++)
  {
    for (unsigned short x = 0; x < width; x++)
    {
      index++;
      if (input [index] == 0) continue;   /* This pixel is not part of a component */
      if (output[index] != 0) continue;   /* This pixel has already been labelled  */
      /* New component found */
      labelNo++;
      LabelComponent(width, height, input, output, labelNo, x, y);
    }
  }
}

The recursive part comes here in the procedure LabelComponent:

C++

void LabelComponent(unsigned short width, unsigned short height, 
unsigned char* input, int* output, int labelNo, unsigned short x, unsigned short y)
{
  int index = x + width*y;
  if (input [index] == 0) return;   /* This pixel is not part of a component */
  if (output[index] != 0) return;   /* This pixel has already been labelled  */
  output[index] = labelNo;

  /* Now label the 4 neighbours: */
  if (x > 0)        LabelComponent(width, height, input, output, labelNo, x-1, y);   /* left  pixel */
  if (x < width-1)  LabelComponent(width, height, input, output, labelNo, x+1, y);   /* right pixel */
  if (y > 0)        LabelComponent(width, height, input, output, labelNo, x, y-1);   /* upper pixel */
  if (y < height-1) LabelComponent(width, height, input, output, labelNo, x, y+1);   /* lower pixel */
}

Now we only need to rewrite procedure LabelComponent to avoid stack overflow. But to do this, first we must understand what happens when a procedure is called:

Calling a procedure consists of putting the procedure parameters onto a stack together with the program address just after the procedure. Then jump to the procedure address (see macro CALL_LabelComponent below). All parameter variables in the procedure then refer to their actual stack position.
When the procedure ends (the return points in the procedure - see macro RETURN below), pop the values of the stack and jump to the address just after the procedure (which also was put onto the stack).

The whole program has been re-written with own implemented stack (the four macro names CALL_LabelComponent, RETURN, X, and Y should increase readability):

CALL_LabelComponent corresponds to a recursive call of the LabelComponent procedure
RETURN indicates the end of a call to the LabelComponent procedure
X and Y (upper case) resemble the local variables x and y (lower case)

C++

#define CALL_LabelComponent(x,y,returnLabel) { STACK[SP] = x; 
STACK[SP+1] = y; STACK[SP+2] = returnLabel; SP += 3; goto START; }
#define RETURN { SP -= 3;                \
                 switch (STACK[SP+2])    \
                 {                       \
                 case 1 : goto RETURN1;  \
                 case 2 : goto RETURN2;  \
                 case 3 : goto RETURN3;  \
                 case 4 : goto RETURN4;  \
                 default: return;        \
                 }                       \
               }
#define X (STACK[SP-3])
#define Y (STACK[SP-2])

void LabelComponent(unsigned short* STACK, unsigned short width, unsigned short height, 
unsigned char* input, int* output, int labelNo, unsigned short x, unsigned short y)
{
  STACK[0] = x;
  STACK[1] = y;
  STACK[2] = 0;  /* return - component is labelled */
  int SP   = 3;
  int index;

START: /* Recursive routine starts here */

  index = X + width*Y;
  if (input [index] == 0) RETURN;   /* This pixel is not part of a component */
  if (output[index] != 0) RETURN;   /* This pixel has already been labelled  */
  output[index] = labelNo;

  if (X > 0) CALL_LabelComponent(X-1, Y, 1);   /* left  pixel */
RETURN1:

  if (X < width-1) CALL_LabelComponent(X+1, Y, 2);   /* right pixel */
RETURN2:

  if (Y > 0) CALL_LabelComponent(X, Y-1, 3);   /* upper pixel */
RETURN3:

  if (Y < height-1) CALL_LabelComponent(X, Y+1, 4);   /* lower pixel */
RETURN4:

  RETURN;
}

void LabelImage(unsigned short width, unsigned short height, unsigned char* input, int* output)
{
  unsigned short* STACK = (unsigned short*) malloc(3*sizeof(unsigned short)*(width*height + 1));
  
  int labelNo = 0;
  int index   = -1;
  for (unsigned short y = 0; y < height; y++)
  {
    for (unsigned short x = 0; x < width; x++)
    {
      index++;
      if (input [index] == 0) continue;   /* This pixel is not part of a component */
      if (output[index] != 0) continue;   /* This pixel has already been labelled  */
      /* New component found */
      labelNo++;
      LabelComponent(STACK, width, height, input, output, labelNo, x, y);
    }
  }

  free(STACK);
}

Points of Interest

The described implementation is so simple, that it is easy to customize to your own need, for example like:

Change it to find the 8-connected components.
Change data type of input or output image to your needs.
By just modifying the 2 lines of code that test for the input image pixel being zero with a test against a threshold, you will save running through the image for doing a fixed threshold first.

Apart from being an easy to use and easy to customize algorithm solving the connected component labelling problem, it also presents a method to implement a recursive algorithm that has no easy iterative solution, even though here it would result in stack overflow. The drawback is that the method is not generally applicable, only in the cases where the needed stack size is within what normally can be allocated at the heap.

The CALL and RETURN macros in the implementation can be made smarter. Giving up on the cross platform and ANSI C compliance, the GNU GCC compiler allows using labels as values (the && operator extension). Thus the switch statement in the RETURN macro can be avoided, but at the cost of using pointer sized entries in the stack.

History

2014-10-01 First version of article

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)