We can bisect an image into two areas of similar colour by taking the principal component of colour variation and then using Otsu thresholding to select the point at which to cut. The cluster with the highest error is then bisected again, repeatedly, until a segmentation with the desired number of segments is produced. I invented this algorithm myself, so it is not well tested.
Introduction
It's often necessary to segment images into areas of similar colour. This can be used as a low-level step in AI and image recognition, but the obvious immediate application is choosing a palette of colours to convert a 24-bit RGB truecolour image into a 256-colour paletted .gif.
Background
The standard algorithm is k-means clustering, which is a general method for finding clusters in data given an arbitrary distance function. The method here instead uses principal component analysis, exploiting the fact that pixels have only three components of variation: red, green and blue, or their equivalents in other colour spaces. The principal component of variance is the axis in RGB (for the sake of argument) colour space along which the projected data is most spread out. By taking that component, which we do by finding the eigenvectors of the covariance matrix, we effectively get a system which is colour-space neutral. To find two clusters, we cut along a plane perpendicular to that principal component. The place to cut is found not by taking the average, but by using Otsu thresholding to find where the most natural division into two types lies. Having achieved one bisection, it's very simple to perform further bisections.
Using the code
Eigenvectors are a property of a matrix such that when an eigenvector is multiplied by the matrix, it remains the same apart from a scaling factor. Conventionally the eigenvectors are normalised, and the scaling factors are called the eigenvalues. Eigenvectors have many mathematical properties, but the one we are interested in is this: when the matrix is the covariance matrix of a set of equal-length data series, the eigenvectors represent the major axes of variance, with the amount of variance given by the corresponding eigenvalue. So the eigenvector with the largest eigenvalue in RGB space is the one along which to cut.
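To make the eigenvector property concrete, here is a toy numerical check on a hand-picked symmetric 2x2 matrix. This is purely illustrative and not part of the article's code: it just verifies that A*v equals lambda*v for a known eigenpair.

```c
#include <math.h>

/* Returns 1 if v is an eigenvector of the symmetric 2x2 matrix A
   with eigenvalue lambda, i.e. A*v == lambda*v (to within rounding). */
int is_eigenpair(const double A[2][2], const double v[2], double lambda)
{
    double Av0 = A[0][0]*v[0] + A[0][1]*v[1];
    double Av1 = A[1][0]*v[0] + A[1][1]*v[1];

    return fabs(Av0 - lambda*v[0]) < 1e-9 && fabs(Av1 - lambda*v[1]) < 1e-9;
}
```

For A = {{2,1},{1,2}}, the normalised vector (1,1)/sqrt(2) is an eigenvector with eigenvalue 3: multiplying by A stretches it by 3 without rotating it.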
And since pixels have only three components, our covariance matrix is only 3x3, which means that we can use special and very efficient code to find the eigenvectors.
#ifndef _eig_h
#define _eig_h

void eigen_decomposition(double A[3][3], double V[3][3], double d[3]);

#endif
Now that we have a way of getting eigenvectors, we can do principal component analysis on pixels in RGB space.
static int pca(double *ret, unsigned char *pixels, int N)
{
    double cov[3][3];
    double mu[3];
    int i, j, k;
    double var;
    double d[3];
    double v[3][3];

    /* mean of each channel */
    for (i = 0; i < 3; i++)
        mu[i] = meancolour(pixels, N, i);

    /* build the symmetric 3x3 covariance matrix */
    for (i = 0; i < 3; i++)
        for (j = 0; j <= i; j++)
        {
            var = 0;
            for (k = 0; k < N; k++)
                var += (pixels[k*3+i] - mu[i]) * (pixels[k*3+j] - mu[j]);
            cov[i][j] = var / N;
            cov[j][i] = var / N;
        }

    eigen_decomposition(cov, v, d);

    /* eigen_decomposition returns the eigenvalues in ascending order,
       so column 2 holds the principal component */
    ret[0] = v[0][2];
    ret[1] = v[1][2];
    ret[2] = v[2][2];

    return 0;
}
That should all be pretty straightforward; the main job is setting up the call to eigen_decomposition correctly.
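The pca function above calls a meancolour helper that isn't shown in the article. A plausible minimal version, assuming the same interleaved RGB pixel layout, simply averages one channel (0, 1 or 2 for red, green, blue):

```c
/* Hypothetical sketch of the meancolour() helper used by pca():
   mean of one channel over N interleaved RGB pixels. */
static double meancolour(unsigned char *pixels, int N, int channel)
{
    double sum = 0;
    int i;

    for (i = 0; i < N; i++)
        sum += pixels[i*3 + channel];

    return N ? sum / N : 0;
}
```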
And now the guts of it. First the PALENTRY structure.
typedef struct
{
    int start;           /* index of the first pixel in this cluster */
    int N;               /* number of pixels in the cluster */
    double error;        /* error of representing the cluster by one colour */
    unsigned char red;
    unsigned char green;
    unsigned char blue;
} PALENTRY;
start is an index into an array of pixels and N is the count; otherwise nothing surprising here. Any colour distance function can be used to calculate the error.
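The calcerror function used below isn't shown in the article. One plausible sketch, using the simplest choice of squared Euclidean distance in RGB as the error, fills in the mean colour and sums the squared deviation of each pixel from it (the PALENTRY typedef is repeated so the sketch is self-contained):

```c
typedef struct
{
    int start;
    int N;
    double error;
    unsigned char red;
    unsigned char green;
    unsigned char blue;
} PALENTRY;

/* Hypothetical sketch of calcerror(): set the entry's representative
   colour to the cluster mean and its error to the sum of squared
   RGB distances from that mean. */
static void calcerror(PALENTRY *entry, unsigned char *buff)
{
    double mu[3] = {0, 0, 0};
    double err = 0;
    double delta;
    int i, j;

    for (i = 0; i < entry->N; i++)
        for (j = 0; j < 3; j++)
            mu[j] += buff[(entry->start + i)*3 + j];
    for (j = 0; j < 3; j++)
        mu[j] = entry->N ? mu[j] / entry->N : 0;

    entry->red = (unsigned char)(mu[0] + 0.5);
    entry->green = (unsigned char)(mu[1] + 0.5);
    entry->blue = (unsigned char)(mu[2] + 0.5);

    for (i = 0; i < entry->N; i++)
        for (j = 0; j < 3; j++)
        {
            delta = buff[(entry->start + i)*3 + j] - mu[j];
            err += delta * delta;
        }
    entry->error = err;
}
```

Sum of squared error means large, tight clusters and small, spread-out clusters compete on roughly equal terms when we pick the next cluster to split.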
And now we actually split the entry.
static void splitpca(PALENTRY *entry, PALENTRY *split, unsigned char *buff)
{
    int low, high;
    int cut;
    double comp[3];
    unsigned char temp;
    int i;

    /* get the principal component and the Otsu threshold along it */
    pca(comp, buff + entry->start*3, entry->N);
    cut = getOtsuthreshold2(buff + entry->start*3, entry->N, comp);

    /* partition the pixels about the cut, swapping RGB triples in place */
    low = 0;
    high = entry->N - 1;
    while (low < high)
    {
        while (low < high && project(&buff[(entry->start+low)*3], comp) < cut)
            low++;
        while (low < high && project(&buff[(entry->start+high)*3], comp) >= cut)
            high--;
        if (low <= high)
        {
            for (i = 0; i < 3; i++)
            {
                temp = buff[(entry->start+low)*3+i];
                buff[(entry->start+low)*3+i] = buff[(entry->start+high)*3+i];
                buff[(entry->start+high)*3+i] = temp;
            }
        }
        low++;
        high--;
    }

    /* the new entry takes the upper partition */
    split->start = entry->start + low;
    split->N = entry->N - low;
    entry->N = low;
    calcerror(entry, buff);
    calcerror(split, buff);
}
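splitpca relies on a project helper that isn't shown in the article. Under the assumption that comp is the unit eigenvector from pca, projecting a pixel onto the principal component is just a dot product of the RGB triple with that vector:

```c
/* Hypothetical sketch of the project() helper used by splitpca():
   dot product of an RGB triple with the (unit) principal component. */
static double project(unsigned char *rgb, double *comp)
{
    return rgb[0]*comp[0] + rgb[1]*comp[1] + rgb[2]*comp[2];
}
```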
And finally the Otsu thresholding.
static int getOtsuthreshold2(unsigned char *rgb, int N, double *remap)
{
    int hist[1024] = {0};
    int wB = 0;
    int wF;
    float mB, mF;
    float sum = 0;
    float sumB = 0;
    float varBetween;
    float varMax = 0.0f;
    int answer = 0;
    int i;
    int k;

    /* histogram of projections onto the principal component; the
       projection can be negative, so bias the bins by 512 */
    for (i = 0; i < N; i++)
    {
        int nc = (int)(rgb[i*3]*remap[0] + rgb[i*3+1]*remap[1] + rgb[i*3+2]*remap[2]);
        hist[512 + nc]++;
    }

    for (k = 0; k < 1024; k++)
        sum += k * hist[k];

    for (k = 0; k < 1024; k++)
    {
        wB += hist[k];            /* weight of the "background" class */
        if (wB == 0)
            continue;
        wF = N - wB;              /* weight of the "foreground" class */
        if (wF == 0)
            break;

        sumB += (float)(k * hist[k]);
        mB = sumB / wB;           /* background mean */
        mF = (sum - sumB) / wF;   /* foreground mean */

        /* between-class variance; Otsu's threshold maximises this */
        varBetween = (float)wB * (float)wF * (mB - mF) * (mB - mF);
        if (varBetween > varMax)
        {
            varMax = varBetween;
            answer = k;
        }
    }

    return answer - 512;          /* undo the bias */
}
We have to tweak the standard Otsu function slightly because we want to cut in eigen-space, not along luminance as we normally would, so we pass a remap parameter.
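The article doesn't show the top-level driver, but the repeated-bisection idea from the introduction suggests a simple loop: start with one entry covering the whole image, then repeatedly bisect the entry with the highest error until the palette is full. The only new logic is picking the worst entry; everything else is calls to the functions above. This is a hypothetical sketch, with the PALENTRY typedef repeated so it stands alone:

```c
typedef struct
{
    int start;
    int N;
    double error;
    unsigned char red;
    unsigned char green;
    unsigned char blue;
} PALENTRY;

/* Index of the palette entry with the highest error - the next
   candidate for bisection. */
static int worstentry(PALENTRY *pal, int nentries)
{
    int i, worst = 0;

    for (i = 1; i < nentries; i++)
        if (pal[i].error > pal[worst].error)
            worst = i;

    return worst;
}

/*
   Usage sketch (splitpca and calcerror are the functions above):

       pal[0].start = 0;
       pal[0].N = width * height;
       calcerror(&pal[0], buff);
       for (nentries = 1; nentries < Npal; nentries++)
           splitpca(&pal[worstentry(pal, nentries)], &pal[nentries], buff);
*/
```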
Results
The standard test images are available from the University of Southern California.
https://sipi.usc.edu/database/database.php?volume=misc&image=11#top
And it's a bit unfair to compare results with other algorithms. A well-established algorithm will have been tweaked and refined many times, and might produce better results despite the core underlying method being weaker. But these results are pretty nice, and certainly usable.
Points of Interest
Palette choosing seemed to die the death when displays went to 24 bits. But now it's had a bit of a revival with the return of the animated GIF, and the need for realistic movies.
History