
Mean Shift Tracking

24 Sep 2013
In this article we will look at the application of mean shift tracking for color-based object tracking.

Introduction 

In this article, we look at the application of mean shift tracking for color-based object tracking.

Mean shift 

The object model used in mean shift tracking is a color probability distribution.

Now that we have an object model, given an image we can compute a likelihood image. Each pixel in the likelihood image represents the likelihood that the pixel belongs to the object model/histogram.

Figure 1: original image, likelihood image, and object model (hue histogram)
This likelihood image assigns to each pixel a similarity measure with respect to the object model.

It is reasonable to assume that the region in which the highest similarity measure or highest density is observed is a good estimate of the object location.
C++
//building the object appearance model
void ocvmeanShift::buildModel(Mat image, Rect rect)
{
    //input region of interest
    region = rect;
    //center of the region is the current location estimate
    p.x = region.x + region.width/2;
    p.y = region.y + region.height/2;
    //extract the ROI
    Mat roi = image(rect);
    //compute the histogram; h is an object of type Histogram
    h.BuildHistogram(roi);
}
//call to compute the likelihood image after the model has been built
Mat sim = h.likeyhoodImage(image);
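
For readers who prefer working with plain OpenCV, a similar likelihood image can be obtained with histogram back-projection. The sketch below is only an illustration and is not the library's Histogram class: it builds a hue histogram from the ROI and back-projects it over the whole frame; the bin count and ranges are assumed, illustrative values.

C++
//minimal sketch: hue-histogram back-projection as a likelihood image
//assumes an 8-bit BGR input image
Mat hueLikelihood(const Mat &image, const Rect &rect)
{
    Mat hsv;
    cv::cvtColor(image, hsv, cv::COLOR_BGR2HSV);
    Mat roiHsv = hsv(rect);

    int histSize = 30;              //number of hue bins (illustrative)
    float hueRange[] = {0, 180};    //hue range for 8-bit HSV images
    const float *ranges[] = {hueRange};
    int channels[] = {0};           //use the hue channel only

    //histogram of the object region, normalised to [0,255]
    Mat hist;
    cv::calcHist(&roiHsv, 1, channels, Mat(), hist, 1, &histSize, ranges);
    cv::normalize(hist, hist, 0, 255, cv::NORM_MINMAX);

    //back-project the histogram over the whole image
    Mat likelihood;
    cv::calcBackProject(&hsv, 1, channels, hist, likelihood, ranges);
    return likelihood;
}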

Thus, if we consider a small window and move towards the mean value, i.e., along the mean shift vector, we should eventually reach the region of maximum similarity.

The likelihood surface is not smooth, but we can give it smoothness properties using kernel density estimation (KDE).

Now we can find the modes of the similarity surface using the standard mean shift algorithm.

Let us assume that the current estimate of the mode of the function is at $y$. We then consider a small rectangular window about $y$, compute the mean shift vector, and take a small step along it.
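
With a uniform (flat) kernel over the window $W(y)$, and $w(x_i)$ denoting the likelihood value at pixel location $x_i$, the mean shift vector is simply the offset from the window center to the weighted centroid of the window:

$$m(y) = \frac{\sum_{x_i \in W(y)} x_i\, w(x_i)}{\sum_{x_i \in W(y)} w(x_i)} - y$$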

In principle this should enable us to find the local maximum. However, the similarity surface is discontinuous, and to do this we would need to perform KDE over the entire image on a dense grid of points, which is an expensive operation (we would need to perform a convolution at each point with a Gaussian of suitably large aperture).
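
As a rough illustration of the cost involved, performing KDE with a Gaussian kernel over the dense pixel grid amounts to convolving the likelihood image with a Gaussian of large aperture; a short sketch using cv::GaussianBlur on the likelihood image sim computed earlier is shown below, where the kernel size is an assumed, illustrative value.

C++
//sketch: smooth the likelihood surface by Gaussian convolution (KDE with a Gaussian kernel)
//the 31x31 aperture is illustrative; a larger aperture gives a smoother surface at higher cost
Mat smoothed;
cv::GaussianBlur(sim, smoothed, cv::Size(31, 31), 0);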

Using similarity for tracking

The concept of a similarity surface can be made useful in tracking applications.

Let us consider a small region of interest about the present location $y$. We can compute the similarity score over this region, perform KDE on this small region to obtain a similarity surface, and compute the mean shift vector.

If the object is not present in the region, the similarity surface will be flat and the mean shift vector will be zero.

If the object is present in some part of the region, it will correspond to a mode of the similarity surface. The mean shift vector will give us the direction to move along.

Now, instead of trying to estimate the mode, say we translate the region of interest along the direction provided by the mean shift vector. This would typically bring a larger portion of the object into view and would expose the region of the global similarity surface where a larger maximum lies.

This is the basis of mean shift tracking: we keep translating the region of interest until we reach a local maximum of the similarity surface.

For tracking applications, since fast computation is required, we can use a rectangular window with bandwidth equal to that of the region of interest. The present location estimate is the center of the rectangular region.
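
With such a flat kernel, the weighted centroid inside the window can be written directly in terms of the image moments of the likelihood ROI, which is exactly what the implementation below computes:

$$\bar{x} = \frac{m_{10}}{m_{00}}, \qquad \bar{y} = \frac{m_{01}}{m_{00}}, \qquad m_{pq} = \sum_{(x,y) \in W} x^p\, y^q\, w(x,y)$$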

Implementation 

C++
//compute the likelihood image
Mat sim=h.likeyhoodImage(image);
//perform the mean shift iterations until convergence
for(int i=0;i<criteria.maxCount;i++)
{
    //extract the region of interest from the likelihood image
    Mat roi=sim(region);
    //compute the image moments of the ROI
    cv::Moments m;
    m=cv::moments(roi,false);

    //m00 is the sum of likelihood values in the ROI;
    //exit if too few similar pixels are present
    if(fabs(m.m00)<region.width*region.height*0.05)
        break;
    //compute the weighted centroid within the ROI
    int x=cvRound(m.m10/m.m00);
    int y=cvRound(m.m01/m.m00);

    //compute the mean shift from the window center to the centroid
    int dx=region.width/2-x;
    int dy=region.height/2-y;
    //new position after the shift
    int nx=p.x-dx;
    int ny=p.y-dy;
    //clamp the window to the image boundary
    if(nx-region.width/2<=0) nx=region.width/2;
    if(nx+region.width/2>=image.cols) nx=image.cols-region.width/2-1;
    if(ny-region.height/2<=0) ny=region.height/2;
    if(ny+region.height/2>=image.rows) ny=image.rows-region.height/2-1;
    //recompute the mean shift after clamping
    dx=-nx+p.x;
    dy=-ny+p.y;
    //check the magnitude of the mean shift vector;
    //no change in mean means a local maximum has been reached
    float mag=dx*dx+dy*dy;
    if(mag<criteria.epsilon*criteria.epsilon)
        break;
    //update the position
    p.x=nx;
    p.y=ny;
    //update the region of interest
    region.x=p.x-region.width/2;
    region.y=p.y-region.height/2;
}
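
For reference, the same flat-kernel iteration is available in OpenCV as cv::meanShift, so a minimal self-contained tracking loop (a sketch, not the library's test program) might look like the following; the initial ROI and histogram parameters are illustrative assumptions.

C++
//minimal sketch of a per-frame tracking loop using OpenCV's built-in meanShift
#include <opencv2/opencv.hpp>
using namespace cv;

int main()
{
    VideoCapture cap(0);                    //camera input; pass a file name for video input
    Mat frame, hsv, hist;
    Rect window(100, 100, 60, 80);          //illustrative initial ROI

    int histSize = 30;
    float hueRange[] = {0, 180};
    const float *ranges[] = {hueRange};
    int channels[] = {0};

    //build the hue histogram (object model) from the first frame
    cap >> frame;
    cvtColor(frame, hsv, COLOR_BGR2HSV);
    Mat roi = hsv(window);
    calcHist(&roi, 1, channels, Mat(), hist, 1, &histSize, ranges);
    normalize(hist, hist, 0, 255, NORM_MINMAX);

    TermCriteria criteria(TermCriteria::EPS | TermCriteria::COUNT, 10, 1.0);
    while (cap.read(frame))
    {
        cvtColor(frame, hsv, COLOR_BGR2HSV);
        Mat likelihood;
        calcBackProject(&hsv, 1, channels, hist, likelihood, ranges);   //likelihood image
        meanShift(likelihood, window, criteria);                        //move window to local mode
        rectangle(frame, window, Scalar(0, 255, 0), 2);                 //draw the tracked region
        imshow("mean shift tracking", frame);
        if (waitKey(30) == 27) break;                                   //Esc to quit
    }
    return 0;
}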

A video of mean shift tracking is shown below. A naive object model, based on color probability in HS color space, is built using the first frame of the video.



Another issue is that if the object moves too fast and a significant part of it moves out of the ROI in successive frames, the object will not be tracked. This can be seen in the following videos.



If we encounter a larger object, or an object that exhibits higher density, tracking will be lost. In the case below, tracking is lost when the object passes over a large blue background that is similar in color to the object.



There are many other cases where mean shift tracking will fail.

As with all tracking approaches, the performance depends heavily on the object model. The better we are able to model the object and obtain a likelihood/similarity that does not show high probability for the background or other objects in the scene, the more accurate the tracking will be.

Code 

For this and further image processing applications, a library consisting of a high-level interface to OpenCV is used. The library is called OpenVision: https://github.com/pi19404/OpenVision

The project CMake file is included in the repository. The build will create the library and test binaries in the bin directory. To run the demo program for mean shift, run the binary meanShiftTest.

The files for the mean shift algorithm are meanshift.cpp and meanshift.hpp, in the ImgProc directory of the repository (https://github.com/pi19404/OpenVision/tree/master/ImgProc).

To run the test program:
meanShift - to run using camera input
meanShiftTest {video file name} - to run using a video file 

Select the region of interest and click the build model button to start tracking.

For a video file, initially only the first frame is shown. Select the ROI in the first frame and then click on the build model button to start the tracking.

The build model button is shown after clicking on the Display properties button on the window.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)