
Visual Surveillance Laboratory

6 Aug 2008 · Apache license · 8 min read
A description of surveillance systems

Sample Image

Introduction

Visual surveillance is an attempt to detect, recognize and track certain objects from image sequences, and more generally to understand and describe object behaviors[1].

The purpose of this two-part article is to describe a framework for visual surveillance systems, along with sample algorithms, in order to help programmers around the world learn about this subject. This part reviews the basic structure of surveillance systems; the second will demonstrate an algorithm used in such systems.

Andrew Kirillov[2] discusses this subject in detail in his article, but he presents a narrow view of what a surveillance system should be. These articles attempt to provide wider insight into the structure of surveillance systems.

Motivation for Surveillance Systems

  • Anomaly Detection

    The ability to learn what normal behavior is and what constitutes an anomaly.
    For example, it is known that at an office building entrance, people usually go straight to the lobby after entering the building. Individuals heading in a different direction can therefore be considered potentially harmful.

  • Automated Security

    Increases the effectiveness of the entire surveillance process by paying attention only to certain events, instead of watching and analyzing several different surveillance cameras as happens today. Such a system can also decrease costs.

  • Crowd Flux Statistics

    For example, by knowing how many cars are using a certain road and how many are coming from road A or B we can decide which road to widen.

  • Blood Squirting Halloween skull[6]

    Just a fun usage found on Coding4Fun; not all reasons have to be dead serious.

General Structure of Visual Surveillance Systems

First the system receives a stream of images and then tries to learn several important facts from it:

  1. Are there any objects of interest (people, cars, suitcases, body limbs and so on)?
  2. What type is each object (i.e., is it a car, a person, etc.)?
  3. What is the current position of each object in x,y coordinates (maybe even z)?
  4. Do the objects' positions mean something? E.g., is the car at position 10,50 driving along a valid trajectory?

Usually the first three tasks are considered low-level and the fourth high-level.

The general structure below is not mandatory however a lot of the existing systems follow it.

Image 2

Camera 1

Usually a hardware device, although we can run visual surveillance systems on saved video files as well. The camera provides a stream of images (also referred to as frames), which is the input. It is important to note that although we can learn a lot of important features from a single image, a single image alone is not enough.

Environment Modeling

The philosophy is simple: not all pixels are of interest, such as those that belong to the background of the image. For example, in a scene where two people are walking in front of a tree, the tree is considered background. By extracting the background, subsequent calculations can be much more accurate.

Look at the following image:

Image 3

As you can see, we managed to distinguish between the people (foreground) and their background surroundings.
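A minimal way to separate foreground from background is a running-average background model. The sketch below is illustrative, not the article's actual code: it assumes grayscale frames flattened into byte arrays, and the class and parameter names are my own.

```csharp
using System;

// Maintains a per-pixel running-average background estimate and classifies
// a pixel as foreground when it deviates from that estimate by more than
// a threshold. A sketch only; real systems use far more robust models.
public class BackgroundModel
{
    private readonly double[] background; // per-pixel background estimate
    private readonly double alpha;        // learning rate (0..1)
    private readonly int threshold;       // intensity difference for "foreground"

    public BackgroundModel(int pixelCount, double alpha = 0.05, int threshold = 30)
    {
        background = new double[pixelCount];
        this.alpha = alpha;
        this.threshold = threshold;
    }

    // Returns a foreground mask (true = foreground) and updates the model.
    public bool[] Apply(byte[] grayFrame)
    {
        var mask = new bool[grayFrame.Length];
        for (int i = 0; i < grayFrame.Length; i++)
        {
            mask[i] = Math.Abs(grayFrame[i] - background[i]) > threshold;
            // Slowly adapt the background toward the current frame, so
            // gradual changes (lighting, shadows) are absorbed over time.
            background[i] = (1 - alpha) * background[i] + alpha * grayFrame[i];
        }
        return mask;
    }
}
```

The learning rate trades off adaptability against sensitivity: a high alpha absorbs slow-moving objects into the background, a low alpha reacts poorly to lighting changes.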

Motion Segmentation

Motion segmentation tries to detect regions in the image that correspond to moving objects. Why? Because moving objects are usually what interests us, and focusing only on them saves computational time. However, it is important to remember that sometimes stationary objects are precisely what is of interest, as in the case of a car that stops in front of a house for a whole day, where the lack of movement is what makes it suspicious.
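The simplest motion segmentation technique is two-frame differencing. This sketch (again my own illustration, assuming flattened grayscale frames, not code from the project) marks every pixel whose intensity changed noticeably between consecutive frames:

```csharp
using System;

public static class MotionSegmentation
{
    // Classic two-frame differencing: pixels whose intensity changed by more
    // than a threshold between consecutive frames are marked as "moving".
    public static bool[] FrameDifference(byte[] previous, byte[] current, int threshold = 25)
    {
        var moving = new bool[current.Length];
        for (int i = 0; i < current.Length; i++)
            moving[i] = Math.Abs(current[i] - previous[i]) > threshold;
        return moving;
    }
}
```

In practice the raw mask is noisy, so systems typically clean it up with morphological operations before grouping the remaining pixels into blobs.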

Object Classification

As stated before, classification (determining the type) means assigning each object to a class it belongs to. You will probably want to ignore certain classes of objects, for example birds.

Tracking

It is important to understand that tracking is not motion detection (motion segmentation). It is identifying the same object in different frames. For instance a person who walks in front of the camera in frame 1 will be identified as the same person in consecutive frames. This provides the trajectory that this person took during his entire trip.

For example look at the following image:

Image 4

You can see that people are being tracked for the entire scene.

The main problem in tracking is occlusion where an object is being concealed by another object like a tree, a car etc.
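To make the idea of "identifying the same object in different frames" concrete, here is a greedy nearest-neighbour tracker. This is a sketch under my own assumptions (the type and method names are hypothetical, and real trackers handle occlusion far more carefully): each detection inherits the id of the closest object from the previous frame if it is near enough, otherwise it is treated as a new object entering the scene.

```csharp
using System;
using System.Collections.Generic;

// A hypothetical tracked-object type; the article's Blob similarly
// carries a unique id and a position.
public class TrackedObject
{
    public int Id;
    public double X, Y;
}

public static class Tracker
{
    public static List<TrackedObject> Track(
        List<TrackedObject> previous, List<(double X, double Y)> detections,
        ref int nextId, double maxDistance = 50.0)
    {
        var current = new List<TrackedObject>();
        var unmatched = new List<TrackedObject>(previous);
        foreach (var (x, y) in detections)
        {
            // Find the closest not-yet-matched object from the previous frame.
            TrackedObject best = null;
            double bestDist = double.MaxValue;
            foreach (var old in unmatched)
            {
                double d = Math.Sqrt((old.X - x) * (old.X - x) + (old.Y - y) * (old.Y - y));
                if (d < bestDist) { bestDist = d; best = old; }
            }
            if (best != null && bestDist <= maxDistance)
            {
                unmatched.Remove(best); // each old object matches at most one detection
                current.Add(new TrackedObject { Id = best.Id, X = x, Y = y });
            }
            else
            {
                current.Add(new TrackedObject { Id = nextId++, X = x, Y = y });
            }
        }
        return current;
    }
}
```

Occlusion breaks this scheme: when an object disappears behind a tree and reappears, the naive tracker assigns it a new id, which is why real systems add motion prediction and appearance models.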

Behavior Understanding/Personal Identification

Here we can put various learning algorithms or any kind of algorithms that manipulate the data that was gathered.

Code Overview

The main idea is to create a common GUI that supports plug-ins. It displays the final result, while surveillance algorithms are added as external libraries. This mix-and-match architecture of various algorithms enables modularity. Another advantage is educational: students can plug in various algorithms and see how well they operate in various situations.

The code contains 4 main projects and 3 utility projects.

Here we shall review how to encapsulate the structure of the visual surveillance system.

Core Project

Image 5

The core project is, in a sense, a project of interfaces, most of which are helper interfaces. Any external plug-in library must implement the ISurveillanceSystem interface before it can be added to the system. Let us look at this interface.

Image 6

This interface defines properties and methods that allow the visual environment to query the abilities of the underlying algorithm. Most of these properties correspond directly to a GUI feature, such as enabling or disabling a certain menu or button. For example, HasRuntimeInformation returns true if the algorithm supports showing its inner workings in a GUI manner. If this property returns true, the environment can query RuntimeInformation to retrieve a form which will be displayed. Many of the properties in this interface act the same way.
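Since the screenshot carries the actual declaration, here is a rough C# sketch of what such an interface might look like. Only HasRuntimeInformation, RuntimeInformation and GetImageProcess are named in the text; every other member below is an assumption for illustration.

```csharp
using System.Windows.Forms;

// Illustrative sketch, not the project's exact declaration.
public interface ISurveillanceSystem
{
    string Name { get; }                // assumed: shown in the plug-in list
    bool HasRuntimeInformation { get; } // can the algorithm visualize its inner workings?
    Form RuntimeInformation { get; }    // queried only when the flag above is true
    bool HasConfiguration { get; }      // assumed: enables the Configure button
    void Configure();                   // assumed: shows the plug-in's own settings dialog
    IImageProcess GetImageProcess();    // returns the object that does the actual work
}
```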

An important method is GetImageProcess which returns an object that inherits from the IImageProcess interface.

Image 7

This interface encapsulates the entire surveillance algorithm and has only one method that receives an image and returns a collection of blobs. Blobs are objects which were identified in the tracking part and have a unique id and position.
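In sketch form, the algorithm interface and its result type might look like the following; ProcessFrame and the id/position fields follow the description above, while the exact signatures and member names are assumptions.

```csharp
using System.Collections.Generic;
using System.Drawing;

// Illustrative sketch: one method in, a collection of tracked objects out.
public interface IImageProcess
{
    // Takes one frame and returns the objects identified and tracked in it.
    ICollection<Blob> ProcessFrame(Bitmap frame);
}

// A blob has a unique id (stable across frames for the same object)
// and a position in the image.
public class Blob
{
    public int Id { get; set; }
    public Rectangle Position { get; set; }
}
```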

Another important interface is the IOutputSystem.

Image 8

The IOutputSystem allows outputting the results of the IImageProcess object in a different format, for example to a file or perhaps to a learning algorithm.

There is also a delegate which draws the final result of the IImageProcess on screen, for example drawing a rectangle around all tracked objects.

Image 9

We are now ready to fully understand the work flow of the surveillance system.

C#
// One iteration of the surveillance loop:
Bitmap frame = camera.GetFrame();               // grab a frame from the camera
ICollection<Blob> trackedObjects =
    imageProcess.ProcessFrame(frame);           // run the surveillance algorithm
outputSystem.Output(trackedObjects);            // export the results (file, learner, ...)
graphicalOutputDelegate(trackedObjects, frame); // draw the results on screen
return frame;

Visual Surveillance Laboratory Project

The laboratory project contains the main user interface and code to load available plug-ins into memory.

The GUI part will not be described here; you can view it yourself. Instead, I will describe the plug-in mechanism.

Image 10

This project supports two types of plug-ins: tracking systems and output systems. Both are loaded using the same scheme: a factory class parses an XML file containing each plug-in's disk location and fully qualified class name, and then tries to load the plug-ins into memory.

Here is an example of the XML file.

XML
<Configuration>
  <TrackingSystem location="...\bin\Debug\SimpleTrackingSystemExample.dll" 
    class="Zoombut.SimpleTrackingSystemExample.ExampleTrackingSystem"/>
  <OutputSystem location="...\bin\Debug\SimpleFileOutputSystemExample.dll" 
    class="Zoombut.SimpleFileOutputSystemExample.ExampleOutputSystem"/>

</Configuration>

The code itself is quite straightforward.

C#
// Load available tracking systems.
XmlDocument document = new XmlDocument();
document.Load(Settings.Default.ConfigFile);

// Select all TrackingSystem entries using an XPath query.
XmlNodeList result =
    document.SelectNodes("/Configuration/TrackingSystem");
foreach (XmlNode node in result)
{
    // Load values.
    String location = node.Attributes["location"].InnerText;
    String className = node.Attributes["class"].InnerText;

    // Create an instance from the plug-in assembly.
    ISurveillanceSystem value =
        (ISurveillanceSystem)Activator.CreateInstanceFrom
                    (location, className).Unwrap();
    availableTrackingSystems.Add(value);
}

Currently this code is not very robust: there is no error handling, which is one of the things that will be added in the next version of this system.
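One possible hardening, sketched here as my own suggestion rather than the project's code, is to wrap each load in a try/catch so a single broken entry (missing file, wrong class name, bad cast) does not abort loading the remaining plug-ins:

```csharp
foreach (XmlNode node in result)
{
    try
    {
        String location = node.Attributes["location"].InnerText;
        String className = node.Attributes["class"].InnerText;
        ISurveillanceSystem value =
            (ISurveillanceSystem)Activator.CreateInstanceFrom(location, className).Unwrap();
        availableTrackingSystems.Add(value);
    }
    catch (Exception ex) // e.g. FileNotFoundException, TypeLoadException, InvalidCastException
    {
        // Skip the broken plug-in but keep loading the others.
        Console.WriteLine("Failed to load plug-in: " + ex.Message);
    }
}
```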

User Guide

In order to start a surveillance mission you first need to make sure that the configuration file points to the right location of the various plug-ins. Trying to run the application without changing the config.xml file will result in an empty plug-in list.

You start a surveillance mission by choosing configure from the menu and the following dialog should appear.

Image 11

You can see that the configure window contains three parts:

  1. Input

    Where the stream of images will come from: either a camera attached to the computer or a regular video file (currently only AVI files are supported). You can download an AVI test file from the project web site at http://code.google.com/p/vsl/ . The movie file originally comes from [5].
    This section was built using the AForge.NET framework[3].

  2. Surveillance Systems

    Lists the available plug-ins that encapsulate the surveillance algorithms. If a system can be configured then the configure button will be enabled.

  3. Output Systems

    Lists the available plug-ins that output the final result of the surveillance algorithms. If a system can be configured, the Configure button will be enabled.

After clicking OK, the main window should display a "Connected" message, which means you are ready to start. Press Start or Stop to start or stop running the algorithm.

Future work

  1. Improve the robustness of the software.
  2. Debug, debug and debug...
  3. Add more example algorithms.
  4. Add support for recording.

Conclusion

  1. We learned about the basic structure of a surveillance system. It consists of five main parts: environment modeling, motion segmentation, classification, tracking and behavior understanding.
  2. We described the basic structure of a plug-in in our system and showed how to load it dynamically.

Bibliography

  1. A Survey on Visual Surveillance of Object Motion and Behaviors
  2. Motion Detection Algorithms - A wonderful article by Andrew Kirillov
  3. AForge.NET framework
  4. Tracking groups of people
  5. CAVIAR Test Case Scenarios

Continue to part 2 of the article Visual Surveillance Laboratory Part 2

History

  • [03.08.2008] A new version of the application was uploaded here
  • [21.05.2007] Version 1.0.3 - Added a link to part 2 of the article
  • [11.05.2007] Version 1.0.2 - Added support for MJPEG.
    Always check for the newest version here
  • [09.05.2007] Version 1.0.1 - Added the DirectShow library to the download section.
  • [02.05.2007] Version 1.0.0

License

This article, along with any associated source code and files, is licensed under The Apache License, Version 2.0