Introduction
Computer Vision Sandbox is an open source software package aimed at solving different problems in computer vision related areas, such as video surveillance, vision-based automation/robotics, various sorts of image/video processing, etc. Initially, the project started as closed source, taking some time to settle on its core architecture and to develop a set of features that would allow applying it to a variety of tasks. Starting with version 2.0, however, the project migrated to an open source repository, making its code available to the public.
From the very beginning, the project was designed to be highly modular and to allow extending its features by developing different plug-ins. The main application taken on its own is of little use. It knows how to load plug-ins and stick them together to get some video processing going; but if those plug-ins are missing, there is not much to do other than opening/closing the About box. This should not be the case though, as the official installation package comes with a variety of plug-ins, which can be applied to a number of tasks.
Plug-ins are the core idea of Computer Vision Sandbox and provide most of its features. They are like building blocks - the end result depends on which blocks are taken and how they are combined. What sorts of plug-ins are there, you wonder? Well, there are a few different types available. The first and main type are video source plug-ins - those that generate the video to be processed. The video may come from a USB camera (or a laptop's integrated camera), from an IP video surveillance camera, from a video file or from any other source providing continuous images (video frames). On top of those, there are different types of plug-ins aimed at image and video processing. This can be some image enhancement, adding visual effects, detection of certain objects, saving a video archive, etc. To make things more interesting and allow more advanced video processing, there is a scripting plug-in, which allows writing video processing scripts in the Lua programming language. Finally, there are also some device plug-ins, which allow talking to I/O boards, robotics controllers, devices connected over a serial port, etc. - this allows developing more interactive applications, where video processing can be affected by real world events and the other way around.
The code base of the project has grown to the point where describing all of its details in a single article would be quite a big job. So, this article will concentrate only on a few key concepts and features to demonstrate some of the possible use cases. To give a bit of an idea/direction, here is a screenshot demonstrating some of the applications. These are more on the computer vision side of things, but other areas will be described further as well.
Project Features
When the Computer Vision Sandbox project emerged, the idea was to make a sort of Lego building set. It was not aimed at video surveillance only, nor at adding different imaging/video effects, nor at purely computer vision applications, nor at robotics/automation tasks, etc. Instead, it was aimed at everything mentioned above and more. The idea was to implement every feature as a plug-in and then let users combine those to get whatever result they want. The main application does not know anything about any specific camera (video source) or image/video processing routine. It only knows about certain types of plug-ins: some plug-ins provide video, others provide image processing, some others may provide scripting capabilities, etc. But what is actually provided, and how, depends purely on the plug-ins. All the main application needs to know is how to talk to those plug-ins.
In subsequent sections, we are going to review the main types of plug-ins and how to use those from the Computer Vision Sandbox application.
Video Sources
Video source plug-ins are the foundation of the Computer Vision Sandbox. You may have any number of other plug-ins, but if there are no video sources - nothing else can be done. What is a video source? Anything that generates images continuously. It can be a camera attached over a USB or IP interface, a video file, a screen capture device, a collection of image files in a folder, etc. The main application does not care where images come from, as long as they come.
When adding a video source, it is required to select a plug-in to use for communication with a particular device and then configure its properties. The list of properties is specific to the chosen plug-in type - it can be the IP address of a camera, the URL of an MJPEG stream, the name/id of a USB connected camera, etc. Once configuration is complete, the video source can be opened and watched.
After configuring a number of video sources and making sure all of them work as expected, we may create a sandbox to show up to 16 cameras in one view. Any type of video source can be put into a sandbox, so cameras of different brands and makes can be opened together.
Camera views don't need to have a regular grid structure. If we want to make one of the cells larger than the others, we can merge a few of them into a bigger one. This allows creating views of various shapes and assigning bigger cells to cameras that may have something more interesting/important to show.
Finally, a sandbox may have multiple views defined, which can then be switched either manually or at certain time intervals. For example, we may have a default view showing all of the sandbox's video sources and then some other views showing particular video sources at a larger size.
For every running video source (individually or within a sandbox), it is possible to take a snapshot, which can then be exported into an image file or copied to the clipboard.
Image/Video Processing
Watching different types of cameras is nice, but it would be even better to do something with their video. For example, do some image enhancement/processing, add different effects, implement some computer vision applications, save video into files, etc. This is where image and video processing plug-ins come into play.
To add video processing steps for a video source, it is required to put it into a sandbox in the same way as when combining several video sources into a view. So, a sandbox represents a sort of container, which may run multiple video sources, execute different image/video processing routines, run scripts (more on this later), etc.
When the initial sandbox configuration is done, video processing steps can be added to its video sources (cameras) by running the Sandbox Wizard, which provides a list of available plug-ins that can be added as video processing steps. For example, the screenshot below shows 5 image processing plug-ins added as video processing steps for a selected camera. Together, those processing steps create an old-style photo effect - the picture is first turned into sepia colors (gradients of brownish), then vertical grain noise is added, then a vignetting effect makes the picture darker at its edges, and finally two borders are added: a fuzzy border and a rounded border.
Running a sandbox with a configuration similar to the above may result in the picture below (provided the plug-ins are configured appropriately):
In some cases, it may be useful to check the performance of the configured video processing graph and/or change properties of some of its steps. This can be done from the Video processing information form, available from a running camera's context menu. This form shows the average time taken by each video processing step and its percentage of the total time taken by the graph. This information may help in troubleshooting the configured graph, finding which steps take most of the CPU time and potentially cause video processing delays (we usually don't want the graph's total time to be greater than the video source's frame interval). The same form also allows changing properties of image processing plug-ins (if they have any) and seeing the effect on the running video.
In addition to image processing plug-ins, there are also video processing plug-ins, which can be put into a video processing graph. The difference between these two types of plug-ins may sound a bit subtle though. Image processing plug-ins are usually aimed at taking an input image and applying some image processing routine, which changes the image (or provides a new image as a result). These plug-ins usually don't have state other than their configured properties, and most of the time they apply the same routine to all images. Video processing plug-ins, on the contrary, may have some internal state, which may affect the way an image is processed. Also, these plug-ins may or may not make any changes to the source images - it depends on what the plug-in implements.
A good example of a video processing plug-in is the Video File Writer plug-in, which writes incoming images into a video file. This plug-in can either write all images into a single file or it can be configured to split video into fragments of a certain length (in minutes). In addition to that, it can monitor the size of the destination folder and clean up old files, which allows creating video archives. For example, the configuration below tells the video writing plug-in to write images into files prefixed with "street view" and suffixed with a time stamp. Each video file should be 10 minutes long, and the size of the destination folder should not exceed 10000 MB (~10 GB). Running a sandbox with the video writing plug-in configured that way will ensure that we always have 10 GB worth of video archive for our camera.
Another video processing plug-in to mention is the Image Folder Writer plug-in. Unlike the video writing plug-in mentioned above, this one saves individual images as JPEG or PNG files into the specified folder at the configured time intervals. This can be used to create time-lapse images. For example, the configuration below will make the plug-in write an image every 5 seconds (5000 ms) and ignore/skip all other images.
Virtual Video Sources
Although we have already described some video source plug-ins, it is worth mentioning their sub-category aimed at virtual video sources. The sub-category does not introduce a new type of plug-in. We still deal with video sources, which provide new images using the same interface as any other plug-in of this type. The sub-category is more for grouping similar plug-ins in the UI, etc. The idea was to have some way of separating plug-ins that deal with video generated by actual cameras/devices from "virtual" video sources like files, images, etc.
The first plug-in to mention in this category is the Image Folder Video Source. As was already mentioned above, using the Image Folder Writer plug-in we can save images coming from some video source at certain time intervals. As a result, we get a folder full of time-stamped images. Now, suppose we want to play them back, but with a different interval. For example, we saved a collection of time lapse images at 10-second intervals, but then we want to play them at a 30 frames per second rate (a 300x speed-up). This is what the Image Folder Video Source does. When configuring this plug-in, it is required to specify the source folder containing the image files and the desired time interval between frames. The plug-in will then read image files out of that folder and provide them as if they were video frames coming from some camera. Adding the Video File Writer as a video processing step for this video source will allow stitching all those images into a single video file. This gives us a proper time lapse video file!
Another virtual video source plug-in worth mentioning is the Video Repeater - a really useful plug-in if used the right way. All the plug-in does is simply repeat/retranslate images pushed into it. Using it together with the Video Repeater Push plug-in allows splitting a video processing chain into multiple branches, which may apply different video processing steps to the original video frames. Let's see how we can use this plug-in.
First, we need to configure a few video sources using the Video Repeater plug-in. When configuring those, it is required to specify a Repeater ID - something unique, which will be used later by the Video Repeater Push plug-in to link with the video source. Opening these video sources on their own will not produce anything but a "Waiting for video source ..." message. That is fine, as nothing is pushing images into them yet. The next step is to configure a sandbox, which has a video source from some camera and a few repeaters as well - three, for example.
Now, using the Sandbox Wizard, let's put three Video Repeater Push plug-ins as video processing steps for the camera video source, each configured with a different ID used previously when configuring the Video Repeater plug-ins.
Opening a sandbox configured that way will show four video sources displaying exactly the same video - one coming from a camera and the other three retranslating it. However, since Computer Vision Sandbox treats them as individual video sources, we can put any video processing we like on each of the four. For example, the screenshot below demonstrates a running sandbox, showing the original camera at the top left and then three repeaters, which apply different video processing steps to get different effects.
The above example demonstrates how to implement video processing branching using the Video Repeater plug-in. But it is not the only use case for it. Suppose we've configured a lengthy video processing chain for some video source, consisting of several steps. If we put Video Repeater Push plug-ins in between those steps and put a few video repeaters into the sandbox, we will be able to see intermediate results of the performed video processing. In this use case, we use video repeaters only for display, instead of running video processing on top of them. This becomes especially useful for debugging more complicated video processing done with the help of scripting.
Finally, video repeaters can be used to address some performance issues when dealing with heavy video processing chains. Suppose we have a video source providing images at a 30 frames per second rate. Also suppose we add a number of video processing steps, which all together take more time than the interval between incoming frames (~33 ms). This may not be the configuration we want, as the resulting frame rate will drop due to the time-consuming video processing. The way around it could be to do only half of the video processing on the original video, then push whatever we have so far to a video repeater and put the rest of the video processing chain there; it will run on a different thread and so will not keep the original video source blocked. And so the performance issue is sorted, giving us the frame rate we want! Just remember to use the previously mentioned Video processing information form to look for potential video processing bottlenecks.
The last virtual video source to mention is the Screen Capture plug-in. As the name implies, the plug-in captures screen content at a certain rate and provides it as images coming from a video source. It can be configured to capture a specific screen available in the system, an area of a certain size, or a window with a certain title ("Paint", for example).
Scripting Plug-In
As was already demonstrated, adding different image/video processing plug-ins into the video processing graph of a video source lets us implement different imaging effects, etc. However, there are certain limits to how much can be done with a sequential, pre-configured video processing graph. It does not let us change the configuration of plug-ins during sandbox run time based on some logic. Also, it does not allow implementing more advanced video processing, where images could be analysed and something done based on that analysis.
To allow more advanced video processing, Computer Vision Sandbox provides a Lua Scripting plug-in, which makes it possible to implement custom logic using the Lua programming language. The scripting plug-in can be added into a video processing graph in the same way as other plug-ins and then configured with the script to run.
The project's web site provides complete documentation on the APIs provided by the Lua Scripting plug-in, as well as a number of tutorials covering different use cases. Here, we'll briefly demonstrate a few scripting examples to give some idea of what can be done.
To start, let's have a look at the simplest script, which uses the Colorize image processing plug-in to change the hue and saturation of an image's pixels. If that plug-in was added directly into a video processing graph, the user would need to configure its properties manually, and they would not change while the sandbox is running unless the user came back and reconfigured them. Using scripting, however, we can change plug-in properties based on whatever logic we wish. Running the script below as a video processing step will keep changing the hue value of the camera's images.
setHuePlugin = Host.CreatePluginInstance( 'Colorize' )
setHuePlugin:SetProperty( 'saturation', 100 )
hue = 0
function Main( )
    image = Host.GetImage( )
    setHuePlugin:SetProperty( 'hue', hue )
    setHuePlugin:ProcessImageInPlace( image )
    hue = ( hue + 1 ) % 360
end
As can be seen from the code above, the script has two parts: a global part and the Main() function. The global part performs whatever initialization is needed and is executed only once, when the sandbox is started. The Main() function is then executed for every new frame generated by the video source. Using the Host.GetImage() API, we can get access to the image currently handled by the video processing graph and then apply different image processing routines to it.
A slightly larger script to take a look at uses five different plug-ins to create an old-video effect. It was already demonstrated how to create such an effect by using those plug-ins directly within a video processing graph. But now we may want to make it more dynamic, so that the amount of vignetting and added noise changes between video frames.
local math = require "math"
sepiaPlugin = Host.CreatePluginInstance( 'Sepia' )
vignettingPlugin = Host.CreatePluginInstance( 'Vignetting' )
grainPlugin = Host.CreatePluginInstance( 'Grain' )
noisePlugin = Host.CreatePluginInstance( 'UniformAdditiveNoise' )
borderPlugin = Host.CreatePluginInstance( 'FuzzyBorder' )
vignettingStartFactor = 80
grainSpacing = 40
noiseAmplitude = 20
vignettingPlugin:SetProperty( 'decreaseSaturation', false )
vignettingPlugin:SetProperty( 'startFactor', vignettingStartFactor )
vignettingPlugin:SetProperty( 'endFactor', 150 )
grainPlugin:SetProperty( 'staticSeed', true )
grainPlugin:SetProperty( 'density', 0.5 )
borderPlugin:SetProperty( 'borderColor', '000000' )
borderPlugin:SetProperty( 'borderWidth', 32 )
borderPlugin:SetProperty( 'waviness', 8 )
borderPlugin:SetProperty( 'gradientWidth', 16 )
seed = 0
counter = 0
function Main( )
    RandomizeIt( )
    image = Host.GetImage( )
    sepiaPlugin:ProcessImageInPlace( image )
    vignettingPlugin:ProcessImageInPlace( image )
    grainPlugin:ProcessImageInPlace( image )
    noisePlugin:ProcessImageInPlace( image )
    borderPlugin:ProcessImageInPlace( image )
end
function CheckRange( value, min, max )
    if value < min then value = min end
    if value > max then value = max end
    return value
end
function RandomizeIt( )
    vignettingStartFactor = CheckRange( vignettingStartFactor +
                                        math.random( 3 ) - 2, 60, 100 )
    vignettingPlugin:SetProperty( 'startFactor', vignettingStartFactor )
    noiseAmplitude = CheckRange( noiseAmplitude +
                                 math.random( 5 ) - 3, 10, 30 )
    noisePlugin:SetProperty( 'amplitude', noiseAmplitude )
    counter = ( counter + 1 ) % 5
    if counter == 0 then
        seed = seed + 1
        grainPlugin:SetProperty( 'seedValue', seed )
        grainSpacing = CheckRange( grainSpacing + math.random( 5 ) - 3, 30, 50 )
    end
end
OK, enough with imaging effects. Let's try something different instead. For example, let's try making a simple motion detector. The script below uses the Diff Images Thresholded plug-in to find the number of pixels that differ by a certain amount in two consecutive images. If the difference is higher than a certain threshold, it is treated as motion, which is indicated by highlighting the changed areas and adding a red rectangle around the image. A logical extension of the script would be to start writing a video file only when motion is detected, instead of saving everything into the video archive as was demonstrated before (a small sketch of this idea follows the example below).
diffImages = Host.CreatePluginInstance( 'DiffImagesThresholded' )
addImages = Host.CreatePluginInstance( 'AddImages' )
imageDrawing = Host.CreatePluginInstance( 'ImageDrawing' )
diffImages:SetProperty( 'threshold', 60 )
diffImages:SetProperty( 'hiColor', 'FF0000' )
addImages:SetProperty( 'factor', 0.3 )
motionThreshold = 0.1
highlightMotion = true
function Main( )
    image = Host.GetImage( )
    if oldImage ~= nil then
        diff = diffImages:ProcessImage( image, oldImage )
        oldImage:Release( )
        oldImage = image:Clone( )
        diffPixels = diffImages:GetProperty( 'diffPixels' )
        diffPercent = diffPixels * 100 / ( image:Width( ) * image:Height( ) )
        imageDrawing:CallFunction( 'DrawText', image, tostring( diffPercent ),
                                   { 1, 1 }, 'FFFFFF', '00000000' )
        if diffPercent > motionThreshold then
            imageDrawing:CallFunction( 'DrawRectangle', image,
                                       { 0, 0 }, { image:Width( ) - 1, image:Height( ) - 1 }, 'FF0000' )
            if highlightMotion then
                addImages:ProcessImageInPlace( image, diff )
            end
        end
        diff:Release( )
    else
        oldImage = image:Clone( )
    end
end
And here is an example of how it may look when motion is detected.
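As a rough illustration of the extension mentioned above, the fragment below saves a time-stamped snapshot whenever motion is detected, reusing the PngExporter image exporting plug-in (also used in the time-lapse example later in this article). This is only a simplified sketch standing in for actual video recording; the destination folder and the SaveMotionSnapshot helper are assumptions made for illustration.

local os = require 'os'

-- assumed destination folder for motion snapshots
snapshotFolder = 'C:\\Temp\\motion\\'
snapshotWriter = Host.CreatePluginInstance( 'PngExporter' )

-- save the current frame with a time-stamped name; meant to be called from
-- the motion detection script's Main( ) when diffPercent > motionThreshold
function SaveMotionSnapshot( image )
    -- file names have one second resolution, so frames detected within the
    -- same second overwrite each other - acceptable for a sketch
    fileName = snapshotFolder .. os.date( '%Y-%m-%d %H-%M-%S' ) .. '.png'
    snapshotWriter:ExportImage( fileName, image )
end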
Another interesting example to show is a script that looks for round objects. It uses a plug-in that finds individual blobs (objects) in an image and checks if they have a circular shape. Before doing blob processing, however, we need to do segmentation - separate background from foreground. In this case, the script uses a simple thresholding technique. This puts a restriction that our image must have a dark, even background and brighter objects, which is fine for this example.
local math = require "math"
local string = require "string"
grayscalePlugin = Host.CreatePluginInstance( 'Grayscale' )
thresholdPlugin = Host.CreatePluginInstance( 'Threshold' )
circlesFilterPlugin = Host.CreatePluginInstance( 'FilterCircleBlobs' )
drawingPlugin = Host.CreatePluginInstance( 'ImageDrawing' )
thresholdPlugin:SetProperty( 'threshold', 64 )
circlesFilterPlugin:SetProperty( 'filterImage', false )
circlesFilterPlugin:SetProperty( 'minRadius', 5 )
drawingColor = '00FF00'
function Main( )
    image = Host.GetImage( )
    grayImage = grayscalePlugin:ProcessImage( image )
    thresholdPlugin:ProcessImageInPlace( grayImage )
    circlesFilterPlugin:ProcessImageInPlace( grayImage )
    circlesFound = circlesFilterPlugin:GetProperty( 'circlesFound' )
    circlesCenters = circlesFilterPlugin:GetProperty( 'circlesCenters' )
    circlesRadiuses = circlesFilterPlugin:GetProperty( 'circlesRadiuses' )
    drawingPlugin:CallFunction( 'DrawText', image, 'Circles: ' .. tostring( circlesFound ),
                                { 5, 5 }, drawingColor, '00000000' )
    for i = 1, circlesFound do
        center = { math.floor( circlesCenters[i][1] ), math.floor( circlesCenters[i][2] ) }
        radius = math.floor( circlesRadiuses[i] )
        dist = math.floor( math.sqrt( radius * radius / 2 ) )
        lineStart = { center[1] + radius, center[2] - radius }
        lineEnd = { center[1] + dist, center[2] - dist }
        drawingPlugin:CallFunction( 'FillRing', image, center, radius + 2, radius, drawingColor )
        drawingPlugin:CallFunction( 'DrawLine', image, lineStart, lineEnd, drawingColor )
        drawingPlugin:CallFunction( 'DrawLine', image, lineStart,
                                    { lineStart[1] + 20, lineStart[2] }, drawingColor )
        drawingPlugin:CallFunction( 'DrawText', image, tostring( radius ),
                                    { lineStart[1] + 2, lineStart[2] - 12 },
                                    drawingColor, '00000000' )
    end
    grayImage:Release( )
end
The final scripting example shows how to use image exporting plug-ins and implement time lapse image writing as a Lua script. Yes, it was already demonstrated how to do that without the need for scripting - we have a dedicated plug-in for this, which can be put directly into a video processing graph. However, in case custom image saving logic is needed, scripting can still be of use.
local os = require "os"
folder = 'C:\\Temp\\images\\'
imageWriter = Host.CreatePluginInstance( 'PngExporter' )
ext = '.' .. imageWriter:SupportedExtensions( )[1]
imageInterval = 10
lastClock = -imageInterval
function Main( )
    image = Host.GetImage( )
    now = os.clock( )
    if now - lastClock >= imageInterval then
        lastClock = now
        SaveImage( image )
    end
end
function SaveImage( image )
    dateTime = os.date( '%Y-%m-%d %H-%M-%S' )
    fileName = folder .. dateTime .. ext
    imageWriter:ExportImage( fileName, image )
end
There are more scripting examples available; they are included in the official installation package of Computer Vision Sandbox and can also be found on the project's web page. Together with the Lua scripting API description and different tutorials, they provide in-depth coverage of the available features.
Device Plug-Ins
Once the scripting plug-in, which allows implementing more advanced video processing, was introduced, the next step forward was to implement support for interaction with different devices. The idea was to let scripts interact with the real world - change a video processing routine based on some device's inputs, or set a device's outputs/actuators based on the results of an image processing algorithm. As a result, two new plug-in types were added - device plug-ins and communication device plug-ins. These allow adding support for communication with external devices, like different I/O boards, robotics controllers, devices attached to a serial port, etc. As this is one of the more recently added features, there are not many plug-ins of these types available so far. More will be added as the project evolves.
Although both new plug-in types are aimed at communication with external devices, they provide slightly different APIs, which allow device interaction in different ways. Device plug-ins hide all communication details/protocols and allow talking to devices by means of setting/getting the plug-in's properties. For example, if we have some digital I/O board, setting its outputs can be implemented by setting some properties of the plug-in, and querying the state of its inputs can be implemented by reading properties. In some cases, such an interface may be too limited though and more flexibility is needed. Communication device plug-ins extend the API and provide Read/Write methods, which allow sending raw data to a device using whatever protocol it supports. Let's have a look at a few examples of interaction with some devices.
The first plug-in to demonstrate is the Gamepad device plug-in, which can be used in a number of applications. For example, if a camera is mounted on a pan/tilt device, it could be controlled with the help of a gamepad. Or it can be used to control some robot, a video processing sequence, etc.
local math = require 'math'
gamepad = Host.CreatePluginInstance( 'Gamepad' )
gamepad:SetProperty( 'deviceId', 0 )
if not gamepad:Connect( ) then
    error( 'Failed connecting to game pad' )
end
deviceName = gamepad:GetProperty( 'deviceName' )
axesCount = gamepad:GetProperty( 'axesCount' )
buttonsCount = gamepad:GetProperty( 'buttonsCount' )
function Main( )
    axesValues = gamepad:GetProperty( 'axesValues' )
    buttonsState = gamepad:GetProperty( 'buttonsState' )
    print( 'X: ' .. tostring( math.floor( axesValues[1] * 100 ) / 100 ) )
    print( 'Y: ' .. tostring( math.floor( axesValues[2] * 100 ) / 100 ) )
    x = gamepad:GetProperty( 'axesValues', 1 )
    buttonState1 = gamepad:GetProperty( 'buttonsState', 1 )
    if buttonState1 then
        print( "Button 1 is ON" )
    else
        print( "Button 1 is OFF" )
    end
end
To craft your own pan/tilt device, a Phidget Advanced Servo board can be used. A plug-in for it is not included in the official installation package, but can be obtained separately from GitHub. Once added into Computer Vision Sandbox, it can be used either on its own to control servos or together with the above mentioned Gamepad device plug-in.
servos = Host.CreatePluginInstance( 'PhidgetAdvancedServo' )
if not servos:Connect( ) then
    error( 'Failed connecting to servo board' )
end
motorCount = servos:GetProperty( 'motorCount' )
servos:SetProperty( 'velocityLimit', { 2, 2 } )
servos:SetProperty( 'acceleration', { 20, 20 } )
servos:SetProperty( 'positionRange', { { 105, 115 }, { 135, 145 } } )
servos:SetProperty( 'engaged', { true, true } )
servos:SetProperty( 'targetPosition', { 110, 140 } )
function Main( )
    actualPosition = servos:GetProperty( 'actualPosition' )
    stopped = servos:GetProperty( 'stopped' )
    servos:SetProperty( 'targetPosition', 1, 115 )
    servos:SetProperty( 'targetPosition', 2, 135 )
end
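As a rough sketch of the pan/tilt idea mentioned earlier, the two plug-ins can be combined in a single script: gamepad axes are mapped onto the servos' target positions. The mapping below assumes axis values in the -1 to 1 range and reuses the position ranges configured in the example above (105-115 and 135-145); both are assumptions made for illustration rather than values prescribed by the plug-ins.

gamepad = Host.CreatePluginInstance( 'Gamepad' )
servos  = Host.CreatePluginInstance( 'PhidgetAdvancedServo' )

gamepad:SetProperty( 'deviceId', 0 )
if not gamepad:Connect( ) then
    error( 'Failed connecting to game pad' )
end
if not servos:Connect( ) then
    error( 'Failed connecting to servo board' )
end

servos:SetProperty( 'positionRange', { { 105, 115 }, { 135, 145 } } )
servos:SetProperty( 'engaged', { true, true } )

function Main( )
    axesValues = gamepad:GetProperty( 'axesValues' )
    -- map the assumed [-1, 1] axis range onto the configured servo ranges
    pan  = 110 + axesValues[1] * 5    -- pan servo:  105 .. 115
    tilt = 140 + axesValues[2] * 5    -- tilt servo: 135 .. 145
    servos:SetProperty( 'targetPosition', { pan, tilt } )
end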
Another supported device from the same manufacturer is the Phidget Interface Kit, which allows interacting with digital inputs/outputs and with analog inputs. For example, it is possible to control a video processing routine depending on the state of the inputs, or to control devices connected to the digital outputs depending on what is detected in the video stream.
kit = Host.CreatePluginInstance( 'PhidgetInterfaceKit' )
if not kit:Connect( ) then
    error( 'Failed connecting to interface kit board' )
end
digitalInputCount = kit:GetProperty( 'digitalInputCount' )
digitalOutputCount = kit:GetProperty( 'digitalOutputCount' )
analogInputCount = kit:GetProperty( 'analogInputCount' )
kit:SetProperty( 'digitalOutputs', { false, false, false, false,
false, false, false, false } )
function Main( )
    kit:SetProperty( 'digitalOutputs', { true, true } )
    kit:SetProperty( 'digitalOutputs', 7, true )
    analogInputs = kit:GetProperty( 'analogInputs' )
    digitalInputs = kit:GetProperty( 'digitalInputs' )
    for i = 1, #analogInputs do
        print( 'Analog input', i, 'is', analogInputs[i] )
    end
    for i = 1, #digitalInputs do
        print( 'Digital input', i, 'is', digitalInputs[i] )
    end
end
Now, suppose we have something connected over a serial port that implements some specific communication protocol. For example, it can be an Arduino board running some sketch, which allows controlling some of its electronics by sending commands over the serial interface. For this, we can use the Serial Port communication device plug-in and implement the supported protocol using the Read/Write API. For example, the script below demonstrates communication with an Arduino device to switch an LED on/off and query a push button's state (it is assumed the Arduino board is running the sample sketch from here).
local string = require 'string'
serialPort = Host.CreatePluginInstance( 'SerialPort' )
serialPort:SetProperty( 'portName', 'COM8' )
serialPort:SetProperty( 'blockingInput', true )
serialPort:SetProperty( 'ioTimeoutConstant', 50 )
serialPort:SetProperty( 'ioTimeoutMultiplier', 0 )
function Main()
    if serialPort:Connect( ) then
        print( 'Connected' )
        print( 'IsConnected: ' .. tostring( serialPort:IsConnected( ) ) )
        sleep( 1500 )
        sent, status = serialPort:WriteString( 'led_on\n' )
        print( 'status: ' .. tostring( status ) )
        print( 'sent : ' .. tostring( sent ) )
        strRead, status = serialPort:ReadString( 10 )
        print( 'status : ' .. tostring( status ) )
        print( 'str read: ' .. strRead )
        sent, status = serialPort:Write( { 0x6C, 0x65, 0x64, 0x5F, 0x6F, 0x66, 0x66, 0x0A } )
        print( 'status: ' .. tostring( status ) )
        print( 'sent : ' .. tostring( sent ) )
        readBuffer, status = serialPort:Read( 10 )
        print( 'status : ' .. tostring( status ) )
        print( 'bytes read: ' )
        for i=1, #readBuffer do
            print( '[', i, ']=', readBuffer[i] )
        end
        sent, status = serialPort:WriteString( 'btn_state\n' )
        print( 'status: ' .. tostring( status ) )
        print( 'sent : ' .. tostring( sent ) )
        strRead, status = serialPort:ReadString( 10 )
        print( 'status : ' .. tostring( status ) )
        print( 'str read: ' .. strRead )
        if string.sub( strRead, 1, 1 ) == '1' then
            print( 'button is ON' )
        else
            print( 'button is OFF' )
        end
        print( 'Testing timeout' )
        strRead, status = serialPort:ReadString( 10 )
        print( 'status : ' .. tostring( status ) )
        print( 'str read: ' .. strRead )
        serialPort:Disconnect( )
    end
end
As we can see, adding support for device plug-ins expands the range of applications for the Computer Vision Sandbox. Indeed, there are many interesting ways of combining video processing and computer vision with different available devices.
Sandbox Scripting Threads
As demonstrated in the previous section, device plug-ins allow talking to a variety of devices, making it possible to create more interactive applications. Interaction with devices can be done from the same scripts as those used to perform video processing. In many cases, however, it is preferable to put communication with devices into separate scripts instead of doing it from video processing scripts. There are a number of reasons for that. First, there may be no relation to the performed video processing at all. For example, a pan/tilt device may move the camera at certain time intervals, which don't depend on the results of video processing algorithms. Or a robot's movement can be controlled based on inputs from another device. Second, very often it is preferable to complete video processing as soon as possible, so that the video source does not get blocked. Communication with some devices, however, may involve certain delays caused by connection speed, protocols in use, etc. Another reason could be a requirement to interact with certain devices at time intervals that are not based on the video source's frame rate, i.e., to have more frequent interactions with some devices and less frequent with others.
To address the need to run some scripts independently of video processing, Computer Vision Sandbox has the concept of sandbox scripting threads. The Sandbox Wizard allows not only configuring which video processing steps to run for each camera within a sandbox, but also creating additional threads, which run specified scripts at set time intervals. For example, the screenshot below demonstrates a possible set-up for controlling the PiRex robot. The first thread runs a script for controlling the robot's motors based on the gamepad's input. To make the robot responsive enough, the thread runs the control script at 10 millisecond intervals. The second thread runs a different script, which queries distance measurements provided by the robot's ultrasonic sensor. As it is mostly informational, we chose to run it 10 times a second, i.e., at 100 millisecond intervals.
The scripts running within sandbox threads have a very similar structure to those used to perform video processing on camera images. They have a global section and a Main() function. The global section is executed once, when the sandbox gets started (before starting any video sources), and the Main() function is then executed again and again at the configured time intervals.
Let's have a look at a potential implementation of the scripts used for the sandbox threads shown above. The first script controls the robot - it changes motor power based on the gamepad's input. It has nothing to do with the video coming from the robot's camera, and we want to run it at a higher rate than the camera's FPS, so it looks like a perfect candidate to run on its own in a sandbox thread. All it does is read the values of the gamepad's axes, convert those into motor power values and send them to the robot so it performs the desired movement.
local math = require 'math'
gamepadPlugin = Host.CreatePluginInstance( 'Gamepad' )
pirexPlugin = Host.CreatePluginInstance( 'PiRexBot' )
prevLeftPower = 1000
prevRightPower = 1000
gamepadPlugin:SetProperty( 'deviceId', 0 )
gamepadPlugin:Connect( )
pirexPlugin:SetProperty( 'address', '192.168.0.12' )
pirexPlugin:Connect( )
function Main( )
    axesValues = gamepadPlugin:GetProperty( 'axesValues' )
    leftPower = 0 - math.floor( axesValues[2] * 100 )
    rightPower = 0 - math.floor( axesValues[3] * 100 )
    if leftPower ~= prevLeftPower then
        pirexPlugin:SetProperty( 'leftMotor', leftPower )
    end
    if rightPower ~= prevRightPower then
        pirexPlugin:SetProperty( 'rightMotor', rightPower )
    end
    prevLeftPower = leftPower
    prevRightPower = rightPower
end
The second script runs at 100 millisecond intervals and is used to read distance measurements provided by the robot's ultrasonic sensor. There is not much we'll do with the measurement other than display it to the user directly on the video coming from the robot's camera. This requires some image processing (drawing) to display the distance to obstacles, which means we could put the code for reading the sensor into the script doing the camera's video processing. However, as mentioned before, sensor reading may cause certain delays, and we don't really want to introduce those into video processing. So, we'll separate sensor reading and measurement displaying into two scripts, which communicate by using host variables.
local string = require 'string'
pirexPlugin = Host.CreatePluginInstance( 'PiRexBot' )
pirexPlugin:SetProperty( 'address', '192.168.0.12' )
pirexPlugin:Connect( )
function Main( )
    distance = pirexPlugin:GetProperty( 'obstacleDistance' )
    Host.SetVariable( 'obstacleDistance', string.format( '%.2f', distance ) )
end
As we can see from the above, the script only reads distance measurements and puts them into a host variable - nothing more. Obviously, this will not display anything to the user, but this is where the video processing script comes into play. Among other things we may want to do with images coming from the robot's camera, we can also output the distance measurement, which can be retrieved from the host variable.
drawing = Host.CreatePluginInstance( 'ImageDrawing' )
function Main( )
    image = Host.GetImage( )
    distance = Host.GetVariable( 'obstacleDistance' )
    drawing:CallFunction( 'DrawText', image, 'Distance : ' .. distance,
                          { 10, 10 }, '00FF00', '00000000' )
end
The above use case demonstrates sandbox threads and how they can be used to perform certain actions at configured time intervals. All scripts (threading or video processing) running within a sandbox can communicate by setting/reading host variables. This allows different scenarios. A video processing routine can be driven by reading some sensors. However, the opposite can be done as well, i.e., a video processing script may set some variables based on the results of the applied algorithms, and then a threading script can read those variables and drive some device's actuators.
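As a minimal sketch of that second scenario, assuming the motion detection script shown earlier and the Phidget Interface Kit plug-in demonstrated above, a video processing script could publish a motion flag through a host variable, and a sandbox thread script could then use it to drive one of the board's digital outputs. The variable name 'motionDetected' is an assumption made for this illustration.

-- in the video processing script, after diffPercent has been calculated:
Host.SetVariable( 'motionDetected', ( diffPercent > motionThreshold ) and '1' or '0' )

-- sandbox thread script: switch the first digital output based on the flag
kit = Host.CreatePluginInstance( 'PhidgetInterfaceKit' )
if not kit:Connect( ) then
    error( 'Failed connecting to interface kit board' )
end

function Main( )
    -- if the variable has not been set yet, the output simply stays off
    motion = Host.GetVariable( 'motionDetected' )
    kit:SetProperty( 'digitalOutputs', 1, ( motion == '1' ) )
end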
To demonstrate all of the above in action, with some video processing on top, here is a short video of the PiRex robot being controlled with a gamepad to hunt for hidden glyphs.
Project's Code
The entire project's source code is available in its GitHub repository. The code is primarily developed in C/C++ to get the most out of the available resources and provide reasonable performance. It was also developed with portability in mind, so that eventually it could be built for platforms other than just Windows. In the early stages, this was really the case, with tests running on both Windows and Linux, but then more effort was put into getting something out and running. So, for now only a Windows installation package is provided, with support for other platforms potentially coming in future releases.
The Computer Vision Sandbox project uses a number of open source components to do image/video decoding/encoding, provide scripting capabilities, a built-in editor, etc. In addition to those, it also uses the Qt Framework to provide a cross-platform user interface.
Building the project is done in two stages. The first stage is to build all external components; this usually needs to be done only once to get the required libraries and binaries of all dependencies. Then the project's code itself can be built. It is possible either to build everything by running a single script or to build individual components as needed, which is the common case when developing new features. Two tool chains are currently supported by the project - Visual Studio 2015 (Community Edition will work fine) and MinGW. VS is mostly used for development/debugging, while all official releases so far have been done with MinGW.
The source code of the project has grown quite substantially over the last few years, so describing its details in a single article is no longer feasible. The foundation is provided by the "afx" libraries, which provide common types and functions, including image processing algorithms and access to some video sources. Then a set of core libraries defines interfaces for plug-ins, their management, scripting and the backbone for running video processing sandboxes. A good collection of plug-ins implements those interfaces, providing a variety of video sources, image/video processing routines, image importing/exporting, devices, etc. Finally, some applications are provided. The main one is the Computer Vision Sandbox itself, which has been described in this article. Another useful one is the Computer Vision Sandbox Scripts Runner (cvssr), a command line tool to run simple scripts for image processing, interaction with devices, etc. The provided collection of plug-ins can potentially be re-used in other applications as well, as the project provides a C++ library for their loading and management.
Conclusion
Well, this is it for now about the Computer Vision Sandbox project. Although the article does not provide a detailed description of every single feature implemented in the project, it does provide a good review of the key features and how to use them in different applications. The project's web site provides additional tutorials describing the rest of the features in detail and giving more examples of how to use them.
As stated in the beginning, the idea was to build a project that allows implementing different applications from various areas of computer vision. It was made very modular, so that individual features are delivered as plug-ins. Depending on the types of plug-ins and the way they are combined, very different results can be achieved. And if a new camera, image processing routine, device, etc. needs to be supported - just add a new plug-in; there is no need to go deep into the main application's code.
Doing different computer vision related projects in the past, I usually ended up making a new application for each new project. Now, however, I try to do it just as a script, and if I find something missing, I develop a new plug-in. No more separate applications for different things - one is enough.
Having been available for a number of years (although not in open source form), the project has been used successfully to implement different applications. Some of them were hobby projects, but some are process automation applications used in labs/production. Hopefully, the Computer Vision Sandbox project will continue to evolve and more and more interesting applications will be developed based on it.