Here we provide and explain the Python code for a simple motion detector developed with OpenCV. Then we embed the trained DNN into the motion detector to detect moose in video.
Introduction
Unruly wildlife can be a pain for businesses and homeowners alike. Animals like deer, moose, and even cats can cause damage to gardens, crops, and property.
In this article series, we’ll demonstrate how to detect pests (such as a moose) in real time (or near-real time) on a Raspberry Pi and then take action to get rid of the pest. Since we don’t want to cause any harm, we’ll focus on scaring the pest away by playing a loud noise.
You are welcome to download the source code of the project. We are assuming that you are familiar with Python and have a basic understanding of how neural networks work.
In the previous article, we developed and trained a simple classifier DNN model to predict moose appearance in video frames. The model tested at 97% accuracy, which seems good enough for our purposes. In this article, we'll explain how to develop a motion detector that localizes segments of interest in the video stream, and how to combine this detector with the classifier to catch moose.
Detecting Motion
Creating a basic motion detector using the OpenCV library is not very complex. In most cases, we can use background subtractors to implement motion detection. A background subtraction algorithm sorts all pixels in a frame into two subsets: background (static scene pixels) and foreground (pixels that dynamically change when an object appears in the frame). Here is what a simple motion detector looks like in code:
import cv2

class MD:
    def __init__(self, min_w, min_h):
        # Frames are downscaled to this size before processing
        self.proc_width = 320
        self.proc_height = 240
        # Minimum object size, in relative (0..1) coordinates
        self.min_width = min_w
        self.min_height = min_h
        self.motion_objects = []
        self.subtractor = cv2.createBackgroundSubtractorMOG2(history=300)
        self.subtractor.setBackgroundRatio(0.005)
        self.fg_mask = None
        self.frame_count = 0

    def process(self, frame):
        p_frame = cv2.resize(frame, (self.proc_width, self.proc_height),
                             interpolation=cv2.INTER_AREA)
        bg_rate = 0.001
        if self.frame_count == 0:
            # Use the very first frame as the initial background model
            bg_rate = 1.0
        self.fg_mask = self.subtractor.apply(p_frame, learningRate=bg_rate)
        if bg_rate >= 1.0:
            self.fg_mask[:] = 0
            self.motion_objects = []
        else:
            self.fg_mask = self.cleanup(self.fg_mask)
            self.motion_objects = self.extract_objects(self.fg_mask, self.min_width, self.min_height)
        self.frame_count = self.frame_count + 1

    def objects(self):
        return self.motion_objects

    def foreground(self):
        return self.fg_mask

    def cleanup(self, mask):
        # Threshold out shadow pixels, then remove salt-and-pepper noise
        (ret, mask) = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)
        mask = cv2.medianBlur(mask, 5)
        (ret, mask) = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)
        return mask

    def extract_objects(self, mask, min_w, min_h):
        # findContours returns (contours, hierarchy) in OpenCV 4.x and
        # (image, contours, hierarchy) in 3.x; [-2:] works with both
        (contours, hierarchy) = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                                 cv2.CHAIN_APPROX_SIMPLE)[-2:]
        objects = []
        (h, w) = mask.shape
        for contour in contours:
            (rx, ry, rw, rh) = cv2.boundingRect(contour)
            # Convert the bounding box to relative coordinates
            rx = rx / w
            ry = ry / h
            rw = rw / w
            rh = rh / h
            if (rw >= min_w) and (rh >= min_h):
                objects.append((rx, ry, rw, rh))
        return objects
On class initialization, we set up the minimum size of the objects to detect: min_w and min_h. These thresholds help us reject false "objects" that can appear due to wind, changing sunlight, and so on.
The initialization routine specifies a processing size of 320 x 240 pixels. All processed frames will be resized to fit these dimensions to reduce the number of pixels to process, and thus increase the processing speed. Here, we also create the MOG2 background subtractor and initialize it with some parameter values. These values might need to be changed depending on the resolution of the video you’re working with.
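For example, if your footage is noisier, you might experiment with the other MOG2 parameters OpenCV exposes. A minimal sketch (the values below are illustrative, not the ones used in this project):

# Illustrative only: alternative MOG2 settings for noisier footage.
subtractor = cv2.createBackgroundSubtractorMOG2(
    history=500,         # keep more frames in the background model
    varThreshold=25,     # higher value -> less sensitive to small changes
    detectShadows=False  # skip shadow marking for a cleaner binary mask
)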
The core method of the class, process, detects moving objects in a frame. It first resizes the frame to the processing size, then gets the foreground mask using the subtractor's apply method, and finally extracts the bounding rectangles of the motion segments (foreground pixels).
Note that for the first processed frame we assign a value of 1.0 to the background update rate parameter, bg_rate, and then change it to 0.001. This low value is chosen specifically for the test video scene: it results in slow background updates, giving the detector enough time to focus on the moving objects.
The utility method cleanup removes noise from the foreground mask. The extract_objects method evaluates the bounding boxes of the moving objects in relative coordinates. This is necessary because the size of the original video frames differs from that of the processing frames.
We need a wrapper class to run our motion detector on a video file:
import time

class VideoMD:
    def __init__(self, md):
        self.md = md

    def play(self, file_path):
        capture = cv2.VideoCapture(file_path)
        fgd_name = 'Foreground'
        cv2.namedWindow(fgd_name, cv2.WINDOW_NORMAL)
        cv2.resizeWindow(fgd_name, 640, 480)
        md_name = 'Motion objects'
        cv2.namedWindow(md_name, cv2.WINDOW_NORMAL)
        cv2.resizeWindow(md_name, 640, 480)
        while True:
            (ret, frame) = capture.read()
            if frame is None:
                break
            self.md.process(frame)
            objects = self.md.objects()
            if len(objects) > 0:
                # Draw the detected motion rectangles on the full frame
                Utils.draw_objects(objects, "OBJECT", (255, 0, 0), frame)
            fgd = self.md.foreground()
            cv2.imshow(fgd_name, fgd)
            cv2.imshow(md_name, frame)
            # Throttle playback to roughly 25 frames per second
            time.sleep(0.040)
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
        capture.release()
        cv2.destroyAllWindows()
With the wrapper in place, we can launch the detector as follows:
video_file = r"C:\PI_PEST\video\moose_1.mp4"
md = MD(0.05, 0.1)
v_md = VideoMD(md)
v_md.play(video_file)
For our video file, we get the following result:
Combining Motion Detector with DNN Model
Next, we need to combine our motion detection algorithm with the trained DNN model. Let's start by creating the PestDetector class:
class PestDetector:
    def __init__(self, proto, model, size):
        self.net = cv2.dnn.readNetFromCaffe(proto, model)
        self.size = size

    def get_blob(self, frame, obj):
        (h, w, c) = frame.shape
        (rx, ry, rw, rh) = obj
        # Convert relative coordinates to pixels
        rx = int(w * rx)
        ry = int(h * ry)
        rw = int(w * rw)
        rh = int(h * rh)
        # Expand the rectangle to a square along its shorter side
        if rh > rw:
            dx = int((rh - rw) / 2)
            rx = rx - dx
            rw = rh
            if rx < 0:
                rx = 0
            if (rx + rw) > w:
                # Shrink the square by the amount it sticks out on the right
                vx = (rx + rw) - w
                rw = rw - vx
                rh = rh - vx
        else:
            if rw > rh:
                dy = int((rw - rh) / 2)
                ry = ry - dy
                rh = rw
            if ry < 0:
                ry = 0
            if (ry + rh) > h:
                # Shrink the square by the amount it sticks out at the bottom
                vy = (ry + rh) - h
                rh = rh - vy
                rw = rw - vy
        img = frame[ry:ry+rh, rx:rx+rw]
        roi = (rx / w, ry / h, rw / w, rh / h)
        resized = cv2.resize(img, (self.size, self.size),
                             interpolation=cv2.INTER_AREA)
        blob = cv2.dnn.blobFromImage(resized, scalefactor=1.0,
                                     size=(self.size, self.size),
                                     mean=(0, 0, 0), swapRB=False, crop=False)
        return (roi, blob)

    def detect(self, frame, obj):
        (roi, blob) = self.get_blob(frame, obj)
        self.net.setInput(blob)
        detection = self.net.forward()
        # The output row holds one confidence per class;
        # pick the class with the highest confidence
        class_num = detection.argmax(axis=1)[0]
        class_conf = detection[0][class_num]
        return (roi, (class_num, class_conf))
This class is similar to the SSD class we discussed in the second article of this series. On initialization, it creates a neural network from the specified DNN model. The detect method receives a frame and an object (a rectangle around a moving object in the frame) and determines the class of the object.
Note how the get_blob method of this class differs from the corresponding method of the SSD class. This method reshapes the rectangular segment into a square to meet the input requirements of the DNN classifier.
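As a quick sanity check, we can feed get_blob a dummy frame and a tall rectangle without loading a model. The DummyPD stand-in below is hypothetical; it only mimics the size attribute that get_blob reads from self, and the numbers are purely illustrative:

import numpy as np

# DummyPD is a hypothetical stand-in so we can call get_blob
# without loading the Caffe model files
class DummyPD:
    size = 128

frame = np.zeros((720, 1280, 3), dtype=np.uint8)
(roi, blob) = PestDetector.get_blob(DummyPD(), frame, (0.50, 0.20, 0.10, 0.40))
print(roi)         # (0.4375, 0.2, 0.225, 0.4): a 288 x 288 pixel square
print(blob.shape)  # (1, 3, 128, 128), ready for net.setInput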
Next, we'll create the VideoPD class, a slightly modified version of VideoMD that runs both motion detection and DNN classification:
class VideoPD:
    def __init__(self, md, pd, thresh):
        self.md = md
        self.pd = pd
        self.thresh = thresh

    def play(self, file_path):
        capture = cv2.VideoCapture(file_path)
        md_name = 'Motion objects'
        cv2.namedWindow(md_name, cv2.WINDOW_NORMAL)
        cv2.resizeWindow(md_name, 640, 480)
        while True:
            (ret, frame) = capture.read()
            if frame is None:
                break
            self.md.process(frame)
            objects = self.md.objects()
            if len(objects) > 0:
                Utils.draw_objects(objects, "OBJECT", (255, 0, 0), frame)
                # Classify every moving object and keep those the DNN
                # recognizes as a pest with high enough confidence
                pests = []
                for obj in objects:
                    (roi, (class_num, class_conf)) = self.pd.detect(frame, obj)
                    if (class_num > 0) and (class_conf >= self.thresh):
                        pests.append(roi)
                if len(pests) > 0:
                    Utils.draw_objects(pests, "PEST", (0, 0, 255), frame)
            cv2.imshow(md_name, frame)
            time.sleep(0.040)
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
        capture.release()
        cv2.destroyAllWindows()
Finally, we can launch the moose detector with the following code:
video_file = r"C:\PI_PEST\video\moose_1.mp4"
md = MD(0.05, 0.1)
proto = r"C:\PI_PEST\net\moose.prototxt"
model = r"C:\PI_PEST\net\moose.caffemodel"
pd = PestDetector(proto, model, 128)
v_pd = VideoPD(md, pd, 0.99)
v_pd.play(video_file)
Here is the resulting video:
As we can see, the solution we developed has managed to detect a moose virtually every time it appeared.
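If you want to experiment before moving to the Raspberry Pi, note that cv2.VideoCapture also accepts a camera index, so the same wrapper can be pointed at a live camera:

# Run the same pipeline on the default camera instead of a video file
# (cv2.VideoCapture treats an integer argument as a device index)
v_pd.play(0)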
Next Steps
In the next article, we’ll test our detection algorithm on a Raspberry Pi 3 device and create the "scare pests away" part of our pest eliminator by playing a loud sound.