This final episode in the series looks at processing live streams of video. We examine labelling techniques that identify objects of interest in our training images.
This involves breaking down our training images into low-resolution grids. Each grid cell holds not pixel information but a vector containing the location, type and size of an object of interest in that cell.
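To make the grid idea concrete, here is a minimal sketch of how labelled boxes might be turned into such a grid. The cell layout (a presence flag, the centre offset within the cell, width, height, then a one-hot class), the label format and the name `encode_grid` are all assumptions chosen for illustration, not necessarily the layout used in the video.

```python
from typing import List, Tuple

# Hypothetical label format: (cx, cy, w, h, class_id),
# with all coordinates normalised to the range [0, 1].
Box = Tuple[float, float, float, float, int]


def encode_grid(boxes: List[Box], grid_size: int,
                num_classes: int) -> List[List[List[float]]]:
    """Build a grid_size x grid_size grid where each cell holds
    [present, x_offset, y_offset, w, h, one-hot class...]."""
    cell_len = 5 + num_classes
    grid = [[[0.0] * cell_len for _ in range(grid_size)]
            for _ in range(grid_size)]
    for cx, cy, w, h, class_id in boxes:
        # Find the cell containing the box centre (clamped so 1.0 stays in range).
        col = min(int(cx * grid_size), grid_size - 1)
        row = min(int(cy * grid_size), grid_size - 1)
        cell = [0.0] * cell_len
        cell[0] = 1.0                      # an object is present in this cell
        cell[1] = cx * grid_size - col     # centre offset within the cell, x
        cell[2] = cy * grid_size - row     # centre offset within the cell, y
        cell[3] = w                        # size relative to the whole image
        cell[4] = h
        cell[5 + class_id] = 1.0           # object type as a one-hot vector
        # Simplification: a later box in the same cell overwrites an earlier one.
        grid[row][col] = cell
    return grid


# One object of class 1, centred in the image, on a 4 x 4 grid with 3 classes.
grid = encode_grid([(0.5, 0.5, 0.2, 0.4, 1)], grid_size=4, num_classes=3)
```

With this layout, one object per cell is the limit, which is one reason the grid resolution has to be chosen with the expected object density in mind.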
As with all applications of neural networks, we need a huge amount of training data, but with enough sample images we see how to train a network to learn the mapping between pixel intensities and this low-resolution grid.
Once trained, these networks are capable of real-time object detection and tracking. In this video we look at tools that help us with the labelling process.
We look at the types of data structures that help us record the position, type and size of objects of interest. And we look at several examples of this approach to identify and track objects in real time on very modest hardware.
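At detection time the process runs in reverse: the network outputs a grid, and each confident cell is read back as an object. The sketch below assumes a hypothetical cell layout of [score, x_offset, y_offset, w, h, class scores...] and a hypothetical helper name `decode_grid`; a real detector would typically add further steps, such as merging overlapping detections, that are omitted here.

```python
from typing import List, Tuple

# Hypothetical detection format: (cx, cy, w, h, class_id, score),
# with coordinates normalised to the range [0, 1].
Detection = Tuple[float, float, float, float, int, float]


def decode_grid(grid: List[List[List[float]]],
                threshold: float = 0.5) -> List[Detection]:
    """Read every cell whose score reaches the threshold back into a box."""
    grid_size = len(grid)
    detections: List[Detection] = []
    for row, cells in enumerate(grid):
        for col, cell in enumerate(cells):
            score = cell[0]
            if score < threshold:
                continue  # no confident object in this cell
            # Convert the in-cell offset back to image coordinates.
            cx = (col + cell[1]) / grid_size
            cy = (row + cell[2]) / grid_size
            # The predicted type is the class with the highest score.
            class_scores = cell[5:]
            class_id = class_scores.index(max(class_scores))
            detections.append((cx, cy, cell[3], cell[4], class_id, score))
    return detections


# A 2 x 2 grid with two classes and one confident cell at row 1, column 0.
empty = [0.0] * 7
predicted = [[empty, empty],
             [[0.9, 0.5, 0.5, 0.3, 0.2, 0.1, 0.8], empty]]
objects = decode_grid(predicted)
```

Because decoding is a single pass over a small grid, its cost is negligible next to running the network itself.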