Neural networks taught to recognize similar objects in videos without accuracy degradation

Andrey Savchenko, Professor at the Higher School of Economics (HSE University), has developed a method that can improve image identification in videos. In his project, a neural network trained with a new algorithm can now make image recognition and classification decisions up to 10 times faster than before. The research was presented in the paper 'Sequential three-way decisions in multi-category image recognition with deep features based on distance factor', published in Information Sciences.

Neural networks learned to identify humans and animals in videos a long time ago. Artificial neurons can learn by remembering what a certain object looks like in an image. Usually, researchers take an open database of photos (e.g., ImageNet or Places) and use it to train a neural network. To speed up decision-making, the algorithm is set to pick only some of the sample images, or to focus on a limited number of features. Complications may arise when objects of different classes appear in the same photo and there is only a small number of training examples for each category.

The new algorithm can now recognize images without significant accuracy degradation by applying a sequential three-way decision-making method. With this approach, a neural network can settle clearly recognizable objects with a quick, simple analysis, while objects that are difficult to identify are given a more detailed examination.

'Each photo can be described by literally thousands of features. So, it wouldn't make much sense to compare all of the features of a given input image with those of a basic training example, since most samples would not be similar to the analyzed image. So, we initially only compared just a few of the important features, and put aside the training instances, which obviously cannot be treated as final solutions. As a result, the training sample becomes smaller and only a few examples are left. At the next stage, we would increase the number of features for the remaining images, and then repeat this process until only one class is left,' Prof. Savchenko noted.
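The coarse-to-fine procedure described in the quote can be illustrated with a minimal Python sketch. This is not the code from the paper: the function name `sequential_classify`, the parameters `start_dims`, `growth` and `keep_ratio`, and the rule "keep every training example whose distance is within `keep_ratio` times the best distance" are all assumptions made for illustration. The sketch also assumes the feature columns are already ordered from most to least important (for example, as principal components).

```python
import numpy as np


def sequential_classify(query, train_feats, train_labels,
                        start_dims=8, growth=2, keep_ratio=2.0):
    """Coarse-to-fine nearest-neighbour search (illustrative sketch only).

    Assumes feature columns are ordered from most to least important.
    keep_ratio is a hypothetical threshold, not a value from the paper.
    """
    train_feats = np.asarray(train_feats, dtype=float)
    train_labels = np.asarray(train_labels)
    query = np.asarray(query, dtype=float)

    n_features = train_feats.shape[1]
    candidates = np.arange(len(train_feats))  # indices still in play
    dims = min(start_dims, n_features)

    while True:
        # Compare only the first `dims` features of the remaining candidates.
        diffs = train_feats[candidates, :dims] - query[:dims]
        dists = np.sqrt((diffs ** 2).sum(axis=1))

        # All features used: nothing left to refine, take the nearest one.
        if dims == n_features:
            return train_labels[candidates[dists.argmin()]]

        # Set aside examples that are clearly too far from the query.
        keep = dists <= keep_ratio * dists.min() + 1e-12
        candidates = candidates[keep]

        # Only one class left among the survivors: accept it.
        if np.unique(train_labels[candidates]).size == 1:
            return train_labels[candidates[0]]

        # Otherwise defer the decision and look at more features.
        dims = min(dims * growth, n_features)
```

Easy inputs stop after the first, cheapest comparison, while ambiguous ones go through further rounds with more features, which is where the time saving in such a scheme would come from.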

This approach reduced recognition time by a factor of 1.5 to 10 compared to regular classifiers and known multi-category sequential three-way decision methods. As a result, the technology could be used in the future on mobile devices and other basic gadgets.

Credit: National Research University Higher School of Economics