Machine vision in everyday life
After a long time in which I have had to dedicate myself to developments and projects with different clients, I return to the blog with a post that compiles a series of articles originally published on LinkedIn in Spanish, focused on how machine vision can be found in many aspects of daily life. The topics covered are: machine vision in everyday activities, machine vision in social networks, and machine vision in sports and sports broadcasting.
Machine vision in day-to-day activities
All the developments that we have come across over the years affect our daily lives to a greater or lesser extent, even if our work has nothing to do with industry. For example, the Edge Computing-machine vision tandem is one that we have been using for more than a decade and that we usually carry in our pockets. In fact, one definition of Edge Computing is bringing computational power and analytics closer to where the data is generated, solving latency or network-accessibility problems. So although many services continue to rely on the cloud, the truth is that our phones pack great processing power and all the elements that make up a machine vision system.
As an engineer and developer, I usually work with cameras, lighting and algorithms, but I have realized that machine vision is no longer confined to the industrial sector, where it has been applied to automated processes for some time. In the last decade, thanks to cheaper hardware, the exponential increase in computing power and the (not so recent) emergence of artificial intelligence, this technology has become part of our daily lives and accompanies us almost without us realizing it. Moreover, greater ease of development has extended its application to new sectors.
Of course, we are not only talking about the software that modifies the images, but also about the hardware: we must not forget that machine vision does not only include algorithms or code; elements such as the sensor or the optics are indispensable for it to exist. Let's mention a few examples that we can come across on any given day:
- The ever-increasing quality of mobile phone cameras. Without their current capabilities it would not be possible to obtain such good results when applying the different algorithms. And it is not only resolution: integrated optics and greater sensor sensitivity have given phones a great ability to capture high-quality images.
- Augmented reality. Although most of the examples I come across focus on games, this technology is essentially the application of many existing machine vision concepts, and it is one of the most promising technologies with the greatest potential that we can find today.
- Facial recognition. Usually powered by AI, this technology is used during image acquisition in many cameras and also for instantly unlocking phones or computers.
- Photo enhancement on social networks. This is something that millions of people do every day: they take a photo and apply a filter to it before sharing it on social networks in different styles. What is this filter if not an algorithm that modifies the pixels of our image to produce another image with a series of changes? (A minimal pixel-level sketch follows this list.)
- Image translation applications. Google Translate may be the most famous, but there are several similar applications. Simplifying a little, what we find here is a souped-up version of OCR (Optical Character Recognition).
- Sports broadcasts. When we watch a football match, we see graphics superimposed on the image to show information about the match (players' movements, strategies…) or advertising elements perfectly integrated with the environment. We may not realize it, but the software of a machine vision system is at work.
- Autonomous vehicles. Although not yet very common on our streets, they are easy to recognize and integrate an enormous amount of machine vision, both in the number of cameras installed and in the algorithms responsible for recognizing the different elements of the road.
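To make the point about filters concrete: under the hood, a style filter is just a transformation applied to every pixel. Here is a minimal sketch of a sepia-style filter using OpenCV and NumPy; the file name photo.jpg is only a placeholder, and real social-network filters are of course far more elaborate.

```python
import cv2
import numpy as np

# "photo.jpg" is a placeholder path for any image on disk.
img = cv2.imread("photo.jpg")  # OpenCV loads channels in BGR order

# Classic sepia mix, written for BGR: each output channel is a weighted
# combination of the input channels.
sepia_kernel = np.array([
    [0.131, 0.534, 0.272],   # new blue  = 0.131*B + 0.534*G + 0.272*R
    [0.168, 0.686, 0.349],   # new green = 0.168*B + 0.686*G + 0.349*R
    [0.189, 0.769, 0.393],   # new red   = 0.189*B + 0.769*G + 0.393*R
])

# cv2.transform applies the 3x3 matrix to every pixel and saturates to 0-255.
sepia = cv2.transform(img, sepia_kernel)
cv2.imwrite("photo_sepia.jpg", sepia)
```

Swapping the matrix (or adding per-pixel adjustments such as contrast curves or vignettes) gives a different "filter"; the principle stays the same.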
Machine vision in social networks
Following the script that brings us closer to the use of machine vision in more situations than we might have thought, in this section we are going to talk a little about how image analysis is used in social networks, although in this case we will see that most of the effects or modifications rely on artificial intelligence to obtain the desired result.
Here we can find several elements. For example, the phone is able to detect the characteristics of our face, locating the eyes, the nose and the outline of the face, and it can do so in real time. In this way, it is able to modify the photo by overlaying different images or by automatically enlarging or shrinking our facial features.
Who hasn’t taken a selfie using one of the filters available in the different apps? There are countless examples, such as the so-called bokeh effect, the generation of blurred photographic backgrounds. In a traditional camera the effect is optical, produced by the lens and aperture; in a smartphone it is generated by an algorithm. It is one example of how effects that used to be achieved by the camera itself are now achieved simply by algorithms that modify the image.
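Real portrait modes estimate depth or segment the subject with a neural network, but the basic idea can be illustrated with a much cruder sketch: detect a face with one of OpenCV's bundled Haar cascades and blur everything outside an enlarged box around it. The file name selfie.jpg and the padding factor are assumptions of the example.

```python
import cv2

# "selfie.jpg" is a placeholder; any portrait-style photo will do.
img = cv2.imread("selfie.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Frontal-face Haar cascade shipped with the opencv-python package.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

# Blur the whole frame, then paste the sharp region around the first face back in.
blurred = cv2.GaussianBlur(img, (51, 51), 0)
if len(faces) > 0:
    x, y, w, h = faces[0]
    # Grow the box a bit so the "subject" includes hair and shoulders.
    pad = int(0.6 * max(w, h))
    x0, y0 = max(0, x - pad), max(0, y - pad)
    x1, y1 = min(img.shape[1], x + w + pad), min(img.shape[0], y + h + pad)
    blurred[y0:y1, x0:x1] = img[y0:y1, x0:x1]

cv2.imwrite("selfie_bokeh.jpg", blurred)
```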
Another case is the avatar effect: the generation of animated avatars from images of real people, which is currently very popular on social networks.
Artificial Intelligence is also applied, in this case Deep Learning, to virtually “age” a person, showing how they might look in the future. In fact, it was an example that went viral a few years ago.
Technology companies themselves offer tools that facilitate this analysis. One example is Google's Cloud Vision, an API (Application Programming Interface) that allows hundreds of images to be analyzed and helps with extracting information or detecting objects and other characteristics (a hedged usage sketch follows the list). Among its possibilities we have:
- Face Detection can detect faces, reference points on them (eyes, mouth…) and even infer people’s emotions.
- Text Detection is used in many applications to obtain text from the image by using OCR (Optical Character Recognition).
- Safe Search Detection, an algorithm used to detect inappropriate content in images.
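As a hedged sketch of how these three features are typically called from Python with the google-cloud-vision client library (authentication via GOOGLE_APPLICATION_CREDENTIALS and the image path photo.jpg are assumptions of the example):

```python
from google.cloud import vision

# Assumes credentials are already configured; "photo.jpg" is a placeholder.
client = vision.ImageAnnotatorClient()
with open("photo.jpg", "rb") as f:
    image = vision.Image(content=f.read())

# Face detection: bounding boxes, landmarks and likelihoods of emotions.
faces = client.face_detection(image=image).face_annotations
for face in faces:
    print("joy likelihood:", face.joy_likelihood)

# Text detection (OCR): the first annotation holds the full detected text.
texts = client.text_detection(image=image).text_annotations
if texts:
    print("text found:", texts[0].description)

# Safe search: likelihood scores for inappropriate content categories.
safe = client.safe_search_detection(image=image).safe_search_annotation
print("adult likelihood:", safe.adult)
```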
There are also more interesting possibilities. For example, there are certain accessibility features that recognize what is in the images (automatically classifying faces, expressions, objects, company logos…) and bring it to the user's attention by voice, using object recognition with a trained model. This is very useful for the visually impaired.
Apart from the uses within the networks themselves, there is also a large amount of scientific research that draws on the images from social networks: research on obesity, analysis of images published during a crisis or catastrophe, or the ability to detect people's mood. And why is this? Because most of the images shared over the internet are embedded within some type of social network.
There are also tools that run on the phone itself, without using the cloud, such as TopShot, an algorithm that uses artificial intelligence to detect the best photo acquired within a burst, taking into account things like whether the image is blurred or whether the people in it are smiling or have their eyes closed.
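Google has not published TopShot's exact heuristics, but one classic ingredient of any "best frame in a burst" selector is a sharpness score, and a common trick for that is the variance of the Laplacian: blurred frames have little high-frequency content, so their Laplacian variance is low. A minimal sketch, with placeholder file names standing in for the burst:

```python
import cv2

def sharpness(path: str) -> float:
    """Return the variance of the Laplacian: low values suggest a blurry frame."""
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    return cv2.Laplacian(gray, cv2.CV_64F).var()

# Placeholder burst of frames; in practice these would come from the camera buffer.
burst = ["frame_01.jpg", "frame_02.jpg", "frame_03.jpg"]
best = max(burst, key=sharpness)
print("sharpest frame:", best)
```

A real selector would combine such a score with face attributes (eyes open, smiling) and exposure checks, as described above.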
Machine vision in sports and sports broadcasting
Building on the first topic, we have seen how machine vision software has entered our daily lives, and one of the most common cases is sports broadcasting, in which artificial intelligence, specifically deep learning, is normally applied. This is possible thanks above all to the huge amount of data accumulated in recent decades: thousands of hours of recorded matches across different sports that serve as datasets for artificial intelligence platforms.
The current need to display and consume data, this hunger for information and the demand for ever-faster display are a breeding ground for machine vision in this sector: tracking players or objects, displaying information (such as distances or speeds) and generating statistics throughout the match are just a few examples of the possibilities.
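As an illustration of where those distance and speed overlays come from once a player has been tracked: given the player's position per frame in pitch coordinates (which in a real system requires camera calibration or a homography from image to pitch, a step omitted here), the numbers reduce to simple geometry. All values below are invented for the example.

```python
import math

# Tracked positions of one player in pitch coordinates (metres), one per frame.
# These values are invented; a real pipeline would obtain them from a detector
# plus tracker, and a homography from image coordinates to the pitch plane.
positions = [(10.0, 30.0), (10.8, 30.5), (11.9, 31.4), (13.2, 32.0)]
fps = 25.0  # broadcast frame rate

# Sum the straight-line distance between consecutive positions.
distance = sum(math.dist(p, q) for p, q in zip(positions, positions[1:]))
elapsed = (len(positions) - 1) / fps
speed_ms = distance / elapsed
print(f"distance: {distance:.2f} m, avg speed: {speed_ms:.2f} m/s "
      f"({speed_ms * 3.6:.1f} km/h)")
```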
Another use within broadcasts is the insertion of advertising that blends in with the event in a non-intrusive way; in fact, the advertising shown can differ depending on the country in which the match is broadcast.
And it is not only applicable to the broadcasts themselves, but also to improving tactics and strategy, or improving technique, in both individual and team sports. Safety is also an objective of such developments, through monitoring spectators and detecting unusual behavior.
Furthermore, its use is not limited to showing analyses, movements or trajectories of players; it is also a key element in modern refereeing, in the case of football through VAR (Video Assistant Referee), which, although the final call still rests with humans, provides the information needed to check historically controversial and hard-to-judge rules such as offside.
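Behind the multi-camera limb tracking and the synchronization with the moment of the pass, the offside decision itself comes down to a geometric comparison. A deliberately simplified sketch, with invented coordinates measured in metres towards the defending team's goal line:

```python
def is_offside(attacker_x: float, defender_xs: list[float], ball_x: float) -> bool:
    """Very simplified offside check at the moment the ball is played.

    Coordinates are distances towards the defending team's goal line; larger
    means closer to that goal. Ignores the own-half exception, which body
    parts count, and other nuances a real system must handle.
    """
    # Second-to-last defender (the goalkeeper is usually the last one).
    second_last_defender = sorted(defender_xs)[-2]
    # Offside if the attacker is beyond both the ball and the second-last defender.
    return attacker_x > second_last_defender and attacker_x > ball_x

# Invented positions at the moment of the pass.
print(is_offside(attacker_x=78.3, defender_xs=[95.0, 77.9, 60.2], ball_x=55.0))  # True
```

A real semi-automated system also has to pick the exact frame at which the ball is played, which is part of what makes these calls so hard to judge by eye.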
I leave you a link to Open Access papers focused on sports. As a curiosity, there is even a paper focused on esports, specifically on the video game “League Of Legends”: Computer Vision in Sports – Open Access
Conclusion
Throughout the post I have mentioned a series of examples that have occurred to me for each case, but I am sure there are more that we can find. Can you think of more possibilities?
In addition, in the case of sporting activities, a series of questions arise for the future: Will the day come when a human referee is no longer necessary in decision-making? Or even in the training of an athlete? What do you think?