In today’s world, computers are our new eyes. We rely on these machines to react to what appears in a given space. From animals and people to inanimate objects, image analytics can now detect not only simple items but also variations of the same object, which brings it a small step closer to human vision.
In this blog, I will take you through the terminology, methodology, and applications of image recognition in business. We will also gain some valuable insight from Marcin Skoczylas, PhD, regarding the state of the technology. Be sure to see his articles as well!
I can’t in good conscience let you walk away into the world without teaching you the differences between Image Detection and Image Recognition.
That’s why I’d like to start with some definitions, to get that ‘image of the ball’ rolling.
Image Detection is the first step of the process. It only establishes the existence and position of predefined objects in images.
- Example 1: Face detection tells us whether faces are visible in an image.
- Example 2: Inspection inside factories, such as checking that the electronic traces on motherboards are uncut.
In practice, Image Detection is enough for many such operations, according to Marcin Skoczylas.
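As a rough illustration of detection, the sketch below only reports whether something is present in a tiny grayscale image and where it sits. The brightness threshold, the function name `detect_bright_object`, and the sample grid are assumptions made for this example; a production system would use a trained detector instead.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]  # grayscale pixels, 0 (dark) to 255 (bright)


def detect_bright_object(
    image: Image, threshold: int = 200
) -> Optional[Tuple[int, int, int, int]]:
    """Return the bounding box (top, left, bottom, right) covering every
    pixel at or above `threshold`, or None when no such pixel exists."""
    hits = [
        (row, col)
        for row, line in enumerate(image)
        for col, value in enumerate(line)
        if value >= threshold
    ]
    if not hits:
        return None  # nothing detected
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return min(rows), min(cols), max(rows), max(cols)


# Hypothetical 4x4 image with one bright 2x2 patch in the middle.
sample = [
    [0, 0, 0, 0],
    [0, 250, 240, 0],
    [0, 230, 255, 0],
    [0, 0, 0, 0],
]
print(detect_bright_object(sample))
```

Note that the result says nothing about what the object is. It only answers "is something there, and where?", which is the point of detection.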
Image Recognition is the second step. It recognises objects in images: each object is identified and assigned to a predefined class.
- Example: Facial recognition tells us whose face it is, by matching it to a name or ID.
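To make the contrast with detection concrete, here is a minimal sketch of the matching step: a face that has already been detected is reduced to a list of numbers and compared against known references. The feature vectors, the IDs, and the `max_distance` cut-off are all hypothetical values chosen for illustration; real systems derive such vectors from a trained model.

```python
import math
from typing import Dict, List, Optional

# Hypothetical reference vectors, one per known identity.
KNOWN_FACES: Dict[str, List[float]] = {
    "ID-001": [0.9, 0.1, 0.3],
    "ID-002": [0.2, 0.8, 0.5],
}


def recognise(
    features: List[float],
    known: Dict[str, List[float]] = KNOWN_FACES,
    max_distance: float = 0.5,
) -> Optional[str]:
    """Return the ID whose reference vector is closest to `features`,
    or None when even the closest one is further than `max_distance`."""
    best_id, best_dist = None, float("inf")
    for identity, reference in known.items():
        dist = math.dist(features, reference)  # Euclidean distance
        if dist < best_dist:
            best_id, best_dist = identity, dist
    return best_id if best_dist <= max_distance else None


print(recognise([0.85, 0.15, 0.3]))  # close to the first reference
print(recognise([5.0, 5.0, 5.0]))    # far from every reference
```

The cut-off matters: without it, every face would be assigned to somebody, even a stranger.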
How is it possible? Well, with algorithms, naturally. Machine learning plays a starring role as well. In all, image recognition is a label for the process of training computers to ‘see like humans’ and process images.
The following is an overview of the method variations, listed in ascending order of sophistication.
Simple Objects Detection
“That cat is back again”. This variant recognises an image based on general characteristics.
This variant analyses 2-dimensional imagery based on simple parameters. It operates, for example, in product label and packaging matching, and works effectively with small databases.
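One classic way to match a known label against a 2-dimensional image is template matching: slide the reference over the picture and keep the position where the pixels differ least. The sketch below assumes this technique purely for illustration (the article does not name a specific algorithm), and the tiny grids are made-up data.

```python
from typing import List, Tuple

Grid = List[List[int]]


def match_template(image: Grid, template: Grid) -> Tuple[int, int, int]:
    """Slide `template` over `image` and return (row, col, score) for the
    position with the lowest sum of absolute pixel differences."""
    t_rows, t_cols = len(template), len(template[0])
    best = None
    for top in range(len(image) - t_rows + 1):
        for left in range(len(image[0]) - t_cols + 1):
            score = sum(
                abs(image[top + r][left + c] - template[r][c])
                for r in range(t_rows)
                for c in range(t_cols)
            )
            if best is None or score < best[2]:
                best = (top, left, score)
    return best


# Hypothetical 'shelf photo' containing the label pattern at row 1, column 2.
shelf = [
    [0, 0, 0, 0],
    [0, 0, 9, 1],
    [0, 0, 1, 9],
    [0, 0, 0, 0],
]
label = [
    [9, 1],
    [1, 9],
]
print(match_template(shelf, label))
```

A score of zero means a pixel-perfect match. This kind of comparison is cheap, which is why it suits small databases, but it breaks down quickly when lighting, scale, or rotation change.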
Image Recognition using Machine Learning
This refers to the use of machine learning techniques such as Texture Classification, Convolutional Neural Networks (CNN), and Fast Region-based Convolutional Neural Networks (Fast R-CNN).
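The building block of a CNN is a small grid of weights (a kernel) that slides across the image and responds to a local pattern. The sketch below shows that single operation in plain Python. The kernel here is hand-picked to react to a dark-to-bright vertical edge, whereas a real CNN learns its kernel values during training; the image is made-up data.

```python
from typing import List

Grid = List[List[int]]


def convolve2d(image: Grid, kernel: Grid) -> Grid:
    """Slide `kernel` over `image` (no padding, stride 1) and return the
    weighted sum at each position, as a convolutional layer does."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    output = []
    for top in range(len(image) - k_rows + 1):
        row = []
        for left in range(len(image[0]) - k_cols + 1):
            row.append(
                sum(
                    image[top + r][left + c] * kernel[r][c]
                    for r in range(k_rows)
                    for c in range(k_cols)
                )
            )
        output.append(row)
    return output


def relu(feature_map: Grid) -> Grid:
    """Keep positive responses and set negative ones to zero."""
    return [[max(0, value) for value in row] for row in feature_map]


# Hypothetical image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-picked kernel that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]
print(relu(convolve2d(image, kernel)))
```

The output is largest exactly where the edge sits and zero elsewhere. A full network stacks many such kernels in layers and learns their values from labelled images.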
Advanced Image Recognition
This step is a larger-scale, more intricate version of the previous option. Here, we apply Deep Learning, which requires an extensive database with millions of objects. Google Images is a fitting example.
Most operations, even those with thousands of images, would not meet the requirements for this approach. These deep learning techniques operate as Machine Learning on a broader scale, for example with CNNs.
Current State of the Technology
We see it all over the news: image recognition features in commercial and even military contexts.
Google Search, however, is a primary example of its civilian application. The search engine allows you to input images as a search term. Law enforcement in China applies the technology to detect perpetrators of minor infractions, such as jaywalking.
However, where are we in terms of the progress of Image Recognition?
A major study carried out by Perficient Digital between April and May 2019 analysed the technological capability of significant market players: Microsoft Azure, IBM Watson, and Amazon Web Services’ Rekognition.
Here, 2000 images collected between 30th November 2018 and 8th January 2019 were divided into four classifications.
The study measured how accurately each image was tagged against its allocated category. Here, 7% of the 2000 total images were labelled precisely. A human control group, on the other hand, tagged the photos with 87.7% accuracy.
In terms of confidence level, that is, how sure the program is that it has assigned an image correctly, Microsoft Azure scored 90.9%, with Google leading at 92.4%. Among the 3 test subjects, 55% of tags submitted cited a Confidence Level of 70% or higher. The humans, on the other hand, returned tags with total confidence of 90%.
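For readers wondering how a figure such as "share of tags at 70% confidence or higher" is calculated, the sketch below shows the arithmetic. The confidence values are invented for illustration and are not data from the Perficient Digital study.

```python
from typing import List


def share_at_or_above(confidences: List[float], threshold: float = 0.70) -> float:
    """Return the fraction of tags whose confidence meets the threshold."""
    if not confidences:
        return 0.0
    confident = sum(1 for value in confidences if value >= threshold)
    return confident / len(confidences)


# Made-up confidence scores for eight tags returned by an image API.
sample_confidences = [0.95, 0.72, 0.40, 0.88, 0.65, 0.70, 0.30, 0.91]
print(f"{share_at_or_above(sample_confidences):.1%}")
```

Keep in mind that confidence is the machine's own estimate. A tag can be confident and still wrong, which is why the study reports accuracy separately.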
According to Perficient Digital, precision is improving, as are the confidence levels of the machines. Compared to humans, though, the technology remains inferior.
Applications & Implications
Here’s a snapshot of where these technologies are applied now.
Machine Learning improves the effectiveness of access controls, particularly for consumer applications such as home security.
Retail assistance and automation are on the frontline of potential improvement to the industry. Points of sale, as well as loss prevention, are areas slated for advancement.
Ideally, such tech applies to Pathology (Cancer Detection) and MRIs (Lesion Detection). In fact, Google boasts an algorithm that can find signs of cancer on medical images. In the United Kingdom, however, these algorithms cannot be put into use, due to the legal requirement of manual inspection.
Because the technology works on Neural Networks, which are complex mathematical structures, it is not possible to prove their effectiveness with 100% certainty, according to Marcin Skoczylas. Therefore, at this point, even our most advanced technology cannot fulfil NHS requirements.
Image recognition assists throughout manufacturing and assembly, as well as in quality control and assurance, where it detects faulty products. The method uses deep learning to recognise defective products and to optimise processes, reducing waste by 20 per cent and increasing production by 50 per cent.
Image Analytics Examples
Facial recognition in particular is very complex and is progressing beyond the 2-dimensional scope of recognition. Using infrared cameras, devices can now recognise the shape of one’s face and distinguish a real face from a fake.
Cloud APIs now offer prospective developers a multitude of options for developing image recognition functions to aid in retail operations. According to Marcin Skoczylas, Google Vision API is powerful for rapid analysis of objects visible within images, and returns the results as simple-to-use JSON.
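To give a feel for how simple such JSON is to consume, the sketch below parses a hard-coded sample and keeps the labels above a chosen score. The payload is invented for illustration; its field names (`responses`, `labelAnnotations`, `description`, `score`) follow my understanding of the Vision API's label detection output and should be checked against the official reference. No network call is made, and `labels_above` is a hypothetical helper.

```python
import json
from typing import List

# Illustrative payload only; field names are assumed, values are made up.
SAMPLE_RESPONSE = """
{
  "responses": [
    {
      "labelAnnotations": [
        {"description": "Shoe", "score": 0.97},
        {"description": "Footwear", "score": 0.93},
        {"description": "Leather", "score": 0.61}
      ]
    }
  ]
}
"""


def labels_above(payload: str, min_score: float = 0.8) -> List[str]:
    """Return label descriptions whose score is at least `min_score`."""
    data = json.loads(payload)
    labels = []
    for response in data.get("responses", []):
        for annotation in response.get("labelAnnotations", []):
            if annotation.get("score", 0.0) >= min_score:
                labels.append(annotation["description"])
    return labels


print(labels_above(SAMPLE_RESPONSE))
```

A retailer could feed the surviving labels straight into a product search or a stock-keeping system, while discarding the low-score ones avoids acting on guesses.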
Smartphone Face Unlock
The smartphones now available on the market boast a plethora of technologies in image recognition. The tech was first used to suggest image categories in in-built photo album programmes before becoming an intuitive tool used for unlocking and navigating the device itself.
Facial Recognition in Retail POS
From December, the Wedome Bakery Chain in China piloted the ‘Smile to Pay’ function, enabling users at points-of-sale to identify themselves via facial recognition, and pay using an associated digital wallet, which is protected by SMS Authentication, and ‘risky environment’ detection.
Using a similar method to Google Image search, retailers now open the doors to ‘backwards’ searches of fashion items, which helps in retrieving fashion designs. This is particularly useful in cases of unpronounceable or linguistically challenging furniture and clothing items.
There are reasons for concern about a lack of progress in image recognition. Opinions among scientists, legislators, and the public are mixed, with some studies appearing to suggest room for improvement.
In Excavating AI: The Politics of Images in Machine Learning Training Sets, a study published in September 2019, Prof. Kate Crawford, an AI researcher, and Trevor Paglen, an artist and researcher, ran images through a model trained on ImageNet, a dataset of 14,000,000+ images.
The images processed were assigned tags that yielded peculiar, and sometimes startling assumptions and generalisations, as demonstrated in the cited descriptions.
As a case in point, I gave the program a try, using the following images:
In the case of the more peculiar tags, I have included the on-site provided definitions of said terms.
Microsoft Founder, Bill Gates (Front Profile)
Professional Tennis Player, Serena Williams
In conclusion, the results demonstrate an aptitude for recognising some essential characteristics of people. However, as this example shows, the program describes features based on its preconceived notion of human traits, and the last two photographs are given more verbose and potentially offensive tags.
It’s no secret that technology is moving at an exceptional pace. In many instances, it progresses at rates which people cannot understand or guarantee control over. As such tech routinely processes thousands, if not millions, of images bearing the likeness of real people, the concerns range from safeguarding data resources to the popularly held fear that the tools could be turned to authoritarian ends.
In the United States, for instance, the local San Francisco government has banned the use of facial recognition in law enforcement and transport, citing privacy concerns regarding the security of facial databases.
According to Marcin Skoczylas, there is a potential remedy to privacy concerns. In the case of trained models vs training databases, the former would consist of a programmed system that would be immune to attempts to reverse-read the database of images.
As facial recognition becomes commonplace, so do the opportunities to defraud such systems. So-called ‘Deepfake’ software uses AI to analyse a face and map it onto a video of another subject. As this tech develops beyond its current famous rendition, Zao, by Chinese company Momo, there are fears that ‘digital masking’ could outpace the ability of security systems relying on facial recognition to detect it.
Image recognition is reportedly the next phase in military drone capabilities. By applying image recognition, the equipment can identify and flag operational threats, and this could lead the way to independent judgements based on predefined combat circumstances.
As I have shown above, the implications of such tech are, simply put, substantial. As commerce and government attempt to grapple with the scale of this technology’s potential, ambitious concepts and programmes appear to be taking shape.
Businesses with the resources to pilot these schemes are fighting to become first-movers in this sensitive, data-driven industry. In this case, enterprises and investors alike should consider waiting for the outcome and impact of the work of legal and legislative bodies, as they define the rules of engagement and the necessary safeguards for image recognition.
The emergence of image recognition into the mainstream faces challenges due to privacy and security concerns, as well as public moods regarding the protection of civil liberties and the ‘right to be forgotten’. According to Marcin Skoczylas, the technology falters in performance because of the time-consuming nature of database gathering, training, and algorithm setting. This is the primary reason that it cannot yet be applied properly in embedded, real-time situations.
That said, there are several cases with merit, and image recognition could well be applied in pilot projects. As computer speeds improve, this technology will become more available for everyday use and applications.
The potential of these applications implies significant convenience and can help optimise business processes across the entire value chain. The future is digital and has its eyes wide open.